MCP Explained: The Access-Control Architecture Most Guides Miss
MCP (Model Context Protocol) explained: five primitives, STDIO vs Streamable HTTP, OAuth 2.1, and the access-control design most AI guides miss.

MCP (Model Context Protocol) explained: five primitives, STDIO vs Streamable HTTP, OAuth 2.1, and the access-control design most AI guides miss.

MCP (Model Context Protocol) is an open standard that defines how AI applications connect to external tools, data sources, and services through a standardized interface. Released by Anthropic on November 25, 2024 and donated to the Linux Foundation in December 2025, it solves the N×M integration problem: before MCP, connecting five AI hosts to fifty tools required 250 bespoke integrations. MCP collapses that to N+M.
This guide covers the full protocol: its three-role architecture, five primitives, transport choices, OAuth 2.1 authorization, ecosystem, and the framing mistake that leads most implementations astray.
MCP (Model Context Protocol) is a JSON-RPC 2.0-based protocol that standardizes communication between AI applications and the external systems they access. It defines what the AI is allowed to do and how to do it: the security interface, not just the data pipe.
Two Anthropic engineers, David Soria Parra and Justin Spahr-Summers, created the protocol as a side project in mid-2024. Anthropic open-sourced MCP in November 2024 with Claude Desktop as the first client and pre-built servers for GitHub, Slack, Google Drive, and Postgres.
The N×M problem is the reason. Before MCP, every combination of AI host and external tool required its own bespoke integration code. Five hosts, fifty tools: 250 separate integrations, each brittle and non-transferable between systems.
MCP collapses that to N+M. Each AI host writes one MCP client. Each service ships one MCP server.
Any client can talk to any server.
Alex Albert of Anthropic called 2025 "the year of MCP and tool use," with 2026 moving toward computer environments and filesystems. The MCP specification is now governed by the Agentic AI Foundation (AAIF) under the Linux Foundation, with eight founding Platinum members: Anthropic, OpenAI, Google, Microsoft, AWS, Block, Bloomberg, and Cloudflare.
Google Trends for "model context protocol" peaked at 100/100 during the week of March 1-7, 2026. By May 2026, the index settled at 11-20, reflecting stable adoption rather than a hype spike. The primary keyword "mcp" draws 90,500 searches per month in the US alone, with low competition.
The single most common error in MCP coverage is conflating the Host and the Client. They're different entities with distinct responsibilities.
Role | What it is | Real examples |
|---|---|---|
Host | The AI application the user interacts with. Manages orchestration, consent UI, and model access. One process. | Claude Desktop, Cursor, VS Code, ChatGPT |
Client | A protocol-level component inside the Host. Maintains a strict 1-to-1 connection with exactly one server. Three servers = three Client objects. | VS Code's MCP client for the GitHub server |
Server | A lightweight program that exposes tools, resources, and prompts. Has no knowledge of which LLM is on the other end. Can be ~40 lines of Python. | GitHub MCP server, Stripe MCP, Supabase |
When a user opens a session, the Host initiates connections via Client objects. Each Client sends tools/list, resources/list, and prompts/list to discover what the server offers. The model query gets enriched with resource context.
When the model decides to use a tool, it sends tools/call; the Client executes via the server and incorporates the response.
The wire protocol is JSON-RPC 2.0. Every message is a Request (has an ID, expects a response), Response (references the ID, has a result or error), or Notification (no ID, fire-and-forget).
MCP is bidirectional: the Client sends requests to the Server, and the Server sends requests back to the Client. The reversal surprises developers accustomed to standard HTTP APIs. STDIO works as a long-lived duplex stream rather than discrete request-response cycles.
Every MCP server exposes up to five building blocks. Most production servers implement two or three. Understanding which primitive does what determines whether your implementation is actually secure.
Tools are what the LLM actively invokes. They can write to databases, call external APIs, modify files. Each tool carries a name, a description (the only information the LLM sees when deciding whether to call it), an inputSchema in JSON Schema, and optional annotations.
The annotations are the security levers: readOnlyHint, destructiveHint, and idempotentHint tell the Host what consent UI to display before execution. The 2025 spec added a structuredContent field on responses for agent-to-agent payloads.
User consent is required before any tool with side effects executes. This is not optional in compliant implementations.
Resources are passive data the application (not the model) surfaces as context. Each resource has a uri (e.g., file:///etc/hosts, postgres://db/orders/1274), a name, a description, and a mimeType.
Resources are returned by resources/read as text or base64 blobs. Think of them as GET endpoints in a REST API: the server controls what the model sees, but the model cannot mutate anything through the resource interface. Clients can subscribe for real-time change notifications.
This is the distinction most guides skip. Tools are writable; Resources are not. Conflating the two breaks the security model.
Prompts are pre-built instruction templates the user (not the model) chooses to invoke. They guide the model through repeatable workflows. A "Plan a vacation" prompt might load travel tools plus calendar resources in a structured interaction pattern.
Prompts are the least-implemented primitive in production servers. They're the right interface for building reliable, repeatable multi-step workflows without hard-coding instructions into the Host.
New in the November 2025 spec. Servers can request specific information from users during an ongoing interaction. The Host shows the request; the user responds; the server continues.
URL Mode Elicitation handles OAuth and credential acquisition: the server sends users to a browser, receives tokens directly, and the MCP client never sees the credentials. This is the correct pattern for PCI-compliant payment flows.
The primitive that makes developers re-read the spec. The Server sends sampling/createMessage back to the Client, which runs a completion on the connected LLM (with user consent) and returns the result.
Sampling lets agentic servers reason without bundling their own LLM API key. The Client must show users exactly what's being requested and allow them to edit or reject. No silent completions.
The dominant public narrative frames MCP as "giving AI more power." That framing is backwards.
u/chrisza4 in r/programming (Jan 2026) put the practitioner consensus directly:
"It is much better to call psql via MCP servers instead of giving LLM direct access to DB. So, we can control what LLM can do or cannot. I don't want my LLM to fix the bug by dropping the whole table, even in my dev environment. Unlike many hypes, the point of MCP is not about making AI more powerful but it is to actually make them as powerful as we want, and not more than that."
You don't give an LLM raw database access. You build an MCP server that exposes exactly the operations you intend it to perform. The server is the access-control boundary.
Vercel's internal agent work (May 2026) reinforces this from production data: they removed 80% of their agent's tools to restore accuracy. Agentic requests now carry 59% of all token volume, up from 32% six months prior, with teams averaging 35 models at scale. Fewer tools, correctly scoped, outperform a broad tool surface.
MCP is a communication protocol. Security depends entirely on how hosts and servers implement it.
Five rules compliant Hosts must enforce:
Supply chain risk is real. Malicious MCP servers can silently exfiltrate data or execute destructive actions. The MCP Registry uses namespace authentication, but depends on upstream package registries (npm, PyPI, Docker Hub) for vulnerability scanning.
Prompt injection via instructions embedded in resources is documented. Container isolation is recommended for production MCP infrastructure.
Community criticism persists on auth architecture. u/apnorton in r/programming (Jun 2025):
"The core issue with MCPs is, essentially, that there's no sufficient trust boundary anywhere. It's like the people who designed it threw out the past 40 years of software engineering practice and decided to yolo their communication design."
That criticism was written before the November 2025 spec formalized OAuth 2.1. It still applies to older servers that haven't upgraded. If you're maintaining a server built against the original spec, check the current specification for the auth changes.
The decision rule is a single question: is this server local or remote?
Transport | Use case | Status |
|---|---|---|
STDIO | Local desktop tools, CLI agents, Claude Code plugins | Stable |
HTTP + SSE | Older remote servers | Deprecated (Nov 2025) |
Streamable HTTP | Remote, multi-tenant, serverless | Current standard |
STDIO spawns the server as a subprocess and communicates over stdin/stdout with newline-delimited JSON-RPC. It's the right choice when you control both ends and security is established by OS identity.
Streamable HTTP uses a single endpoint for all communication. The Client POSTs JSON-RPC; the server responds with JSON for short operations, or upgrades to SSE stream for long tool calls. Streamable HTTP supports resumption via Last-Event-ID and survives serverless deployments where the older HTTP+SSE pattern failed.
Local models face a practical constraint: each MCP server's tool definitions consume 600-800 tokens in the system prompt.
u/claythearc in r/LocalLLaMA (Jul 2025) documented the impact: models with effective context under 5K tokens can exhaust their working memory before any user message, just from registering a few servers. Keep tool registrations lean on constrained local deployments.
For STDIO, no separate auth is needed: OS identity handles the user. For remote transports, the November 2025 spec formalizes OAuth 2.1 with PKCE as the standard. WorkOS's MCP auth guide covers the full implementation.
The authorization component stack:
read:report, tools:generate_summaryHard rule: do not use static API keys for public MCP servers. Static API keys leak, can't be revoked per user, and tie server access to the app rather than the human.
For any public or multi-user remote server, implement OAuth 2.1 with PKCE. The upfront cost is a few hours; the exposure from a leaked static key is unbounded.
MCP governance moved to the AAIF in December 2025. On r/ClaudeAI, a significant minority interpreted the open-governance donation as Anthropic deprioritizing the protocol. That reading is wrong: open governance eliminates single-vendor spec risk and lock-in concerns.
Date | Milestone |
|---|---|
Nov 2024 | Anthropic open-sources MCP; Claude Desktop; GitHub, Slack, Google Drive, Postgres servers |
Mar 2025 | Microsoft Copilot Studio MCP announced; OpenAI adds MCP to Agents SDK |
May 2025 | VS Code MCP support generally available |
Jun 2025 | AWS MCP on Amazon Bedrock |
Sep 2025 | MCP Registry launches (~400 servers) |
Nov 2025 | First anniversary spec release (Streamable HTTP, OAuth 2.1, Elicitation, Task lifecycle); Registry surpasses 2,000 servers (407% growth) |
Dec 2025 | Donated to Linux Foundation; AAIF formed; Google MCP servers announced for Google services |
Jan 2026 | Anthropic embeds Slack, Figma, Asana, Box, Canva, Clay, Hex, Monday.com, Amplitude in Claude via MCP Apps |
Apr 2026 | Google-managed MCP servers available to all developers |
May 2026 | Atlassian ships Cursor-in-Jira; Stripe activates machine payments via agent prompt |
Server | Provider | Key capabilities |
|---|---|---|
GitHub (official) | Repos, PRs, issues, code search | |
Stripe (official) | Payments, invoices, customers, subscriptions, refunds, and more | |
Notion | Notion (official) | Pages, databases, search |
Linear | Linear (official) | Issues, projects, teams |
Figma | Figma (official) | Design files, components, comments |
Slack | Anthropic pre-built | Messages, channels, workspace data |
Google Drive / Gmail / Calendar | File access, email, events | |
Atlassian (Jira + Confluence) | Atlassian | Project and knowledge management |
Cloudflare | 2,500+ API endpoints compressed to ~1K tokens | |
Supabase | Supabase (official) | Database, auth, storage |
Stripe's MCP server covers the widest production surface among official providers: payments, invoices, customers, subscriptions, and machine payments live via a single agent prompt as of May 2026. Cloudflare compresses 2,500+ API endpoints into roughly 1K tokens using Code Mode, giving it the broadest coverage among provider servers.
The minimum viable server is around 40 lines of Python. Here is a weather server using the FastMCP pattern from the official Python SDK:
# Install uv and create project
curl -LsSf https://astral.sh/uv/install.sh | sh
uv init weather && cd weather
uv venv && source .venv/bin/activate
uv add "mcp[cli]" httpxfrom mcp.server.fastmcp import FastMCP
mcp = FastMCP("weather")
@mcp.tool()
async def get_forecast(latitude: float, longitude: float) -> str:
"""Get weather forecast for a location.
Use when user asks about weather at specific coordinates.
Do NOT use for city names; use get_forecast_by_city instead."""
pass # your implementation here
if __name__ == "__main__":
mcp.run(transport='stdio')Add this block to claude_desktop_config.json:
{
"mcpServers": {
"weather": {
"command": "uv",
"args": ["--directory", "/path/to/weather", "run", "weather.py"]
}
}
}For remote deployment, the TypeScript SDK on Cloudflare Workers gives you 100K requests per day on the free plan. Run wrangler deploy after installing @modelcontextprotocol/sdk.
Simon Willison (@simonw) added Playwright browser automation to Claude Code with a single command (claude mcp add playwright npx '@playwright/mcp@latest'). For well-packaged servers, onboarding is that fast.
For a step-by-step walkthrough of a production-ready server, see our MCP server guide. If you're new to the agent architecture that makes MCP useful, our explainer on AI agents covers the foundational concepts.
Tools write; Resources read. Implementing a database query as a Tool instead of a Resource gives the LLM a write handle when you only intended a read handle. Start every primitive decision with the question: should the model be able to change anything through this interface?
Static API keys can't be revoked per user and tie access to the app identity rather than the human. For any public or multi-user remote server, OAuth 2.1 with PKCE is the correct auth pattern; the exposure from a leaked static key has no ceiling.
Each server's tool definitions consume 600-800 tokens in the system prompt before any conversation begins. A local model with a 4K effective context window can exhaust its working memory before the user types anything. Start with one or two servers and add more only when the model handles the tool surface without degrading.
HTTP+SSE was deprecated in the November 2025 spec. Any new remote server should use Streamable HTTP. Existing servers built on HTTP+SSE should migrate: the pattern fails on serverless deployments and lacks the resumption support that Last-Event-ID provides in Streamable HTTP.
The description field is the only information the LLM uses when deciding whether to call a tool. Write it like briefing a capable intern who cannot read your code: what the tool does, when to use it, and when not to. Vague descriptions produce wrong calls; precise descriptions with explicit negative examples produce accurate ones.
MCP won the standards race without a real contest. No alternative reached comparable ecosystem traction before the AAIF formation in December 2025.
Dimension | MCP | REST API | Function Calling (OpenAI) |
|---|---|---|---|
Tool discovery | Automatic ( | Manual documentation | Custom schemas per model |
Multi-server support | Native | Manual routing | Not inherent |
Portability | Any MCP-compatible host | N/A | Tied to one provider |
Streaming | Native (Streamable HTTP) | Varies | Varies |
Auth | OAuth 2.1 / OS identity | Varies | API key / OAuth |
Governance | Linux Foundation / AAIF | N/A | Vendor (OpenAI) |
Use MCP for AI-first workflows where multiple tools and hosts need to interoperate. Use REST APIs for traditional app-to-app integrations and webhooks where AI is not in the loop. Use function calling when you're locked to one model provider and portability is not a requirement.
UTCP (Universal Tool Calling Protocol) surfaced in r/LocalLLaMA as a stateless, HTTP-native alternative requiring no persistent server process. Practitioners found it more practical for simpler, single-tool use cases. Whether it gains broader ecosystem support by late 2026 is an open question.
Harrison Chase (@hwchase17) (Mar 2025) raised the right structural question early: "If MCP is Zapier, won't the value accrue to the client (not the integrations)?" Anyone building an MCP-dependent business should answer this before committing. The protocol commoditizes server integration; the host and the application layer capture the durable value.

Andrej Karpathy co-founded OpenAI, led Tesla's Autopilot team, coined vibe coding, and founded Eureka Labs. A profile of the AI researcher who made deep learning accessible to millions.

The most current artificial intelligence statistics for 2026, covering market size, investment, adoption, workforce impact, productivity, healthcare, and public trust. 65 sourced data points from McKinsey, OECD, Goldman Sachs, Pew Research, and more.