MCP Explained: The Access-Control Architecture Most Guides Miss

MCP (Model Context Protocol) explained: five primitives, STDIO vs Streamable HTTP, OAuth 2.1, and the access-control design most AI guides miss.

Updated 15 min read
MCP Guide

MCP (Model Context Protocol) is an open standard that defines how AI applications connect to external tools, data sources, and services through a standardized interface. Released by Anthropic on November 25, 2024 and donated to the Linux Foundation in December 2025, it solves the N×M integration problem: before MCP, connecting five AI hosts to fifty tools required 250 bespoke integrations. MCP collapses that to N+M.

This guide covers the full protocol: its three-role architecture, five primitives, transport choices, OAuth 2.1 authorization, ecosystem, and the framing mistake that leads most implementations astray.

Key Takeaways

  • MCP defines what an AI can do, not just what it can reach. It's an access-control layer first, a capability-expansion layer second.
  • Every MCP server exposes up to five building blocks: Tools (writable), Resources (read-only), Prompts (templates), Elicitation (user input mid-flow), and Sampling (server-side LLM calls).
  • The transport decision is simple: STDIO for local and single-user deployments; Streamable HTTP for remote, multi-tenant, and SaaS. HTTP+SSE is deprecated as of the November 2025 spec.
  • The MCP Registry grew from ~400 servers at its September 2025 launch to 2,000+ by November 2025 (407% growth). Community directories list 4,000+ by early 2026.

What Is MCP?

MCP (Model Context Protocol) is a JSON-RPC 2.0-based protocol that standardizes communication between AI applications and the external systems they access. It defines what the AI is allowed to do and how to do it: the security interface, not just the data pipe.

Two Anthropic engineers, David Soria Parra and Justin Spahr-Summers, created the protocol as a side project in mid-2024. Anthropic open-sourced MCP in November 2024 with Claude Desktop as the first client and pre-built servers for GitHub, Slack, Google Drive, and Postgres.

Why MCP Matters in 2026

The N×M problem is the reason. Before MCP, every combination of AI host and external tool required its own bespoke integration code. Five hosts, fifty tools: 250 separate integrations, each brittle and non-transferable between systems.

MCP collapses that to N+M. Each AI host writes one MCP client. Each service ships one MCP server.

Any client can talk to any server.

Alex Albert of Anthropic called 2025 "the year of MCP and tool use," with 2026 moving toward computer environments and filesystems. The MCP specification is now governed by the Agentic AI Foundation (AAIF) under the Linux Foundation, with eight founding Platinum members: Anthropic, OpenAI, Google, Microsoft, AWS, Block, Bloomberg, and Cloudflare.

Google Trends for "model context protocol" peaked at 100/100 during the week of March 1-7, 2026. By May 2026, the index settled at 11-20, reflecting stable adoption rather than a hype spike. The primary keyword "mcp" draws 90,500 searches per month in the US alone, with low competition.

How MCP Works: The Three-Role Architecture

The single most common error in MCP coverage is conflating the Host and the Client. They're different entities with distinct responsibilities.

Role

What it is

Real examples

Host

The AI application the user interacts with. Manages orchestration, consent UI, and model access. One process.

Claude Desktop, Cursor, VS Code, ChatGPT

Client

A protocol-level component inside the Host. Maintains a strict 1-to-1 connection with exactly one server. Three servers = three Client objects.

VS Code's MCP client for the GitHub server

Server

A lightweight program that exposes tools, resources, and prompts. Has no knowledge of which LLM is on the other end. Can be ~40 lines of Python.

GitHub MCP server, Stripe MCP, Supabase

When a user opens a session, the Host initiates connections via Client objects. Each Client sends tools/list, resources/list, and prompts/list to discover what the server offers. The model query gets enriched with resource context.

When the model decides to use a tool, it sends tools/call; the Client executes via the server and incorporates the response.

The wire protocol is JSON-RPC 2.0. Every message is a Request (has an ID, expects a response), Response (references the ID, has a result or error), or Notification (no ID, fire-and-forget).

The Bidirectional Surprise

MCP is bidirectional: the Client sends requests to the Server, and the Server sends requests back to the Client. The reversal surprises developers accustomed to standard HTTP APIs. STDIO works as a long-lived duplex stream rather than discrete request-response cycles.

The Five MCP Primitives

Every MCP server exposes up to five building blocks. Most production servers implement two or three. Understanding which primitive does what determines whether your implementation is actually secure.

1. Tools: Writable, Executable Functions

Tools are what the LLM actively invokes. They can write to databases, call external APIs, modify files. Each tool carries a name, a description (the only information the LLM sees when deciding whether to call it), an inputSchema in JSON Schema, and optional annotations.

The annotations are the security levers: readOnlyHint, destructiveHint, and idempotentHint tell the Host what consent UI to display before execution. The 2025 spec added a structuredContent field on responses for agent-to-agent payloads.

User consent is required before any tool with side effects executes. This is not optional in compliant implementations.

2. Resources: Read-Only Context

Resources are passive data the application (not the model) surfaces as context. Each resource has a uri (e.g., file:///etc/hosts, postgres://db/orders/1274), a name, a description, and a mimeType.

Resources are returned by resources/read as text or base64 blobs. Think of them as GET endpoints in a REST API: the server controls what the model sees, but the model cannot mutate anything through the resource interface. Clients can subscribe for real-time change notifications.

This is the distinction most guides skip. Tools are writable; Resources are not. Conflating the two breaks the security model.

3. Prompts: User-Initiated Templates

Prompts are pre-built instruction templates the user (not the model) chooses to invoke. They guide the model through repeatable workflows. A "Plan a vacation" prompt might load travel tools plus calendar resources in a structured interaction pattern.

Prompts are the least-implemented primitive in production servers. They're the right interface for building reliable, repeatable multi-step workflows without hard-coding instructions into the Host.

4. Elicitation: Servers Request User Input Mid-Flow

New in the November 2025 spec. Servers can request specific information from users during an ongoing interaction. The Host shows the request; the user responds; the server continues.

URL Mode Elicitation handles OAuth and credential acquisition: the server sends users to a browser, receives tokens directly, and the MCP client never sees the credentials. This is the correct pattern for PCI-compliant payment flows.

5. Sampling: Server-Side LLM Calls

The primitive that makes developers re-read the spec. The Server sends sampling/createMessage back to the Client, which runs a completion on the connected LLM (with user consent) and returns the result.

Sampling lets agentic servers reason without bundling their own LLM API key. The Client must show users exactly what's being requested and allow them to edit or reject. No silent completions.

The Real Design Intent: Access Control, Not Capability Expansion

The dominant public narrative frames MCP as "giving AI more power." That framing is backwards.

u/chrisza4 in r/programming (Jan 2026) put the practitioner consensus directly:

"It is much better to call psql via MCP servers instead of giving LLM direct access to DB. So, we can control what LLM can do or cannot. I don't want my LLM to fix the bug by dropping the whole table, even in my dev environment. Unlike many hypes, the point of MCP is not about making AI more powerful but it is to actually make them as powerful as we want, and not more than that."

You don't give an LLM raw database access. You build an MCP server that exposes exactly the operations you intend it to perform. The server is the access-control boundary.

Vercel's internal agent work (May 2026) reinforces this from production data: they removed 80% of their agent's tools to restore accuracy. Agentic requests now carry 59% of all token volume, up from 32% six months prior, with teams averaging 35 models at scale. Fewer tools, correctly scoped, outperform a broad tool surface.

Security Considerations

MCP is a communication protocol. Security depends entirely on how hosts and servers implement it.

Five rules compliant Hosts must enforce:

  1. User consent per tool call for destructive or write operations
  2. Scope operations to declared roots so filesystem servers only access approved paths
  3. Treat tool output as untrusted and sanitize before acting on server responses
  4. Require consent for sampling to prevent silent LLM calls
  5. OAuth scopes limit blast radius so compromised tokens only access scoped resources

Supply chain risk is real. Malicious MCP servers can silently exfiltrate data or execute destructive actions. The MCP Registry uses namespace authentication, but depends on upstream package registries (npm, PyPI, Docker Hub) for vulnerability scanning.

Prompt injection via instructions embedded in resources is documented. Container isolation is recommended for production MCP infrastructure.

Community criticism persists on auth architecture. u/apnorton in r/programming (Jun 2025):

"The core issue with MCPs is, essentially, that there's no sufficient trust boundary anywhere. It's like the people who designed it threw out the past 40 years of software engineering practice and decided to yolo their communication design."

That criticism was written before the November 2025 spec formalized OAuth 2.1. It still applies to older servers that haven't upgraded. If you're maintaining a server built against the original spec, check the current specification for the auth changes.

Transport and Authorization

STDIO vs. Streamable HTTP

The decision rule is a single question: is this server local or remote?

Transport

Use case

Status

STDIO

Local desktop tools, CLI agents, Claude Code plugins

Stable

HTTP + SSE

Older remote servers

Deprecated (Nov 2025)

Streamable HTTP

Remote, multi-tenant, serverless

Current standard

STDIO spawns the server as a subprocess and communicates over stdin/stdout with newline-delimited JSON-RPC. It's the right choice when you control both ends and security is established by OS identity.

Streamable HTTP uses a single endpoint for all communication. The Client POSTs JSON-RPC; the server responds with JSON for short operations, or upgrades to SSE stream for long tool calls. Streamable HTTP supports resumption via Last-Event-ID and survives serverless deployments where the older HTTP+SSE pattern failed.

Local models face a practical constraint: each MCP server's tool definitions consume 600-800 tokens in the system prompt.

u/claythearc in r/LocalLLaMA (Jul 2025) documented the impact: models with effective context under 5K tokens can exhaust their working memory before any user message, just from registering a few servers. Keep tool registrations lean on constrained local deployments.

OAuth 2.1 Authorization

For STDIO, no separate auth is needed: OS identity handles the user. For remote transports, the November 2025 spec formalizes OAuth 2.1 with PKCE as the standard. WorkOS's MCP auth guide covers the full implementation.

The authorization component stack:

  • Dynamic Client Registration (RFC 7591): clients register themselves without human pre-provisioning
  • Client ID Metadata Documents (SEP-991): URL-based client self-registration replaces manual per-user registration flows
  • Authorization code flow with PKCE: browser consent, code to access token, Bearer header on every request
  • Cross App Access (SEP-990): sign in once, get access to every authorized MCP server the user has approved
  • Fine-grained scoping: e.g., read:report, tools:generate_summary

Hard rule: do not use static API keys for public MCP servers. Static API keys leak, can't be revoked per user, and tie server access to the app rather than the human.

For any public or multi-user remote server, implement OAuth 2.1 with PKCE. The upfront cost is a few hours; the exposure from a leaked static key is unbounded.

The MCP Ecosystem in 2026

MCP governance moved to the AAIF in December 2025. On r/ClaudeAI, a significant minority interpreted the open-governance donation as Anthropic deprioritizing the protocol. That reading is wrong: open governance eliminates single-vendor spec risk and lock-in concerns.

Adoption Timeline

Date

Milestone

Nov 2024

Anthropic open-sources MCP; Claude Desktop; GitHub, Slack, Google Drive, Postgres servers

Mar 2025

Microsoft Copilot Studio MCP announced; OpenAI adds MCP to Agents SDK

May 2025

VS Code MCP support generally available

Jun 2025

AWS MCP on Amazon Bedrock

Sep 2025

MCP Registry launches (~400 servers)

Nov 2025

First anniversary spec release (Streamable HTTP, OAuth 2.1, Elicitation, Task lifecycle); Registry surpasses 2,000 servers (407% growth)

Dec 2025

Donated to Linux Foundation; AAIF formed; Google MCP servers announced for Google services

Jan 2026

Anthropic embeds Slack, Figma, Asana, Box, Canva, Clay, Hex, Monday.com, Amplitude in Claude via MCP Apps

Apr 2026

Google-managed MCP servers available to all developers

May 2026

Atlassian ships Cursor-in-Jira; Stripe activates machine payments via agent prompt

Notable Production Servers

Server

Provider

Key capabilities

GitHub

GitHub (official)

Repos, PRs, issues, code search

Stripe

Stripe (official)

Payments, invoices, customers, subscriptions, refunds, and more

Notion

Notion (official)

Pages, databases, search

Linear

Linear (official)

Issues, projects, teams

Figma

Figma (official)

Design files, components, comments

Slack

Anthropic pre-built

Messages, channels, workspace data

Google Drive / Gmail / Calendar

Google-managed

File access, email, events

Atlassian (Jira + Confluence)

Atlassian

Project and knowledge management

Cloudflare

Cloudflare

2,500+ API endpoints compressed to ~1K tokens

Supabase

Supabase (official)

Database, auth, storage

Stripe's MCP server covers the widest production surface among official providers: payments, invoices, customers, subscriptions, and machine payments live via a single agent prompt as of May 2026. Cloudflare compresses 2,500+ API endpoints into roughly 1K tokens using Code Mode, giving it the broadest coverage among provider servers.

Building Your First MCP Server

The minimum viable server is around 40 lines of Python. Here is a weather server using the FastMCP pattern from the official Python SDK:

Shell
# Install uv and create project
curl -LsSf https://astral.sh/uv/install.sh | sh
uv init weather && cd weather
uv venv && source .venv/bin/activate
uv add "mcp[cli]" httpx
Python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather")

@mcp.tool()
async def get_forecast(latitude: float, longitude: float) -> str:
    """Get weather forecast for a location.
    Use when user asks about weather at specific coordinates.
    Do NOT use for city names; use get_forecast_by_city instead."""
    pass  # your implementation here

if __name__ == "__main__":
    mcp.run(transport='stdio')

Add this block to claude_desktop_config.json:

Text
{
  "mcpServers": {
    "weather": {
      "command": "uv",
      "args": ["--directory", "/path/to/weather", "run", "weather.py"]
    }
  }
}

For remote deployment, the TypeScript SDK on Cloudflare Workers gives you 100K requests per day on the free plan. Run wrangler deploy after installing @modelcontextprotocol/sdk.

Simon Willison (@simonw) added Playwright browser automation to Claude Code with a single command (claude mcp add playwright npx '@playwright/mcp@latest'). For well-packaged servers, onboarding is that fast.

For a step-by-step walkthrough of a production-ready server, see our MCP server guide. If you're new to the agent architecture that makes MCP useful, our explainer on AI agents covers the foundational concepts.

Common MCP Mistakes

Treating Tools and Resources as Interchangeable

Tools write; Resources read. Implementing a database query as a Tool instead of a Resource gives the LLM a write handle when you only intended a read handle. Start every primitive decision with the question: should the model be able to change anything through this interface?

Using Static API Keys for Remote Servers

Static API keys can't be revoked per user and tie access to the app identity rather than the human. For any public or multi-user remote server, OAuth 2.1 with PKCE is the correct auth pattern; the exposure from a leaked static key has no ceiling.

Registering Too Many Servers on Small Models

Each server's tool definitions consume 600-800 tokens in the system prompt before any conversation begins. A local model with a 4K effective context window can exhaust its working memory before the user types anything. Start with one or two servers and add more only when the model handles the tool surface without degrading.

Deploying New Remote Servers with HTTP+SSE

HTTP+SSE was deprecated in the November 2025 spec. Any new remote server should use Streamable HTTP. Existing servers built on HTTP+SSE should migrate: the pattern fails on serverless deployments and lacks the resumption support that Last-Event-ID provides in Streamable HTTP.

Writing Tool Descriptions for Humans, Not Models

The description field is the only information the LLM uses when deciding whether to call a tool. Write it like briefing a capable intern who cannot read your code: what the tool does, when to use it, and when not to. Vague descriptions produce wrong calls; precise descriptions with explicit negative examples produce accurate ones.

MCP vs. Alternatives

MCP won the standards race without a real contest. No alternative reached comparable ecosystem traction before the AAIF formation in December 2025.

Dimension

MCP

REST API

Function Calling (OpenAI)

Tool discovery

Automatic (tools/list)

Manual documentation

Custom schemas per model

Multi-server support

Native

Manual routing

Not inherent

Portability

Any MCP-compatible host

N/A

Tied to one provider

Streaming

Native (Streamable HTTP)

Varies

Varies

Auth

OAuth 2.1 / OS identity

Varies

API key / OAuth

Governance

Linux Foundation / AAIF

N/A

Vendor (OpenAI)

Use MCP for AI-first workflows where multiple tools and hosts need to interoperate. Use REST APIs for traditional app-to-app integrations and webhooks where AI is not in the loop. Use function calling when you're locked to one model provider and portability is not a requirement.

UTCP (Universal Tool Calling Protocol) surfaced in r/LocalLLaMA as a stateless, HTTP-native alternative requiring no persistent server process. Practitioners found it more practical for simpler, single-tool use cases. Whether it gains broader ecosystem support by late 2026 is an open question.

Harrison Chase (@hwchase17) (Mar 2025) raised the right structural question early: "If MCP is Zapier, won't the value accrue to the client (not the integrations)?" Anyone building an MCP-dependent business should answer this before committing. The protocol commoditizes server integration; the host and the application layer capture the durable value.

Tags

Frequently Asked Questions

Related Articles

Artificial intelligence statistics 2026

65 Artificial Intelligence Statistics for 2026

The most current artificial intelligence statistics for 2026, covering market size, investment, adoption, workforce impact, productivity, healthcare, and public trust. 65 sourced data points from McKinsey, OECD, Goldman Sachs, Pew Research, and more.