In November 2024, Anthropic published a technical spec and two SDKs on GitHub. Eighteen months later, the Model Context Protocol (MCP) has been adopted by OpenAI, Google, Microsoft, and AWS, crossed 97 million monthly SDK downloads, and been donated to the Linux Foundation's Agentic AI Foundation. No other infrastructure protocol in AI history has consolidated a fragmented ecosystem this fast.
MCP is now the plumbing underneath the agentic AI economy. If you are building AI agents in 2026, understanding MCP is not optional. This guide covers what MCP is, what changed this year, and how to manage MCP infrastructure at production scale.
What problem MCP solves
Before MCP, every AI application built its own integrations from scratch. A coding assistant that needed to read your Git repo, query your Jira tickets, and search your Confluence docs required three bespoke connectors. Each was brittle, each maintained separately, each incompatible with any other tool.
The result was a combinatorial explosion: M models times N tools equals M times N custom integrations. For enterprise deployments with dozens of systems and multiple AI providers, this was untenable.
MCP solves this with a standardized client-server protocol built on JSON-RPC. One connector format works with every AI client that speaks MCP. Instead of M × N integrations, you get M + N: each client implements the MCP client protocol once, and each tool implements the MCP server protocol once. Every client works with every server automatically.
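The scaling difference is easy to quantify. A quick back-of-the-envelope sketch (the model and tool counts below are illustrative, not from any real deployment):

```python
# Compare integration counts for point-to-point wiring (M x N)
# versus a shared protocol like MCP (M + N).
def integrations_without_mcp(models: int, tools: int) -> int:
    # Every AI client needs a bespoke connector for every tool.
    return models * tools

def integrations_with_mcp(models: int, tools: int) -> int:
    # Each client implements the protocol once; each tool does too.
    return models + tools

# Hypothetical enterprise: 4 AI providers, 30 internal systems.
print(integrations_without_mcp(4, 30))  # 120 bespoke connectors
print(integrations_with_mcp(4, 30))     # 34 protocol implementations
```

The gap widens with every system you add: the point-to-point count grows multiplicatively, the protocol count linearly.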
How MCP works
MCP defines a three-layer architecture:
| Layer | Role | Example |
|---|---|---|
| Host | The application users interact with | Claude Desktop, VS Code, a custom chatbot |
| Client | Component inside the host that manages MCP connections | The MCP client library in your app |
| Server | Lightweight program exposing capabilities | A GitHub MCP server, a database query server |
Each MCP server exposes capabilities through three primitives:
- Tools: Executable functions the AI can invoke (e.g., `create_issue`, `run_query`)
- Resources: Data the AI can read (e.g., file contents, database schemas)
- Prompts: Reusable prompt templates the server provides
When an AI agent needs to take an action, it discovers available tools from connected MCP servers, selects the right tool, constructs the arguments, and the MCP client executes the call. The server runs the operation and returns results through the same protocol.
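On the wire, both discovery and invocation are plain JSON-RPC 2.0 requests using MCP's `tools/list` and `tools/call` methods. A minimal sketch of the two messages (the tool name and arguments here are invented for illustration):

```python
import json

# Step 1: the client asks a connected MCP server which tools it offers.
list_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list",
}

# Step 2: once the model has picked a tool and constructed arguments,
# the client sends a tools/call request. "create_issue" and its
# arguments are hypothetical examples.
call_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "create_issue",
        "arguments": {"title": "Fix login bug", "labels": ["bug"]},
    },
}

print(json.dumps(call_request, indent=2))
```

The server's response travels back over the same JSON-RPC channel, which is what makes every client interoperable with every server.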
What changed in 2026
MCP Apps: Tools that return UI
The biggest MCP extension of 2026 is MCP Apps. Tools can now return interactive UI components that render directly in the conversation: dashboards, forms, visualizations, and multi-step workflows.
Before MCP Apps, a sales analytics tool would return a wall of text with numbers. Now it returns an interactive dashboard where users filter by region, drill into accounts, and export reports without leaving the conversation. ChatGPT, Claude, VS Code, and Goose all ship support for MCP Apps.
MCP v2 Beta: Breaking changes for multi agent systems
The @ai-sdk/mcp v2.0.0-beta.3 landed in March 2026 with breaking changes to imports and type names. The redesign signals that MCP is no longer an experiment. The protocol is being hardened for the multi-agent production systems that are shipping in 2026.
Key changes include stricter auth conformance, improved error normalization in the OpenAI Agents SDK integration, and a structured Task API in Google ADK for agent-to-agent delegation.
The 72% context window problem
A widely cited benchmark shows that 72% of an agent's context window gets consumed by MCP tool schemas alone when connecting to multiple servers. With 10,000+ public servers in the ecosystem, teams are hitting this ceiling hard.
The solution is selective tool exposure: do not dump every available tool into the agent's context. Instead, whitelist only the tools each agent actually needs. This is exactly what Requesty's MCP Gateway does at the infrastructure level.
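A minimal sketch of selective exposure, assuming tool schemas are plain dicts as returned by `tools/list` (the tool names are illustrative, and the token estimate is a rough heuristic, not a real tokenizer):

```python
import json

def whitelist_tools(all_tools: list[dict], allowed: set[str]) -> list[dict]:
    """Expose only explicitly allowed tools to the agent's context."""
    return [t for t in all_tools if t["name"] in allowed]

def estimate_schema_tokens(tools: list[dict]) -> int:
    # Rough heuristic: ~1 token per 4 characters of serialized schema.
    return sum(len(json.dumps(t)) // 4 for t in tools)

all_tools = [
    {"name": "create_issue", "description": "Create a GitHub issue",
     "inputSchema": {"type": "object", "properties": {"title": {"type": "string"}}}},
    {"name": "delete_repo", "description": "Delete a repository",
     "inputSchema": {"type": "object", "properties": {"repo": {"type": "string"}}}},
    {"name": "run_query", "description": "Run a read-only SQL query",
     "inputSchema": {"type": "object", "properties": {"sql": {"type": "string"}}}},
]

# This agent only needs issues and queries; delete_repo stays hidden.
exposed = whitelist_tools(all_tools, allowed={"create_issue", "run_query"})
print([t["name"] for t in exposed])
print(f"Schema tokens: {estimate_schema_tokens(all_tools)} -> {estimate_schema_tokens(exposed)}")
```

The same filter applied at the gateway level means the saving happens once, for every connected AI tool, instead of per client.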
Governance and security
Microsoft released the Agent Governance Toolkit (AGT), an open source runtime governance layer for MCP tool execution. The core problem: MCP standardizes the execution surface without defining how that surface should be governed. Tool definitions are fed directly to the model, tool servers can be hosted by anyone, and there is no built-in checkpoint for policy evaluation before a call is executed.
For enterprise deployments, this means you need a gateway layer that can answer: is this agent allowed to invoke this tool, with these arguments, at this time?
Managing MCP at scale
Running a few MCP servers locally is straightforward. Running dozens of MCP servers across an organization with multiple AI tools, multiple teams, and compliance requirements is an infrastructure problem.
The challenges at scale:
- Authentication sprawl. Each MCP server has its own auth. GitHub needs a PAT, Notion needs an integration token, Linear needs an API key. Managing these across an organization is a credential management nightmare.
- Tool discovery overload. With thousands of available tools, agents suffer from context window pollution and decision paralysis.
- No centralized observability. Each AI tool (Claude Code, Cursor, VS Code) has its own MCP connections. There is no single pane of glass showing which tools are being used, how often, and how fast.
- Security gaps. Without a gateway, every AI tool connects directly to every MCP server. There is no policy enforcement layer between the agent and the tool.
How Requesty's MCP Gateway solves this
Requesty's MCP Gateway sits between your AI tools and your MCP servers, providing centralized management, security, and observability.
Your AI tools (Claude Code, Cursor, Roo Code) connect to Requesty with a single API key. Requesty connects to your MCP servers (GitHub, Notion, Linear) with authenticated credentials. All analytics flow to a single dashboard.
Centralized auth — Manage all MCP server credentials in one place. Organization-wide keys for shared services, per-user keys for personal accounts (Enterprise).
Tool whitelisting — Select exactly which tools from each server to expose via MCP Server Management. Reduces context window usage and prevents agents from accessing tools they should not use.
Usage analytics — Track request volume, latency, success rates, and tool usage across all MCP servers and all AI tools in one dashboard.
Enterprise security — AES-256 encryption, organization isolation, TLS 1.3, and granular RBAC. Complete audit trail for compliance.
Setting up the MCP Gateway
Step 1: Enable the MCP Gateway. Navigate to Settings > Integrations > MCP Gateway in your Requesty dashboard.
Step 2: Add MCP servers. Use pre-configured templates for popular services (GitHub, Notion, Linear, Context7) or add custom servers:
```json
{
  "name": "internal-db",
  "url": "https://mcp.internal.company.com",
  "type": "streamable-http",
  "headers": {
    "Authorization": "Bearer {{API_KEY}}"
  }
}
```

Step 3: Whitelist tools. Click Explore Server to discover available tools, then select only the ones your team needs. This keeps agent context windows clean and reduces the 72% schema overhead.
Step 4: Connect your AI tools. Configure Claude Code, Cursor, or any MCP-compatible client to connect through Requesty. Claude Code automatically discovers MCP servers through your Requesty API key. For Cursor:
```json
{
  "mcp": {
    "provider": "requesty",
    "apiKey": "YOUR_REQUESTY_API_KEY"
  }
}
```

For Roo Code:
```json
{
  "mcp": {
    "endpoint": "https://router.requesty.ai/mcp",
    "auth": "Bearer YOUR_REQUESTY_API_KEY"
  }
}
```

Step 5: Monitor usage. Open the MCP Analytics dashboard to see real-time metrics: request volume, average latency, success rates, and per-tool usage breakdowns.
MCP + Agent SDKs: The full stack
MCP does not replace agent SDKs. It complements them. The agent SDK handles the agent loop (think, act, observe). MCP handles the tool layer (discover, invoke, return).
Here is how the three major SDKs integrate with MCP in 2026:
Claude Agent SDK + MCP
The Claude Agent SDK has native MCP support. You can define MCP servers directly in the agent configuration:
```python
from claude_agent_sdk import ClaudeAgentOptions, tool, create_sdk_mcp_server

@tool("search_issues", "Search GitHub issues", {"query": str})
async def search_issues(args):
    # Tool implementation
    return {"content": [{"type": "text", "text": f"Found 5 issues for: {args['query']}"}]}

server = create_sdk_mcp_server(
    name="project-tools",
    version="1.0.0",
    tools=[search_issues],
)

options = ClaudeAgentOptions(
    mcp_servers={"tools": server},
    allowed_tools=["mcp__tools__search_issues"],
)
```

Route the underlying model calls through Requesty by setting `ANTHROPIC_BASE_URL="https://router.requesty.ai"`. Now the agent uses MCP for tools and Requesty for model routing, giving you the best of both.
OpenAI Agents SDK + MCP
The OpenAI Agents SDK added first-class MCP support in v0.12. MCP servers are treated as tool providers that the agent can discover and invoke:

```python
from agents import Agent
from agents.mcp import MCPServerStreamableHTTP

async with MCPServerStreamableHTTP(
    url="https://router.requesty.ai/mcp",
    headers={"Authorization": "Bearer YOUR_REQUESTY_API_KEY"},
) as mcp_server:
    agent = Agent(
        name="Project Manager",
        model="policy/frontier-with-fallback",
        instructions="Help manage the project using available tools.",
        mcp_servers=[mcp_server],
    )
```

Google ADK + MCP
ADK wraps MCP servers through its tool interface. Combined with A2A for agent-to-agent communication, you can build systems where local agents use MCP tools and remote agents communicate via A2A:
```python
from google.adk import Agent
from google.adk.tools.mcp import MCPTool

github_tool = MCPTool(
    server_url="https://router.requesty.ai/mcp",
    headers={"Authorization": "Bearer YOUR_REQUESTY_API_KEY"},
)

agent = Agent(
    name="developer",
    model="google/gemini-2.5-pro",
    instruction="Help with code reviews and issue management.",
    tools=[github_tool],
)
```

The MCP ecosystem by the numbers
| Metric | Value (May 2026) |
|---|---|
| Public MCP servers | 10,000+ |
| Monthly SDK downloads | 97 million |
| Supported languages | TypeScript, Python, Java, Kotlin, C#, Swift |
| Transports | Streamable HTTP, SSE (stdio coming) |
| Foundation members | AWS, Anthropic, Google, Microsoft, OpenAI |
| AI tools with MCP support | Claude, ChatGPT, VS Code, Cursor, Goose, Roo Code |
Best practices for production MCP
Whitelist tools aggressively. Only expose tools your agents actually need. Every unnecessary tool consumes context window tokens and increases the chance of tool misuse. Requesty's MCP Server Management lets you pick exactly which tools to enable per server.
Use a gateway, not direct connections. Direct MCP connections scatter credentials, provide no observability, and bypass security policies. A centralized gateway gives you auth management, tool whitelisting, analytics, and audit trails in one layer.
Monitor latency per tool. Some MCP servers are slow. A database query tool might take 5 seconds while a file search takes 50ms. Use MCP Analytics to identify slow tools and optimize or replace them.
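The aggregation an analytics layer runs is straightforward; a minimal sketch over hypothetical latency samples (using the standard library's `statistics` module):

```python
from statistics import mean, quantiles

# Hypothetical latency samples in milliseconds, keyed by tool name.
samples = {
    "file_search": [42, 55, 48, 51, 47, 60, 44, 49],
    "db_query": [4800, 5200, 4500, 6100, 4900, 5300, 4700, 5000],
}

for tool_name, latencies in samples.items():
    p95 = quantiles(latencies, n=20)[-1]  # last of 19 cut points = p95
    print(f"{tool_name}: mean={mean(latencies):.0f}ms p95={p95:.0f}ms")
```

Mean latency hides tail behavior, which is what users actually feel when an agent stalls mid-task, so track a high percentile per tool rather than a single average.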
Rotate credentials regularly. MCP server API keys should be rotated every 90 days. With Requesty, you update the credential once in the gateway and every connected AI tool picks up the new key automatically.
Separate dev and prod MCP configs. Use different MCP server configurations for development and production. Development configs can be more permissive; production configs should whitelist only verified tools with proper auth.
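One way to keep the two configurations side by side, with the production whitelist strictly narrower (the server names, tool names, and `"*"` wildcard convention here are all illustrative):

```python
# Illustrative per-environment MCP server configuration.
MCP_CONFIGS = {
    "dev": {
        "github": {"tools": "*"},        # permissive: expose everything
        "internal-db": {"tools": "*"},
    },
    "prod": {
        "github": {"tools": ["create_issue", "search_issues"]},  # verified only
        "internal-db": {"tools": ["run_query"]},
    },
}

def exposed_tools(env: str, server: str, available: list[str]) -> list[str]:
    """Resolve which of a server's tools this environment exposes."""
    spec = MCP_CONFIGS[env][server]["tools"]
    return available if spec == "*" else [t for t in available if t in spec]

available = ["create_issue", "search_issues", "delete_repo"]
print(exposed_tools("dev", "github", available))
print(exposed_tools("prod", "github", available))
```

Keeping both environments in one structure makes drift visible in code review: a tool cannot quietly appear in production without showing up in the diff.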
What is coming next for MCP
The MCP roadmap for the rest of 2026 focuses on production hardening:
- Stdio transport support: Direct process-based MCP servers for local development
- Server discovery: Standardized way for clients to find and connect to MCP servers without manual configuration
- Long-running tasks: Support for MCP operations that take minutes or hours, not milliseconds
- Enterprise authentication: OAuth and SSO integration at the protocol level
- Event-driven triggers: MCP servers that push notifications to clients, not just respond to requests
Getting started
1. Sign up for Requesty. Create an account at app.requesty.ai and get your API key.
2. Enable the MCP Gateway. Go to Settings > Integrations > MCP Gateway in your dashboard.
3. Add your first MCP server. Start with GitHub, Notion, or Linear using the built-in templates.
4. Connect your AI tools. Point Claude Code, Cursor, or Roo Code at Requesty. See the MCP Integration guide for step-by-step instructions.
Further reading
- MCP Gateway Overview — Full documentation for Requesty's MCP Gateway.
- MCP Server Management — Register and configure MCP servers for your organization.
- MCP Analytics — Monitor MCP server usage, performance, and user activity.
- MCP Integration Guide — Connect Claude Code, Cursor, and Roo Code to your MCP servers.
- Top MCP Gateways — How Requesty and other MCP gateways compare.
- Agent Harness — Why your LLM gateway is the backbone of production agents.
Frequently asked questions
- What is the Model Context Protocol (MCP)?
  MCP is a standardized client-server protocol built on JSON-RPC that lets AI agents discover and use external tools through a consistent interface. Instead of building custom integrations for each AI tool and each data source, MCP provides a universal connector format: M + N integrations instead of M × N.
- How many MCP servers are available in 2026?
  Over 10,000 public MCP servers are available as of May 2026, with 97 million monthly SDK downloads across TypeScript, Python, Java, Kotlin, C#, and Swift. Every major AI lab (Anthropic, OpenAI, Google, Microsoft) and foundation members like AWS support MCP.
- What is the 72% context window problem with MCP?
  A widely cited benchmark shows that 72% of an agent's context window gets consumed by MCP tool schemas alone when connecting to multiple servers. The solution is selective tool exposure: whitelist only the tools each agent actually needs instead of dumping every available tool into context.
- What are MCP Apps?
  MCP Apps is the biggest MCP extension of 2026. Tools can now return interactive UI components (dashboards, forms, visualizations) that render directly in the conversation. ChatGPT, Claude, VS Code, and Goose all ship support for MCP Apps.
- How does an MCP gateway help manage MCP at scale?
  An MCP gateway like Requesty's sits between your AI tools and your MCP servers, providing centralized authentication management, tool whitelisting to reduce context pollution, usage analytics across all tools and users, and enterprise security with AES-256 encryption and audit trails.