I've Been Teaching MCP Wrong for 6 Months!
The Wake-Up Call
Alright. So at work, I’ve been consulting a few teams to design MCP servers.
I have a solution in prod and some teams were interested to learn from my experience.
All my experiments on my blog, YouTube channel, and the solution at work were based on SSE—meaning the transport layer of my MCP servers was Server-Sent Events.
However, one of the devs I was consulting recently told me they had read that the SSE transport was deprecated and that STDIO and Streamable HTTP were now the only choices. I flatly denied it and told them there was no way I would have missed such an update.
They shared the link with me and I just couldn’t believe I had missed that update. An update that had come in late March 2025.
I felt embarrassed as I was recommending SSE transport for all our projects.
Luckily, the change to streamable-http wasn’t big, and we got the migration done quickly.
But I realized staying current with rapidly evolving AI protocols isn’t optional—it’s critical.
Missing a single protocol update can mean building on deprecated foundations, giving outdated advice, and potentially creating scalability issues that won’t surface until production load hits.
I know I’m a bit late here—but let’s understand why SSE was deprecated and learn how to build MCP servers with Streamable HTTP transport.
Why SSE Was Deprecated
The SSE (Server-Sent Events) transport was part of the original MCP specification, but it had fundamental architectural problems that became apparent at scale:
1. Two-Endpoint Architecture
SSE required maintaining two separate endpoints:
GET /sse for server-to-client streaming
POST /messages for client-to-server requests
This split architecture created complexity in connection management and made deployment harder.
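To make the split concrete, here’s a rough sketch of what an SSE-transport client had to juggle. It’s illustrative only: the endpoint paths come from the old spec, but the local URL and the payload are assumptions, and it uses the httpx package directly rather than the MCP SDK.
Python
# Illustrative only: the two endpoints an SSE-transport client had to manage.
# Uses httpx directly (not the MCP SDK); URL and payload are simplified assumptions.
import asyncio
import httpx

async def sse_transport_sketch(base_url: str) -> None:
    async with httpx.AsyncClient() as client:
        # Endpoint 1: long-lived GET /sse stream for server -> client messages.
        async with client.stream("GET", f"{base_url}/sse") as stream:
            # Endpoint 2: every client -> server request is a separate POST /messages.
            await client.post(
                f"{base_url}/messages",
                json={"jsonrpc": "2.0", "id": 1, "method": "ping"},
            )
            # Responses come back on the GET stream, not on the POST.
            async for line in stream.aiter_lines():
                print(line)
                break

asyncio.run(sse_transport_sketch("http://localhost:8000"))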
2. Connection Limit Hell
SSE connections are long-lived HTTP connections. Under load, servers quickly hit OS-level file descriptor limits (~1024 on most systems). Performance testing showed:
1000+ concurrent TCP connections with SSE
Execution times 4x slower than Streamable HTTP
Success rates dropping sharply after ~1024 concurrent users
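If you’re curious what that limit looks like on your own machine, Python’s standard library will tell you (Linux/macOS; the default soft limit on most Linux distros is 1024):
Python
import resource

# Per-process file descriptor limits; every open SSE connection consumes one.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft limit: {soft}, hard limit: {hard}")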
3. No Resumability
If an SSE connection dropped, there was no way to resume from where you left off. The entire operation had to restart.
4. One-Way Communication
SSE only supports server-to-client streaming. Client-to-server communication required separate HTTP POST requests, creating an asymmetric architecture.
5. Cloud Deployment Nightmare
Long-lived connections don’t play well with:
Serverless functions (AWS Lambda, Cloud Run)
Load balancers with connection timeouts
Auto-scaling infrastructure
Container orchestration platforms
Enter Streamable HTTP
Streamable HTTP was introduced in protocol version 2025-03-26 and addresses all of SSE’s limitations. Here’s what makes it better:
Single Endpoint Architecture
Everything goes through one endpoint: /mcp
Connection Reuse
Instead of 1000 concurrent SSE connections, Streamable HTTP reuses TCP connections, keeping the count to just ~50 even under high load.
Flexible Response Format
The server chooses the response type based on the operation:
Quick operations → Instant JSON response
Long operations → SSE stream with progress updates
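Here’s a minimal sketch of what that looks like with the Python SDK’s FastMCP. The server and tool names are made up, and the report_progress call follows the SDK’s Context API, so treat the details as version-dependent.
Python
# Sketch, assuming the official Python SDK's FastMCP; names are illustrative.
from mcp.server.fastmcp import FastMCP, Context

mcp = FastMCP("reports")

@mcp.tool()
async def build_report(ctx: Context, pages: int = 10) -> str:
    """Long operation: the server streams progress updates while it works."""
    for page in range(pages):
        await ctx.report_progress(progress=page + 1, total=pages)
    return f"Report with {pages} pages is ready."

@mcp.tool()
def ping() -> str:
    """Quick operation: returns immediately as a plain JSON response."""
    return "pong"

if __name__ == "__main__":
    mcp.run(transport="streamable-http")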
Stateless Mode
Perfect for cloud deployments—no session state to maintain, works beautifully with serverless and auto-scaling.
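In the Python SDK, FastMCP exposes a stateless_http flag for exactly this scenario; the flag name and behaviour may vary with the SDK version, so verify against the release you’re on.
Python
from mcp.server.fastmcp import FastMCP

# stateless_http=True: no per-session state, so any instance can serve any request.
# (Flag name as I understand the Python SDK; check your SDK version.)
mcp = FastMCP("stateless-demo", stateless_http=True)

@mcp.tool()
def echo(text: str) -> str:
    return text

if __name__ == "__main__":
    mcp.run(transport="streamable-http")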
Migration from SSE to Streamable HTTP
The migration is surprisingly simple:
Python
# Old (SSE)
mcp.run(transport="sse")
# New (Streamable HTTP)
mcp.run(transport="streamable-http")
Client Code
# Old
from mcp.client.sse import sse_client

async with sse_client(url) as (read, write):
    # ...

# New
from mcp.client.streamable_http import streamablehttp_client

async with streamablehttp_client(url) as (read, write, _):
    # ...
That’s it. The tool definitions, resources, and prompts remain unchanged.
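For context, here’s what a fuller client looks like once you wire the new transport into a ClientSession. The localhost URL and the /mcp path are assumptions for a locally running streamable-http server.
Python
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def main() -> None:
    # The third value (ignored here, as in the snippet above) is a session-ID helper.
    async with streamablehttp_client("http://localhost:8000/mcp") as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

asyncio.run(main())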
Key Takeaways
SSE has been deprecated since protocol version 2025-03-26 (March 26, 2025)
Streamable HTTP is the standard for production deployments
Migration is straightforward - mostly just changing transport configuration
Performance is significantly better - 4x faster with better scalability
Cloud-friendly - Works seamlessly with serverless and auto-scaling