MCP Goes Stateless: What the 2026-07-28 Spec Release Candidate Means for Your Servers

The Model Context Protocol 2026-07-28 release candidate eliminates handshakes and session state. Learn how the stateless core, extensions framework, and authorization hardening change production MCP server architectures.

What Changed in MCP 2026-07-28: From Stateful to Stateless

Many MCP server failures in production stem from assumptions about session state that the protocol never guaranteed. The 2026-07-28 specification release candidate eliminates this entire class of problems by making the protocol stateless at the core. No handshake, no session identifiers, no connection lifecycle to manage. Any request can hit any server instance. This is the first MCP specification that treats horizontal scaling and load balancing as first-class concerns instead of implementation details teams figure out after deployment.

The release candidate introduces three major changes: a stateless protocol core that removes initialize and session tracking, an Extensions framework for capability negotiation, and a Tasks API for long-running operations. Alongside these, the authorization model is hardened with token scopes, and the new MCP Apps concept provides a packaging standard. For teams running MCP servers at scale, the stateless core is the critical change. The rest extends what servers can do once the protocol no longer fights against distributed deployment.

This matters because the old stateful model forced teams into workarounds. Connection pooling, sticky sessions, and centralized state stores became mandatory for anything beyond a single-process server. The 2026-07-28 spec removes that entire burden. Servers become stateless request handlers that can scale horizontally without coordination.

Understanding the Stateless Protocol Core: No More Handshakes or Session IDs

The stateful MCP pattern required an initialize handshake before any operation. Clients sent an initialization request, servers responded with capabilities, and both sides tracked session state for subsequent requests. This created a tight coupling between connection lifecycle and protocol semantics. If a request arrived at a different server instance, the session state was missing and operations failed.

The stateless core removes the handshake entirely. Requests carry all necessary context in headers or request bodies. Servers expose capabilities through a discovery endpoint that clients can query independently. No session identifiers, no connection tracking, no state synchronization between instances.

%% alt: Comparison of stateful vs stateless MCP request flow
flowchart LR
    subgraph StatefulApproach["Stateful: session state required"]
        C1[Client] -->|1. initialize| S1[Server Instance 1]
        S1 -->|2. capabilities + session_id| C1
        C1 -->|3. tool_call with session_id| S1
        S1 -->|4. response| C1
        C1 -.->|5. new request| S2[Server Instance 2]
        S2 -.->|ERROR: no session| C1
        style S2 stroke:#ef4444,fill:#450a0a,color:#fca5a5
    end

    subgraph StatelessApproach["Stateless: any server handles any request"]
        C2[Client] -->|1. discover capabilities| LB[Load Balancer]
        LB --> S3[Server Instance 1]
        S3 -->|2. capability list| C2
        C2 -->|3. tool_call with context| LB
        LB --> S4[Server Instance 2]
        S4 -->|4. response| C2
        C2 -->|5. new request| LB
        LB --> S3
        S3 -->|6. response| C2
    end

    classDef framework fill:#0b3b2e,stroke:#34d399,color:#d1fae5
    classDef uiComponent fill:#2a1840,stroke:#c084fc,color:#f3e8ff
    class LB framework
    class C1,C2 uiComponent

Servers no longer need to maintain connection state. In the stateful approach, requests fail whenever a load balancer without sticky sessions routes a request to an instance that never saw the session's initialize call. The stateless approach eliminates this failure mode because no instance holds anything a later request depends on.

Discovery endpoints replace the initialization handshake. Clients fetch capabilities once and cache them. Servers can update capabilities independently and clients refresh on demand. This separation allows servers to evolve capabilities without protocol-level coordination.
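
The fetch-once, refresh-on-demand behavior described above can be sketched on the client side as a small cache. This is a minimal sketch under stated assumptions: the `Capabilities` shape, the `CapabilityCache` class, and the TTL-based expiry are all hypothetical and not defined by the specification.

```typescript
// Hypothetical shape of a discovery response; field names are assumptions.
interface Capabilities {
  tools: string[];
  extensions: string[];
}

// Caches the discovery response. Refetches when the entry is older than
// ttlMs or after the caller explicitly invalidates it.
class CapabilityCache {
  private cached: Capabilities | null = null;
  private fetchedAt = 0;

  constructor(
    private readonly fetchCapabilities: () => Promise<Capabilities>,
    private readonly ttlMs: number,
    private readonly now: () => number = Date.now,
  ) {}

  async get(): Promise<Capabilities> {
    const expired = this.now() - this.fetchedAt >= this.ttlMs;
    if (this.cached === null || expired) {
      this.cached = await this.fetchCapabilities();
      this.fetchedAt = this.now();
    }
    return this.cached;
  }

  // Call this when a request fails because a capability is no longer offered.
  invalidate(): void {
    this.cached = null;
  }
}
```

Because the fetch function and clock are injected, the cache has no dependency on a particular HTTP client and any instance behind the load balancer can answer the refetch.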

The authorization model shifts to per-request validation. Tokens carry scope information and servers validate each request independently. No session context to maintain, no token refresh logic tied to connection lifecycle. Servers verify the token, check scopes against the requested operation, and respond. The client retries with a new token if authorization fails.
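
The retry-with-a-new-token behavior might look like the following on the client. It is a sketch, not a prescribed client API: `callWithTokenRefresh`, `CallResult`, and the injected `send`, `getToken`, and `refreshToken` functions are hypothetical names. It assumes the server signals an invalid or expired token with HTTP 401 and a missing scope with 403, which is the usual HTTP convention.

```typescript
// Hypothetical result shape: status mirrors an HTTP status code.
interface CallResult<T> {
  status: number;
  body?: T;
}

// Sends a request with the current token and retries exactly once with a
// fresh token if the server answers 401. A 403 is returned as-is: the token
// was valid but lacks the scope, so a refreshed copy of the same grant
// would fail the same way.
async function callWithTokenRefresh<T>(
  send: (token: string) => Promise<CallResult<T>>,
  getToken: () => Promise<string>,
  refreshToken: () => Promise<string>,
): Promise<CallResult<T>> {
  const first = await send(await getToken());
  if (first.status !== 401) {
    return first;
  }
  return send(await refreshToken());
}
```

Nothing here depends on which server instance answers either attempt, which is the point of per-request validation.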

Migrating Your Existing MCP Server: Before and After Code Examples

The migration path from stateful to stateless MCP servers requires removing session tracking and refactoring initialization logic into discovery endpoints. Most teams will need to extract session-specific state into request context and modify authentication to work per-request instead of per-connection.

Here is a typical stateful MCP server structure that teams are running in production today:

// Stateful MCP server (pre-2026-07-28)
class StatefulMCPServer {
  private sessions = new Map<string, SessionContext>();
 
  async initialize(request: InitializeRequest): Promise<InitializeResponse> {
    const sessionId = crypto.randomUUID();
    this.sessions.set(sessionId, {
      clientInfo: request.clientInfo,
      capabilities: this.getCapabilities(),
      createdAt: Date.now()
    });
    
    return {
      sessionId,
      serverInfo: { name: "example-server", version: "1.0.0" },
      capabilities: this.getCapabilities()
    };
  }
 
  async handleToolCall(request: ToolCallRequest): Promise<ToolCallResponse> {
    const session = this.sessions.get(request.sessionId);
    if (!session) {
      throw new Error("Invalid session");
    }
    
    // Session-dependent logic
    const result = await this.executeTool(request.name, request.arguments);
    return { result };
  }
}

The stateless version eliminates the session map and moves all context into the request:

// Stateless MCP server (2026-07-28)
class StatelessMCPServer {
  async discoverCapabilities(): Promise<CapabilitiesResponse> {
    return {
      serverInfo: { name: "example-server", version: "2.0.0" },
      capabilities: {
        tools: this.getAvailableTools(),
        resources: this.getAvailableResources(),
        extensions: ["tasks", "batch-operations"]
      }
    };
  }
 
  async handleToolCall(request: StatelessToolCallRequest): Promise<ToolCallResponse> {
    // Validate the token, then check the tool scope (scopes use the tool:<name> format)
    const auth = await this.validateToken(request.authorization);
    if (!auth.scopes.includes(`tool:${request.name}`)) {
      throw new Error("Insufficient permissions");
    }
    
    // All context in request, no session lookup
    const result = await this.executeTool(
      request.name,
      request.arguments,
      { authorization: auth }
    );
    return { result };
  }
}

The critical difference is that handleToolCall no longer depends on session state. The authorization token carries all necessary context and the server validates it per-request. This pattern allows any server instance to handle any request without coordination.

Teams should refactor authentication middleware to extract token validation into a reusable function. The stateless model makes this natural because validation happens identically for every request. Session-specific caching moves from server memory to the client or an external cache layer.
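
One way to extract that reusable function is sketched below. The names `authorizeToolCall`, `AuthContext`, and `AuthOutcome` are hypothetical, and the `tool:<name>` scope format with a `tool:*` wildcard is assumed to match the scope examples used elsewhere in this article. The token verifier is passed in, so the function holds no state of its own.

```typescript
interface AuthContext {
  userId: string;
  scopes: string[];
}

interface ToolCallRequest {
  authorization: string;
  name: string;
  arguments: unknown;
}

type AuthOutcome =
  | { ok: true; auth: AuthContext }
  | { ok: false; status: 401 | 403; error: string };

// Validates the token and checks the tool scope for a single request.
// Returns a result object instead of throwing so every handler can map
// the outcome to a response in the same way.
async function authorizeToolCall(
  request: ToolCallRequest,
  verifyToken: (token: string) => Promise<AuthContext>,
): Promise<AuthOutcome> {
  let auth: AuthContext;
  try {
    auth = await verifyToken(request.authorization);
  } catch {
    return { ok: false, status: 401, error: "Invalid or expired token" };
  }
  const required = `tool:${request.name}`;
  const granted = auth.scopes.includes(required) || auth.scopes.includes("tool:*");
  if (!granted) {
    return { ok: false, status: 403, error: `Missing scope ${required}` };
  }
  return { ok: true, auth };
}
```

Returning 401 for a bad token and 403 for a missing scope lets clients tell "refresh the token" apart from "request broader permissions".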

The Extensions Framework and Tasks: New Capabilities for MCP Servers

The Extensions framework provides a capability negotiation system that servers and clients use to advertise optional features. Instead of a fixed protocol surface, servers declare which extensions they support and clients adapt their behavior accordingly. The Tasks extension is the first major addition: a standard API for long-running operations that return progress updates.

Extensions work through capability discovery. Servers list supported extensions in the discovery response and clients check for required extensions before attempting operations. This decouples protocol evolution from breaking changes. New extensions can be added without requiring all servers to implement them immediately.
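
A client-side check against the advertised extension list could be as small as the sketch below. The helper names `missingExtensions` and `chooseExecutionMode` are hypothetical, and the extension identifier `"tasks"` is taken from the discovery examples in this article rather than from the specification text.

```typescript
// Returns the required extensions that the server did not advertise.
// An empty array means the client can proceed.
function missingExtensions(
  advertised: readonly string[],
  required: readonly string[],
): string[] {
  return required.filter((name) => !advertised.includes(name));
}

type ExecutionMode = "task" | "sync";

// Picks the Tasks path when the server advertises it and the synchronous
// path otherwise.
function chooseExecutionMode(advertised: readonly string[]): ExecutionMode {
  return advertised.includes("tasks") ? "task" : "sync";
}

// Usage: a client that hard-requires Tasks fails fast before sending work.
const advertisedByServer = ["tasks"];
const missing = missingExtensions(advertisedByServer, ["tasks", "batch-operations"]);
if (missing.length > 0) {
  console.log(`Server is missing extensions: ${missing.join(", ")}`);
}
```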

%% alt: Extension negotiation and task execution flow
flowchart TD
    Client[Client requests capabilities]
    Client --> Discover[Discovery endpoint]
    Discover --> Check{Extensions supported?}
    Check -->|tasks extension present| TaskInit[Initialize task]
    Check -->|extension missing| Fallback[Use synchronous operation]
    
    TaskInit --> Poll[Poll task status]
    Poll --> Progress{Task complete?}
    Progress -->|in progress| Poll
    Progress -->|complete| Result[Receive result]
    Progress -->|failed| Error[Handle error]
    
    Fallback --> SyncOp[Execute synchronously]
    SyncOp --> Result

    classDef userAction fill:#142544,stroke:#7c9cf0,color:#eaf2ff
    classDef framework fill:#0b3b2e,stroke:#34d399,color:#d1fae5
    classDef dataStore fill:#3a2f0b,stroke:#fbbf24,color:#fef3c7
    class Client,TaskInit userAction
    class Discover,Poll framework
    class Progress,Result,Error dataStore

The Tasks API addresses a long-standing gap in MCP: operations that take longer than a request timeout. Before 2026-07-28, teams implemented custom polling or webhook patterns. The Tasks extension standardizes this with a three-endpoint pattern: create task, query status, cancel task.

Tasks return a task identifier immediately and clients poll for completion. This works naturally with the stateless core because task state lives in a backing store, not in server memory. Any server instance can handle status queries or cancellation requests as long as they share access to the task store.
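
The client half of that loop can be sketched as a poller with exponential backoff. This is an illustration under assumptions: `pollTask`, `TaskSnapshot`, and every status name other than `pending` are hypothetical, and the status lookup is injected so the sketch does not assume a particular HTTP client or endpoint path.

```typescript
type TaskStatus = "pending" | "running" | "complete" | "failed" | "cancelled";

interface TaskSnapshot {
  taskId: string;
  status: TaskStatus;
  result?: unknown;
  error?: string;
}

interface PollOptions {
  initialDelayMs: number;
  maxDelayMs: number;
  maxAttempts: number;
}

// Polls until the task reaches a terminal status, doubling the delay after
// each unfinished poll up to maxDelayMs. Throws if the task is still
// unfinished after maxAttempts polls.
async function pollTask(
  getStatus: (taskId: string) => Promise<TaskSnapshot>,
  taskId: string,
  options: PollOptions,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<TaskSnapshot> {
  let delay = options.initialDelayMs;
  for (let attempt = 0; attempt < options.maxAttempts; attempt++) {
    const snapshot = await getStatus(taskId);
    if (snapshot.status !== "pending" && snapshot.status !== "running") {
      return snapshot;
    }
    await sleep(delay);
    delay = Math.min(delay * 2, options.maxDelayMs);
  }
  throw new Error(`Task ${taskId} did not finish after ${options.maxAttempts} polls`);
}
```

Each poll is an independent request, so consecutive polls can land on different server instances without the client noticing.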

The extension model means servers can adopt Tasks incrementally. When a server doesn't advertise the Tasks extension, clients fall back to synchronous operations. Clients that require Tasks can fail fast instead. Either way, the missing feature surfaces at discovery time rather than as a runtime failure mid-operation.

Building a Stateless MCP Server from Scratch in TypeScript

A production-ready stateless MCP server needs three components: a discovery endpoint, request handlers with per-request validation, and a deployment configuration that supports horizontal scaling. The server should treat every request as independent and avoid any in-memory state that couples requests together.

Start with the discovery endpoint that declares capabilities:

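import { FastifyInstance } from 'fastify';
 
// TaskStore, TaskQueue, and AuthService are assumed interfaces over your own
// backing store, job queue, and token verifier. They are not part of Fastify.
export class StatelessMCPServer {
  constructor(
    private app: FastifyInstance,
    private taskStore: TaskStore,
    private taskQueue: TaskQueue,
    private authService: AuthService
  ) {
    this.registerRoutes();
  }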
 
  private registerRoutes() {
    // Discovery endpoint
    this.app.get('/mcp/capabilities', async () => {
      return {
        serverInfo: {
          name: 'stateless-mcp-server',
          version: '1.0.0',
          protocol_version: '2026-07-28'
        },
        capabilities: {
          tools: [
            { name: 'fetch_data', description: 'Fetch external data' },
            { name: 'transform_data', description: 'Transform data' }
          ],
          resources: [
            { uri: 'db://users/*', methods: ['read'] }
          ],
          extensions: ['tasks']
        }
      };
    });
 
    // Tool execution endpoint
    this.app.post('/mcp/tools/call', async (request, reply) => {
      const { authorization, name, arguments: args } = request.body as any;
      
      // Per-request validation
      const auth = await this.validateToken(authorization);
      if (!this.hasToolPermission(auth, name)) {
        return reply.code(403).send({ error: 'Insufficient permissions' });
      }
 
      const result = await this.executeToolStateless(name, args, auth);
      return { result };
    });
 
    // Task creation endpoint (Tasks extension)
    this.app.post('/mcp/tasks', async (request, reply) => {
      const { authorization, operation, arguments: args } = request.body as any;
      
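      const auth = await this.validateToken(authorization);
      // A task runs a tool asynchronously, so it needs the same scope check as a direct call
      if (!this.hasToolPermission(auth, operation)) {
        return reply.code(403).send({ error: 'Insufficient permissions' });
      }
      const taskId = crypto.randomUUID();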
      
      // Store task in external store (Redis, PostgreSQL, etc.)
      await this.taskStore.create(taskId, {
        operation,
        arguments: args,
        status: 'pending',
        createdBy: auth.userId,
        createdAt: Date.now()
      });
 
      // Queue async execution
      await this.taskQueue.enqueue({ taskId, operation, arguments: args });
      
      return { taskId, status: 'pending' };
    });
 
    // Task status endpoint
    this.app.get('/mcp/tasks/:taskId', async (request, reply) => {
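      const { taskId } = request.params as { taskId: string };
      // Read the token from the Authorization header; query strings end up in access logs
      const authorization = (request.headers.authorization ?? '').replace(/^Bearer\s+/i, '');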
      
      const auth = await this.validateToken(authorization);
      const task = await this.taskStore.get(taskId);
      
      if (!task || task.createdBy !== auth.userId) {
        return reply.code(404).send({ error: 'Task not found' });
      }
      
      return {
        taskId,
        status: task.status,
        progress: task.progress,
        result: task.result,
        error: task.error
      };
    });
  }
 
  private async validateToken(token: string) {
    // Validate JWT or API key per-request
    // No session lookup required
    const decoded = await this.authService.verify(token);
    return {
      userId: decoded.sub,
      scopes: decoded.scopes
    };
  }
 
  private hasToolPermission(auth: any, toolName: string): boolean {
    return auth.scopes.includes(`tool:${toolName}`) || 
           auth.scopes.includes('tool:*');
  }
 
  private async executeToolStateless(
    name: string, 
    args: any, 
    auth: any
  ): Promise<any> {
    // Execute without any session state
    switch (name) {
      case 'fetch_data':
        return this.fetchData(args.url, auth);
      case 'transform_data':
        return this.transformData(args.input, args.transformation, auth);
      default:
        throw new Error(`Unknown tool: ${name}`);
    }
  }
}

The server validates authorization on every request and stores task state in an external system. Any instance can handle any request because nothing lives in process memory. The Tasks extension demonstrates how to handle long-running operations without breaking the stateless model.
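
The server above covers task creation and status queries. The third operation in the Tasks pattern, cancellation, is sketched below as store-level logic so it stays independent of any web framework. Everything here is an assumption for illustration: `cancelTask`, `TaskRecord`, the outcome names, and the status values other than `pending`. A `Map` stands in for the shared task store.

```typescript
type TaskStatus = "pending" | "running" | "complete" | "failed" | "cancelled";

interface TaskRecord {
  status: TaskStatus;
  createdBy: string;
}

type CancelOutcome = "cancelled" | "not_found" | "already_finished";

// Cancels a task if the caller owns it and it has not reached a terminal
// status. The Map is a stand-in for the external task store.
function cancelTask(
  store: Map<string, TaskRecord>,
  taskId: string,
  userId: string,
): CancelOutcome {
  const task = store.get(taskId);
  // Report another user's task as missing so task IDs cannot be probed,
  // mirroring the 404 behavior of the status endpoint.
  if (!task || task.createdBy !== userId) {
    return "not_found";
  }
  if (task.status !== "pending" && task.status !== "running") {
    return "already_finished";
  }
  store.set(taskId, { ...task, status: "cancelled" });
  return "cancelled";
}
```

With a real shared store, the read and the write would need to be one atomic step (a conditional update or a transaction), since a worker on another machine may complete the task between the two.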

This stateless server pattern integrates with build-mcp-server-typescript-claude for local development and extends to production deployment with minimal changes. Stateless servers are simpler to reason about and deploy because every request can be understood in isolation.

Production Deployment Patterns: Load Balancing and Horizontal Scaling

The stateless protocol core enables production deployments that were impractical with the old stateful model. Servers can scale horizontally without sticky sessions or state synchronization. Load balancers distribute requests across instances using any algorithm because every instance handles every request type identically.

%% alt: Production MCP deployment with horizontal scaling
flowchart TD
    Client[MCP Client / AI System]
    Client -->|discover capabilities| LB[Load Balancer]
    Client -->|tool requests| LB
    Client -->|task operations| LB
    
    LB --> S1[Server Instance 1]
    LB --> S2[Server Instance 2]
    LB --> S3[Server Instance 3]
    
    S1 --> TaskStore[(Task Store)]
    S2 --> TaskStore
    S3 --> TaskStore
    
    S1 --> Queue[Task Queue]
    S2 --> Queue
    S3 --> Queue
    
    Queue --> W1[Worker 1]
    Queue --> W2[Worker 2]
    
    W1 --> TaskStore
    W2 --> TaskStore

    classDef userAction fill:#142544,stroke:#7c9cf0,color:#eaf2ff
    classDef framework fill:#0b3b2e,stroke:#34d399,color:#d1fae5
    classDef dataStore fill:#3a2f0b,stroke:#fbbf24,color:#fef3c7
    class Client userAction
    class LB,S1,S2,S3,W1,W2 framework
    class TaskStore,Queue dataStore

The deployment pattern separates request handling from task execution. Server instances handle HTTP requests and validate authorization. Workers pull tasks from a queue and update state in the task store. This separation allows scaling request handlers independently from workers.
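
The worker side of that split can be sketched as a function that processes one job at a time. The `Job`, `TaskUpdate`, and `WorkerDeps` shapes are hypothetical, as are the `running`, `complete`, and `failed` status names. The queue and store are injected, so the same function works with whichever backing services a team has chosen.

```typescript
interface Job {
  taskId: string;
  operation: string;
  arguments: unknown;
}

interface TaskUpdate {
  status: "running" | "complete" | "failed";
  result?: unknown;
  error?: string;
}

interface WorkerDeps {
  dequeue: () => Promise<Job | null>;
  update: (taskId: string, update: TaskUpdate) => Promise<void>;
  execute: (job: Job) => Promise<unknown>;
}

// Pulls one job from the queue, runs it, and records the outcome in the
// task store. Returns false when the queue was empty so the caller can
// decide how long to wait before trying again.
async function processNextJob(deps: WorkerDeps): Promise<boolean> {
  const job = await deps.dequeue();
  if (job === null) {
    return false;
  }
  await deps.update(job.taskId, { status: "running" });
  try {
    const result = await deps.execute(job);
    await deps.update(job.taskId, { status: "complete", result });
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    await deps.update(job.taskId, { status: "failed", error: message });
  }
  return true;
}
```

A failed job is recorded rather than rethrown, so one bad task does not take down the worker loop that calls this function.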

Load balancer configuration gets simpler because no session affinity is required. Round-robin, least connections, or any other algorithm works equally well. If an instance fails, the load balancer's health checks route new requests to healthy instances without client intervention; only requests in flight on the failed instance need a retry.

The critical infrastructure component is the shared task store. Redis works well for this because task state is small and access patterns are simple: create, read, update by task ID. PostgreSQL provides stronger consistency if tasks need transaction support. The choice depends on task complexity and failure recovery requirements.

Monitoring shifts from tracking session state to tracking request latency and task completion rates. Servers emit metrics for request handling time, authorization failures, and tool execution duration. Task workers emit metrics for queue depth, task processing time, and failure rates. These metrics provide visibility without requiring session tracking.

This deployment model integrates with claude-agent-sdk-mcp-production-typescript patterns for agentic systems that need reliable MCP server backends. The stateless design eliminates entire classes of distributed systems problems.

Authorization Hardening and MCP Apps: Security Considerations

The 2026-07-28 specification hardens authorization with token scopes and introduces MCP Apps for packaging and distribution. Token scopes allow fine-grained permission control at the tool and resource level. MCP Apps provide a standard format for distributing servers with their authorization requirements declared upfront.

Token scopes map directly to MCP capabilities. A scope like tool:fetch_data grants permission to call the fetch_data tool. A scope like resource:db://users/*:read grants read access to user resources. Servers validate these scopes on every request and reject operations that exceed granted permissions.
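
A scope check along these lines might be written as follows. This is a sketch: the article gives example scope strings but not matching rules, so treating `*` in a granted scope as a wildcard is an assumption, and the helper names `toolScope`, `resourceScope`, `scopeMatches`, and `hasScope` are hypothetical.

```typescript
// Builds the scope strings in the formats shown above.
function toolScope(toolName: string): string {
  return `tool:${toolName}`;
}

function resourceScope(uri: string, action: string): string {
  return `resource:${uri}:${action}`;
}

// Escapes characters that have special meaning in a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// A granted scope matches a required scope when they are equal, or when
// the granted scope contains `*` and the literal parts around each `*`
// line up. Wildcard semantics are an assumption of this sketch.
function scopeMatches(granted: string, required: string): boolean {
  const pattern = granted.split("*").map(escapeRegExp).join(".*");
  return new RegExp(`^${pattern}$`).test(required);
}

function hasScope(grantedScopes: readonly string[], required: string): boolean {
  return grantedScopes.some((granted) => scopeMatches(granted, required));
}
```

For example, a token holding `resource:db://users/*:read` would pass `hasScope` for `resourceScope("db://users/42", "read")` but not for the same URI with `"write"`. Note that `*` here also matches `/`, so stricter deployments may want segment-aware matching.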

%% alt: Authorization flow with scoped tokens
flowchart TD
    Client[Client with token]
    Client -->|request with Authorization header| Server[MCP Server]
    Server --> Validate[Validate token signature]
    Validate --> Extract[Extract scopes from token]
    Extract --> Check{Required scope present?}
    Check -->|scope matches operation| Execute[Execute operation]
    Check -->|scope missing| Reject[Return 403 Forbidden]
    Execute --> Response[Return response]
    Reject --> Response

    classDef userAction fill:#142544,stroke:#7c9cf0,color:#eaf2ff
    classDef framework fill:#0b3b2e,stroke:#34d399,color:#d1fae5
    class Client userAction
    class Server,Validate,Extract,Execute framework
    style Reject stroke:#ef4444,fill:#450a0a,color:#fca5a5

The failure mode here is subtle but expensive: tokens without scope validation allow privilege escalation. A client with permission to call one tool can call any tool if the server doesn't check scopes. The 2026-07-28 spec makes scope validation mandatory.

MCP Apps standardize packaging for distribution. An MCP App includes the server binary, a manifest declaring required capabilities and permissions, and deployment metadata. This allows clients to evaluate security requirements before connecting. The manifest format prevents servers from requesting broader permissions than declared.
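
The manifest check can be illustrated with a small comparison between declared and requested scopes. The `AppManifest` shape below is entirely hypothetical, since the article does not show the manifest format; only the idea of comparing requested permissions against declared ones comes from the text.

```typescript
// Hypothetical manifest shape; the real MCP Apps manifest may differ.
interface AppManifest {
  name: string;
  version: string;
  declaredScopes: string[];
}

// Returns the scopes a server asks for that its manifest never declared.
// An empty array means the request stays within the manifest.
function undeclaredScopes(
  manifest: AppManifest,
  requestedScopes: readonly string[],
): string[] {
  const declared = new Set(manifest.declaredScopes);
  return requestedScopes.filter((scope) => !declared.has(scope));
}

// Usage: a client refuses to connect when the server over-asks.
const exampleManifest: AppManifest = {
  name: "example-server",
  version: "2.0.0",
  declaredScopes: ["tool:fetch_data"],
};
const overAsk = undeclaredScopes(exampleManifest, ["tool:fetch_data", "tool:transform_data"]);
if (overAsk.length > 0) {
  console.log(`Refusing: undeclared scopes ${overAsk.join(", ")}`);
}
```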

The security model assumes tokens are short-lived and clients refresh them frequently. Servers do not cache authorization decisions beyond the current request. This prevents stale permissions from lingering after token revocation. For production systems, integrate with OAuth 2.0 or similar token infrastructure to handle refresh flows.

Teams deploying MCP servers should implement rate limiting per-token in addition to per-IP. This prevents compromised tokens from overwhelming servers. The stateless model makes rate limiting natural because tokens are the only request identifier that persists across instances.
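
Per-token rate limiting can be sketched with a fixed-window counter keyed by a token identifier. `TokenRateLimiter` is a hypothetical class, and the in-memory `Map` is a simplification: counters held in process memory only limit traffic per instance, so a multi-instance deployment would keep them in a shared store instead. The sketch assumes `limit` is at least 1.

```typescript
interface RateWindow {
  startedAt: number;
  count: number;
}

// Fixed-window rate limiter keyed by token identifier. Key on a stable
// identifier from the validated token (for a JWT, the `jti` or `sub`
// claim) rather than on the raw token string.
class TokenRateLimiter {
  private readonly windows = new Map<string, RateWindow>();

  constructor(
    private readonly limit: number,
    private readonly windowMs: number,
    private readonly now: () => number = Date.now,
  ) {}

  // Returns true if the request is allowed and counts it against the window.
  allow(tokenId: string): boolean {
    const current = this.now();
    const existing = this.windows.get(tokenId);
    if (!existing || current - existing.startedAt >= this.windowMs) {
      this.windows.set(tokenId, { startedAt: current, count: 1 });
      return true;
    }
    if (existing.count >= this.limit) {
      return false;
    }
    existing.count += 1;
    return true;
  }
}
```

A fixed window is the simplest option and allows short bursts at window boundaries; a sliding window or token bucket smooths that out at the cost of more bookkeeping.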

Should You Migrate? When the 2026-07-28 Spec Makes Sense

The 2026-07-28 specification eliminates complexity for teams running MCP servers in distributed environments. Migrate if your deployment uses multiple server instances, requires horizontal scaling, or struggles with session state management. The stateless core provides immediate value in these scenarios.

Hold off if your server runs as a single process for a single client. The stateful model works fine in that case and migration overhead exceeds the benefits. Wait until you need scaling or until the ecosystem fully adopts the new specification.

The Extensions framework future-proofs servers against protocol evolution. Even if you don't implement Tasks immediately, advertising exactly the extensions you do support lets clients adapt, for example by falling back to synchronous calls. This matters more as the ecosystem builds on the 2026-07-28 foundation.

The core change is simpler servers that scale horizontally without state synchronization. Extensions add capabilities without breaking existing deployments. Authorization gets hardened with scopes. Teams running multiple instances should see the operational difference soonest, since sticky sessions and shared session stores are the first things to go. For teams packaging MCP-enabled tools, see claude-code-plugin-packaging-guide for distribution best practices.