
MCP from first principles
MCP (at its core) is a standard for providing tools to LLMs. If I were to ask you to write a spec for MCP, it would probably look something like this:
- A `/list` endpoint. It accepts only `GET` requests and returns a JSON response of type `{ tools: { name: string; description: string; inputSchema: JSONSchema }[] }`.
- A `/call` endpoint. It accepts only `POST` requests with a body of type `{ tool: string; input: JSON }`, and returns a JSON response of type `string`.
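Written down as TypeScript types, that imagined two-endpoint spec is about this much. This is a sketch of the hypothetical API from above, not real MCP, and the `get_weather` tool is an invented example:

```ts
// The imagined /list response: tool metadata plus a JSON Schema for inputs.
type JSONSchema = Record<string, unknown>;

type ListResponse = {
  tools: { name: string; description: string; inputSchema: JSONSchema }[];
};

// The imagined /call request body, and its plain-string response.
type CallRequest = { tool: string; input: unknown };
type CallResponse = string;

// Example payloads conforming to the sketch above (tool name is made up).
const listExample: ListResponse = {
  tools: [
    {
      name: "get_weather",
      description: "Look up the weather for a city",
      inputSchema: { type: "object", properties: { city: { type: "string" } } },
    },
  ],
};
const callExample: CallRequest = { tool: "get_weather", input: { city: "Oslo" } };
```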
Yet it's many times more complex than this. To understand why, I first need to ask you how you'd write another spec: MCP with local servers. Since it would be wasteful to restart an application on every tool call, or to run a pointless HTTP server, you'd use standard IO: the client writes a numbered call, the server reads it on standard input, and it replies on standard output for the client to read. Simple enough, and MCP agrees with you on this - it likes streaming a lot. In fact, it likes it so much that remote MCPs also stream.
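That stdio framing boils down to newline-delimited JSON-RPC. A rough sketch of the client's side of it, simulating the pipe with plain strings rather than spawning a real server process:

```ts
type JsonRpcRequest = { jsonrpc: "2.0"; id: number; method: string; params: unknown };
type JsonRpcResponse = { jsonrpc: "2.0"; id: number; result?: unknown; error?: unknown };

// Client side: serialize a numbered call as one line for the server's stdin.
const encodeCall = (id: number, method: string, params: unknown): string =>
  JSON.stringify({ jsonrpc: "2.0", id, method, params } satisfies JsonRpcRequest) + "\n";

// Client side: parse one line of the server's stdout back into a response,
// to be matched up with the pending call by id.
const decodeReply = (line: string): JsonRpcResponse => JSON.parse(line);

// Simulated round trip: what would actually travel over the pipes.
const wire = encodeCall(1, "tools/list", {});
const reply = decodeReply('{"jsonrpc":"2.0","id":1,"result":{"tools":[]}}');
```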
How would you write a spec for using streaming to make MCPs work remotely? You'd probably reach for WebSockets, since they give you a bidirectional stream. But here we flip back to MCP being many times more complex than you'd expect. There are two ways to connect to a remote MCP server:
- Central SSE + separate calls. This was the first implementation of remote MCP: each client receives responses through an always-open connection to `/sse` and sends requests by POSTing to `/messages`. This split is inherently problematic, which leads us to...
- Streamable HTTP. The name promises a return to the basics from the start of this post, but it delivers no such thing. With most servers, everything works by POSTing something to `/mcp` and getting a stream back. Most will require you to `initialize` yourself to get an `mcp-session-id`, then use it for all other call requests. Most keep the stream open for 10 seconds after sending back the main data.
- Missing: simple stateless HTTP and WebSockets (I'd urge you to request them in this issue, and consider reading its author's post that led to it).
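Concretely, the stream you get back from `/mcp` is Server-Sent Events: a sequence of `data: <json>` frames separated by blank lines, one of which carries the reply to your call. A minimal sketch of pulling the payload out of such a body (the example response below is illustrative, not from a real server):

```ts
// Extract the JSON payload of the first `data:` frame in an SSE body.
const firstSseData = (body: string): unknown => {
  for (const line of body.split("\n")) {
    if (line.startsWith("data: ")) return JSON.parse(line.slice(6));
  }
  throw new Error("no data frame in stream");
};

// Roughly what a streamable HTTP server might send back for one call.
const exampleStream =
  'event: message\ndata: {"jsonrpc":"2.0","id":"1","result":{"ok":true}}\n\n';
```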
We can cope with this though. Streamable HTTP, however problematically designed, is usable. Take a look.
```ts
export type Tool = {
  name: string;
  description: string;
  inputSchema: unknown;
};

export type ToolContent =
  | { type: "text"; text: string }
  | { type: "image"; data: string; mimeType: string }
  | { type: "audio"; data: string; mimeType: string };

export const connect = async (url: string) => {
  let sessionId: string | null = null;

  // POST one JSON-RPC message, attaching the session id once we have one.
  const rpc = async (method: string, params: unknown) => {
    const headers: Record<string, string> = {
      accept: "application/json, text/event-stream",
      "content-type": "application/json",
    };
    if (sessionId) {
      headers["mcp-session-id"] = sessionId;
    }
    const body: Record<string, unknown> = {
      jsonrpc: "2.0",
      // Notifications must not carry an id; JSON.stringify drops undefined keys.
      id: method.startsWith("notifications/") ? undefined : crypto.randomUUID(),
      method,
      params,
    };
    const r = await fetch(url, {
      method: "POST",
      headers,
      body: JSON.stringify(body),
    });
    if (!r.ok) {
      throw new Error(`${url} is ${r.status}ing`);
    }
    return r;
  };

  // Servers may answer with plain JSON or an SSE stream; in the latter case,
  // take the first `data:` frame as the response and bail out of the stream.
  const parseFromRpc = async <T>(r: Response): Promise<{ result: T }> => {
    const contentType = r.headers.get("content-type");
    // startsWith, not ===, so `application/json; charset=utf-8` still matches.
    if (contentType?.startsWith("application/json")) {
      return await r.json();
    }
    if (contentType?.startsWith("text/event-stream")) {
      let buffer = "";
      const decoder = new TextDecoder();
      for await (const bytes of r.body!) {
        // stream: true keeps multi-byte characters split across chunks intact.
        buffer += decoder.decode(bytes, { stream: true });
        const lines = buffer.split("\n");
        buffer = lines.pop()!; // keep the trailing partial line for the next chunk
        for (const line of lines) {
          if (!line.startsWith("data: ")) continue;
          const data = line.slice(6).trim();
          if (data) return JSON.parse(data);
        }
      }
    }
    throw new Error(`Unknown type ${contentType}`);
  };

  // Handshake: initialize, remember the session id, then confirm we're ready.
  const initializeR = await rpc("initialize", {
    protocolVersion: "2025-03-26",
    capabilities: {},
    clientInfo: {
      name: "YOUR_CLIENT_HERE",
      version: "1.0.0",
    },
  });
  sessionId = initializeR.headers.get("mcp-session-id");
  await rpc("notifications/initialized", {});

  return {
    async list() {
      const r = await rpc("tools/list", {});
      const {
        result: { tools },
      } = await parseFromRpc<{ tools: Tool[] }>(r);
      return tools;
    },
    async call(name: string, args: unknown) {
      const r = await rpc("tools/call", { name, arguments: args });
      const {
        result: { content },
      } = await parseFromRpc<{ content: ToolContent[] }>(r);
      return {
        content,
        // Seed reduce with "" so a tool returning no text content doesn't throw.
        text: content
          .map((c) => ("text" in c ? c.text : ""))
          .reduce((acc, v) => acc + v, ""),
      };
    },
  };
};
```
That was a minimal MCP-over-streamable HTTP client in <100 lines of TypeScript. It's not compatible with servers that don't respond to an RPC with a single, same-stream response, lacks authentication, doesn't support many parts of the MCP spec, and probably isn't suitable for production. But most importantly: it isn't bloated.