How to build an MCP server

Goal

Expose a system to AI agents through a [Model Context Protocol](/mcp) server — with tools an agent can actually use well, credentials it cannot leak, and logging that tells you what it did.

Before you start

  • A target system with an API or library you can call, and a credential scoped for the access you intend to expose
  • A development machine with Python or Node.js; the official Python and TypeScript SDKs have the deepest support
  • An MCP client to test against (Claude Desktop, Claude Code, Cursor, or VS Code all speak the protocol)

Steps

  1. Decide what the server exposes, as verbs rather than endpoints

    List the five to ten things an agent should be able to do with your system, phrased as actions: search_invoices, create_ticket, get_customer. Resist mirroring your REST API one-to-one — an agent choosing between forty thinly-described endpoints performs worse than one choosing between eight purposeful tools, and every tool you expose is permission surface you now own. Start with reads; add writes once the reads behave.
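
One way to keep the tool surface deliberate is to write it down as data before any handler exists. The sketch below is only an illustration: `ToolSpec`, `TOOL_SURFACE`, and `enabled_tools` are hypothetical names, and the three tools are the ones named above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    name: str      # verb-phrased action, not an endpoint path
    purpose: str   # one sentence on what the agent gets from it
    writes: bool   # True if the tool changes state in the target system


# Hypothetical tool surface, kept small on purpose.
TOOL_SURFACE = [
    ToolSpec("search_invoices", "Find invoices by customer, status, or date.", writes=False),
    ToolSpec("get_customer", "Fetch one customer record by ID.", writes=False),
    ToolSpec("create_ticket", "Open a support ticket for a customer.", writes=True),
]


def enabled_tools(allow_writes: bool) -> list[str]:
    """Return the tool names to register; reads only until writes are switched on."""
    return [t.name for t in TOOL_SURFACE if allow_writes or not t.writes]
```

Calling `enabled_tools(False)` during the first rollout registers only the read tools; flipping the flag later adds the writes without touching the rest of the list.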

  2. Scaffold with an official SDK

    Use the official Python or TypeScript SDK rather than implementing the protocol by hand — the specification at modelcontextprotocol.io covers the wire format, but the SDKs handle sessions, capability negotiation, and transport so your code is mostly tool definitions. A minimal server is a few dozen lines: define tools, give each a name, description, and input schema, and register handlers.
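
To make concrete what "define tools and register handlers" amounts to, here is a standard-library sketch of the two tool-related requests, `tools/list` and `tools/call`. It is not the SDK's API and not a working server: it skips initialisation, capability negotiation, sessions, and transport, which is exactly the part the SDKs handle. `handle_request`, `TOOLS`, and `HANDLERS` are hypothetical names.

```python
import json

# Each tool has a name, a description, and an input schema (JSON Schema).
TOOLS = {
    "get_customer": {
        "description": "Fetch one customer record by ID.",
        "inputSchema": {
            "type": "object",
            "properties": {"customer_id": {"type": "string"}},
            "required": ["customer_id"],
        },
    }
}


def get_customer(arguments: dict) -> str:
    # Hypothetical handler; a real one would call the target system here.
    return json.dumps({"id": arguments["customer_id"], "name": "Example Ltd"})


HANDLERS = {"get_customer": get_customer}


def handle_request(request: dict) -> dict:
    """Answer one JSON-RPC request for the two tool methods."""
    method = request.get("method")
    if method == "tools/list":
        result = {"tools": [{"name": name, **spec} for name, spec in TOOLS.items()]}
    elif method == "tools/call":
        params = request["params"]
        text = HANDLERS[params["name"]](params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        return {
            "jsonrpc": "2.0",
            "id": request.get("id"),
            "error": {"code": -32601, "message": f"Method not found: {method}"},
        }
    return {"jsonrpc": "2.0", "id": request.get("id"), "result": result}
```

With an SDK, the dispatch function disappears and only the tool definitions and handlers remain in your code.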

  3. Write tool descriptions as the interface they are

    The model reads your tool names, descriptions, and parameter schemas to decide what to call and how — they are an API contract with a non-deterministic consumer. State what each tool does, when to use it, and what it returns; constrain parameters with enums and formats rather than prose; and say what the tool must NOT be used for when misuse is plausible. A vague description is a bug that manifests as the agent calling the wrong tool.
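
As an illustration, the definition below states what the tool does, when to use it, what it returns, and what it is not for, and it constrains parameters with an enum and a date pattern instead of prose. The field values, the 20-match limit, and the small `check_arguments` helper are assumptions made for this sketch; a real server would more likely rely on the SDK or a JSON Schema library for validation.

```python
import re

SEARCH_INVOICES = {
    "name": "search_invoices",
    "description": (
        "Search invoices by status and issue date. Use this when you do not "
        "already have an invoice ID. Returns at most 20 matches, each with "
        "id, customer_id, status, and total. Do NOT use this to fetch a "
        "single invoice whose ID you already know."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "status": {"type": "string", "enum": ["draft", "open", "paid", "void"]},
            "issued_after": {
                "type": "string",
                "format": "date",
                "pattern": r"^\d{4}-\d{2}-\d{2}$",
            },
        },
        "required": ["status"],
        "additionalProperties": False,
    },
}


def check_arguments(schema: dict, arguments: dict) -> list[str]:
    """Return human-readable problems; an empty list means the arguments pass.

    Covers only required keys, unknown keys, enum, and pattern.
    """
    problems = []
    props = schema["properties"]
    for key in schema.get("required", []):
        if key not in arguments:
            problems.append(f"missing required parameter '{key}'")
    for key, value in arguments.items():
        if key not in props:
            problems.append(f"unknown parameter '{key}'")
            continue
        rule = props[key]
        if "enum" in rule and value not in rule["enum"]:
            problems.append(f"'{key}' must be one of {rule['enum']}, got {value!r}")
        if "pattern" in rule and not re.fullmatch(rule["pattern"], str(value)):
            problems.append(f"'{key}' must match {rule['pattern']}, got {value!r}")
    return problems
```

Because the allowed values live in the schema, a wrong call produces a precise message the agent can act on, not a guess based on the description text.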

  4. Keep credentials on the server side

    The server holds the credential for the target system; the client and the model never see it. Scope that credential to exactly what the tools do — a read-only server gets a read-only key. If the server will act on behalf of different users, resolve the user's own authorisation per session instead of wielding one god-token for everyone; the audit trail downstream should distinguish who the agent was working for.
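
A minimal sketch of both ideas, under stated assumptions: the environment variable `TARGET_API_KEY`, the in-memory `SESSION_AUTH` store, and the handler are all hypothetical, and how a session's authorisation actually gets resolved depends on your transport and identity provider.

```python
import os


def load_server_credential() -> str:
    """Read the server's own key from its environment (variable name is assumed)."""
    key = os.environ.get("TARGET_API_KEY")
    if not key:
        raise RuntimeError("TARGET_API_KEY is not set in the server's environment")
    return key


# Hypothetical per-session store: session ID -> the user's identity and token.
SESSION_AUTH: dict[str, dict] = {}


def credential_for(session_id: str) -> dict:
    """Use the user's own authorisation; never fall back to a shared token."""
    auth = SESSION_AUTH.get(session_id)
    if auth is None:
        raise PermissionError("no authorisation resolved for this session")
    return auth


def get_customer(session_id: str, customer_id: str) -> dict:
    auth = credential_for(session_id)
    # A real handler would call the target system with auth["token"] here.
    record = {"id": customer_id, "name": "Example Ltd"}
    # Only data goes back to the client and model; the token stays on the server.
    return {"customer": record, "acted_for": auth["user"]}
```

Carrying the user identity alongside the result (here `acted_for`) is one way to let downstream logs distinguish who the agent was working for.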

  5. Choose the transport for where it will run

    stdio runs the server as a local child process of the client — right for personal and development use, and the default most examples assume. Streamable HTTP makes it a network service multiple agents can reach — which is the moment it needs real authentication (OAuth is the spec's answer), TLS, and the threat model of any other exposed service. Build local-first, but decide before anyone else depends on it which deployment you are actually supporting.
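
One way to make the deployment decision explicit is a startup check that refuses to serve over HTTP until TLS and authentication are configured. This is a hypothetical guard, not part of any SDK; `Deployment` and `startup_problems` are names invented for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    transport: str                 # "stdio" or "http"
    tls_enabled: bool = False      # assumed flag set by your deployment config
    auth_configured: bool = False  # e.g. an OAuth setup is in place


def startup_problems(deployment: Deployment) -> list[str]:
    """Return reasons the server should not start; an empty list means go."""
    if deployment.transport == "stdio":
        # Local child process of the client: no network exposure to guard.
        return []
    if deployment.transport != "http":
        return [f"unknown transport {deployment.transport!r}"]
    problems = []
    if not deployment.tls_enabled:
        problems.append("HTTP transport requires TLS")
    if not deployment.auth_configured:
        problems.append("HTTP transport requires authentication")
    return problems
```

Failing at startup turns the move from local process to shared service into a decision someone has to make, not something that happens by default.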

  6. Test with a real client, then try to break it

    Wire the server into an MCP client and watch an agent use it on real tasks; the MCP Inspector lets you exercise tools directly when you need to isolate behaviour. Then probe the failure modes: malformed parameters, gigantic results, the model calling tools in an order you did not anticipate. How the server fails matters as much as how it works: an error message that explains itself gets corrected by the agent, whereas a bare stack trace tends to get retried unchanged.
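
The difference between a self-explaining error and a stack trace can be sketched as a wrapper around each handler. The result shape (a `content` list plus an `isError` flag) follows the protocol's tool result; `run_tool`, `get_invoice`, and the `inv_` ID format are hypothetical.

```python
import logging

log = logging.getLogger("mcp_server")


def run_tool(handler, arguments: dict) -> dict:
    """Run a handler and turn failures into messages the agent can act on."""
    try:
        text = handler(arguments)
        return {"content": [{"type": "text", "text": text}], "isError": False}
    except (KeyError, ValueError) as exc:
        # Input problems: say what was wrong so the agent can correct the call.
        message = f"Invalid input: {exc}. Fix the arguments and call the tool again."
    except Exception:
        # Unexpected failures: keep the traceback in server logs, not in the reply.
        log.exception("tool handler crashed")
        message = "Internal error in the tool. Retrying with the same arguments will not help."
    return {"content": [{"type": "text", "text": message}], "isError": True}


def get_invoice(arguments: dict) -> str:
    invoice_id = arguments.get("invoice_id", "")
    if not invoice_id.startswith("inv_"):
        raise ValueError(f"invoice_id must look like 'inv_123', got {invoice_id!r}")
    return f"Invoice {invoice_id}: status open"
```

Feeding the wrapper malformed input, such as `run_tool(get_invoice, {"invoice_id": "42"})`, is a quick way to check that the reply names the problem and the expected format.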

  7. Log every invocation and register the server

    Record each tool call — which tool, which parameters, which session, what was returned — because your server is now part of an agent's audit story and will be asked questions during someone else's incident. Then register it wherever your organisation tracks agent infrastructure: owner, target system, credential scope, deployment. An MCP server nobody knows about is shadow infrastructure with a credential inside.
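
A decorator is one simple way to record every call without repeating logging code in each handler. The sketch assumes handlers take a session ID and an arguments dict; the logger name, the record fields, and the 200-character result preview are arbitrary choices, and where the records end up depends on the logging handlers you configure.

```python
import functools
import json
import logging
import time

audit = logging.getLogger("agent_audit")
audit.setLevel(logging.INFO)  # attach a handler (file, syslog, ...) in deployment


def audited(tool_name: str):
    """Wrap a handler so each invocation emits one JSON audit record."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(session_id: str, arguments: dict):
            started = time.time()
            outcome, result = "error", None
            try:
                result = handler(session_id, arguments)
                outcome = "ok"
                return result
            finally:
                # Runs on success and on failure, so no call goes unrecorded.
                audit.info(json.dumps({
                    "ts": started,
                    "tool": tool_name,
                    "session": session_id,
                    "arguments": arguments,
                    "outcome": outcome,
                    "result_preview": repr(result)[:200],
                }, default=str))
        return wrapper
    return decorator


@audited("get_customer")
def get_customer(session_id: str, arguments: dict) -> dict:
    # Hypothetical handler standing in for a call to the target system.
    return {"id": arguments["customer_id"], "name": "Example Ltd"}
```

If parameters or results can contain sensitive values, redact them before they reach the log; the audit trail should not become a second copy of the data.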

Common pitfalls

  • Mirroring the API instead of designing tools. Forty endpoint-shaped tools maximise both the agent's confusion and your permission surface; the work is in choosing the eight that matter.
  • Fat tools that do everything via a mode parameter — they defeat permission scoping, because you can no longer grant the safe half without the dangerous half.
  • Treating descriptions as documentation rather than interface. The model acts on what the description says; nobody reviews it after the demo works, and behaviour drifts when it is wrong.
  • Assuming stdio forever. The jump from local child process to shared HTTP service changes the threat model completely, and it tends to happen by enthusiasm rather than decision.
  • Returning raw, unbounded results. A 50,000-row query result blows the agent's context and your token bill; cap, paginate, and summarise on the server side.
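
For the last pitfall, a cap-and-paginate helper might look like the sketch below. `MAX_PAGE_SIZE`, the integer cursor, and the summary line are assumptions; in a real server the limit would normally be pushed down into the query so the full result is never loaded into memory.

```python
MAX_PAGE_SIZE = 50  # assumed hard cap, regardless of what the agent asks for


def paginate(rows: list, cursor: int = 0, limit: int = 20) -> dict:
    """Return one bounded page plus what the agent needs to fetch the next."""
    limit = max(1, min(limit, MAX_PAGE_SIZE))
    cursor = max(0, cursor)
    page = rows[cursor:cursor + limit]
    has_more = cursor + limit < len(rows)
    return {
        "rows": page,
        "total": len(rows),
        "next_cursor": cursor + limit if has_more else None,
        "summary": f"Showing {len(page)} of {len(rows)} rows starting at {cursor}.",
    }
```

Even if the agent requests 1,000 rows, it receives at most `MAX_PAGE_SIZE`, together with the total and a cursor, which is usually enough for it to decide whether to narrow the search or keep paging.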

Frequently asked questions

Python or TypeScript for an MCP server?

Both SDKs are first-party and current; choose the language your team can operate in production. Python dominates the examples and data tooling; TypeScript fits teams already running Node services. The design work — tool surface, descriptions, credential scoping — transfers unchanged.

How is an MCP server different from just giving the agent my REST API?

Mechanically it is a thin layer; the difference is who the interface is designed for. REST endpoints assume a developer reading docs; MCP tools assume a model reading descriptions mid-task. The server is where you curate which capabilities an agent gets, under which credential, with which guardrails — that curation is the point, not the protocol.
