How to secure MCP servers and clients
Lock down the MCP layer your agents depend on — vetted servers, authenticated connections, least-privilege tools, and injection-aware handling of what flows through them.
Before you start
- A list of the MCP servers in use; if you do not have one, complete the catalogue step in the running guide first
- Control over how clients are configured across the team — security that depends on every laptop's JSON file needs a distribution mechanism
- A secrets manager for server-side credentials
Steps
- 1
Treat every server as a supply-chain decision
An MCP server is code that runs with the credentials you hand it, so installing one is the same act as adding a dependency — with API keys attached. Maintain an allowlist of reviewed servers, pin their versions, and require review before anything new joins it. For community servers, read what the tools actually do before trusting what the README says they do; the gap between the two is the attack.
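The allowlist check can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a packaging tool: the names AllowlistEntry, artifact_digest and is_approved are hypothetical, and it assumes you have the server artifact as bytes and recorded its SHA-256 digest when it was reviewed.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class AllowlistEntry:
    """One reviewed server: the exact version and artifact that passed review."""
    name: str
    version: str
    sha256: str  # hex digest of the artifact recorded at review time


def artifact_digest(artifact: bytes) -> str:
    """Return the SHA-256 hex digest of a server artifact."""
    return hashlib.sha256(artifact).hexdigest()


def is_approved(allowlist: Dict[str, AllowlistEntry], name: str,
                version: str, artifact: bytes) -> bool:
    """Approve only a known server at its pinned version with unchanged bytes."""
    entry = allowlist.get(name)
    if entry is None:
        return False  # never reviewed: not on the allowlist
    if entry.version != version:
        return False  # a different version needs its own review
    return entry.sha256 == artifact_digest(artifact)
```

Pinning the digest as well as the version means a republished package under the same version string still fails the check.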
- 2
Authenticate both legs of every connection
Two connections exist per server and each needs its own answer. Client to server: stdio's local child process needs no separate authentication, but the moment a server is reachable over the network it needs real authentication. OAuth is what the specification provides for user-facing flows, with signed tokens or mutual TLS for service-to-service. Server to target system: a scoped credential from your secrets manager, never from a config file on someone's laptop.
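To make both legs concrete, here is a rough sketch using only the Python standard library. It illustrates the signed-token idea for service-to-service calls with an HMAC over a small payload; it is not an OAuth implementation, and a real deployment would more likely use an established token library or mutual TLS. The function names, the token layout, and the TARGET_API_TOKEN variable are all assumptions made for the example, as is the idea that your secrets manager injects the credential into the server's environment.

```python
import base64
import hashlib
import hmac
import json
import os
import time
from typing import Mapping, Optional


def sign_token(secret: bytes, caller: str, ttl_seconds: int,
               now: Optional[float] = None) -> str:
    """Issue a short-lived token naming the caller (hypothetical format)."""
    issued = time.time() if now is None else now
    payload = json.dumps({"sub": caller, "exp": issued + ttl_seconds}).encode("utf-8")
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode("ascii") + "."
            + base64.urlsafe_b64encode(signature).decode("ascii"))


def verify_token(secret: bytes, token: str,
                 now: Optional[float] = None) -> Optional[str]:
    """Return the caller name if the token is authentic and unexpired, else None."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None  # wrong key or tampered payload
    claims = json.loads(payload)
    current = time.time() if now is None else now
    if claims["exp"] < current:
        return None  # expired
    return claims["sub"]


def target_credential(env: Mapping[str, str] = os.environ,
                      name: str = "TARGET_API_TOKEN") -> str:
    """Read the server-to-target credential from the process environment.

    Assumes the secrets manager injects it at start-up; the variable name
    is hypothetical. Fails loudly rather than falling back to a config file.
    """
    value = env.get(name)
    if not value:
        raise RuntimeError("missing credential: " + name)
    return value
```

Because the verified token yields a caller name, the same check that authenticates the connection also supplies the identity needed for per-caller logging later.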
- 3
Scope the tool surface to least privilege
Permissions live at the tool level, so design them that way: read tools separated from write tools, no mode parameters that smuggle dangerous capability into safe-looking tools, and per-tool credentials where the target system allows it. The test is whether you can grant an agent the harmless half of a server without the consequential half — if you cannot, the tools are cut wrong.
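One way to picture tool-level scoping is a registry in which every tool declares its access level and its own credential reference. The sketch below is hypothetical throughout: the tool names, the Access enum, and the credential_ref strings are invented for illustration and are not part of any MCP SDK.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, FrozenSet, List


class Access(Enum):
    READ = "read"
    WRITE = "write"


@dataclass(frozen=True)
class Tool:
    name: str
    access: Access
    credential_ref: str  # which scoped credential this tool uses


# Read and write capabilities are separate tools; there is no "mode" parameter.
TOOLS: Dict[str, Tool] = {
    tool.name: tool
    for tool in (
        Tool("search_tickets", Access.READ, "tickets-readonly"),
        Tool("get_ticket", Access.READ, "tickets-readonly"),
        Tool("update_ticket", Access.WRITE, "tickets-write"),
        Tool("delete_ticket", Access.WRITE, "tickets-write"),
    )
}


def tools_for(grant: FrozenSet[Access]) -> List[str]:
    """List the tool names an agent with this grant is allowed to see."""
    return sorted(name for name, tool in TOOLS.items() if tool.access in grant)


def authorize(tool_name: str, grant: FrozenSet[Access]) -> bool:
    """Allow a call only if the tool exists and its access level is granted."""
    tool = TOOLS.get(tool_name)
    return tool is not None and tool.access in grant
```

With this shape, granting frozenset({Access.READ}) hands an agent the harmless half of the server and nothing else, which is the test the step describes.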
- 4
Treat tool descriptions and tool results as attack surface
Two injection paths are specific to this layer. A malicious or compromised server can carry instructions in its tool descriptions — text every connected model reads — and any server can return results that contain injected instructions, which the agent then treats as context. Review descriptions at install and on every update, and treat server results like any retrieved content: data to be processed, never instructions to be followed, with the consequential actions gated downstream regardless.
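Both defences can be sketched briefly. The first half fingerprints tool descriptions at review time so that any later change, addition, or removal is flagged for a human to re-read. The second half gates consequential tools on an approval that does not depend on anything the model read. All names here are hypothetical, and the sketch assumes you can obtain each server's descriptions as a plain name-to-text mapping.

```python
import hashlib
from typing import Dict, FrozenSet, List


def fingerprint(descriptions: Dict[str, str]) -> Dict[str, str]:
    """Map each tool name to the SHA-256 hex digest of its description text."""
    return {
        name: hashlib.sha256(text.encode("utf-8")).hexdigest()
        for name, text in descriptions.items()
    }


def changed_tools(reviewed: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """Return tools whose description was added, removed, or altered.

    `reviewed` holds fingerprints stored at the last review;
    `current` holds the raw descriptions the server advertises now.
    """
    current_fp = fingerprint(current)
    names = set(reviewed) | set(current_fp)
    return sorted(n for n in names if reviewed.get(n) != current_fp.get(n))


def may_execute(tool_name: str, consequential: FrozenSet[str],
                human_approved: bool) -> bool:
    """Gate consequential tools on approval, whatever the tool results said."""
    return tool_name not in consequential or human_approved
```

The gate deliberately takes no input derived from tool results, so injected text in a result cannot argue its way past it.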
- 5
Contain what a server can reach
Run servers with the isolation you would give any semi-trusted code: containers or sandboxes rather than the host shell, egress limited to the systems the tools actually need, and resource caps so a misbehaving server degrades rather than exhausts. Containment is what turns a bad server from an estate-wide incident into a revoked allowlist entry.
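As one possible shape for containment, the sketch below builds a docker run command line with a read-only filesystem, dropped capabilities, no network by default, and memory, CPU and process caps. It only constructs the argument list and does not launch anything. The function name, the default limits, and the image name in the usage note are assumptions; a server that must reach a target system would pass a named network whose egress rules are defined elsewhere instead of "none".

```python
from typing import List


def docker_run_args(image: str, network: str = "none", memory: str = "256m",
                    cpus: str = "0.5", pids_limit: int = 64) -> List[str]:
    """Build a `docker run` argv for a semi-trusted stdio server.

    The defaults are illustrative, not recommendations for any workload.
    """
    return [
        "docker", "run",
        "--rm",                                   # discard the container on exit
        "-i",                                     # keep stdin open for stdio transport
        "--read-only",                            # no writes to the root filesystem
        "--cap-drop", "ALL",                      # drop all Linux capabilities
        "--security-opt", "no-new-privileges",    # block privilege escalation
        "--network", network,                     # "none" disables networking entirely
        "--memory", memory,                       # memory cap
        "--cpus", cpus,                           # CPU cap
        "--pids-limit", str(pids_limit),          # process-count cap
        image,
    ]
```

For example, docker_run_args("example/mcp-server:1.4.2") yields a list that can be passed to subprocess.run once the image itself has passed the allowlist check.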
- 6
Log invocations with attribution and review on update
Every tool call gets logged with its caller, parameters, and result size — the per-caller part is what makes the log evidence rather than noise, and it requires per-agent authentication upstream. Then close the loop operationally: when a server updates, re-read its tool descriptions and re-run its contract tests before the new version reaches the allowlist, because a description change is a behaviour change for every agent that connects.
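An attributed invocation log might look like the following sketch. It wraps a tool function, refuses calls with no authenticated caller, and appends one JSON line per call containing the caller, the tool, the parameters, and the size of the result. The function name, field names, and the list used as a sink are hypothetical; parameter redaction and durable log storage are left out.

```python
import json
import time
from typing import Any, Callable, Dict, List, Optional


def call_with_audit(caller: str, tool_name: str, tool_fn: Callable[..., Any],
                    params: Dict[str, Any], sink: List[str],
                    now: Optional[float] = None) -> Any:
    """Invoke a tool and append one attributed JSON log line to `sink`.

    `caller` is assumed to come from upstream per-agent authentication.
    """
    if not caller:
        raise PermissionError("refusing to run a tool for an unauthenticated caller")
    result = tool_fn(**params)
    # Size of the JSON-serialised result; non-JSON values fall back to str().
    result_bytes = len(json.dumps(result, default=str).encode("utf-8"))
    record = {
        "ts": time.time() if now is None else now,
        "caller": caller,
        "tool": tool_name,
        "params": params,
        "result_bytes": result_bytes,
    }
    sink.append(json.dumps(record, sort_keys=True, default=str))
    return result
```

Logging the result size rather than the result keeps the log small while still making an unusually large extraction visible.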
Common pitfalls
- Production credentials handed to an unvetted community server — the package-registry lesson, relearned with API keys attached.
- stdio assumptions carried to HTTP. The local child process needed no auth; the shared network service inherits that absence and ships reachable and open.
- One token shared by every caller, so the audit trail ends at "the server did it" — attribution has to be designed in, it does not emerge.
- Trusting tool results because the server is internal. The server is only as clean as the content it reads, and injection travels through results into the agent's context.
- Reviewing a server once at install while auto-updating it forever after. The version you vetted is not the version you are running.