How to write a risk profile for an AI agent

Goal

Produce a risk profile for one agent: a short, structured document that states what the agent can do, what could go wrong, and which controls bound the damage — concrete enough to drive runtime policy, short enough to stay current.

Before you start

The agent is in your registry with an owner and a one-sentence purpose
You have a list of the tools and systems the agent can call
Whoever operates the agent is in the room when you write the profile

Steps

1. Enumerate the action surface

List every action the agent can take, tool by tool: each API it can call, each record type it can modify, each external party it can contact. This list, not the agent's intended purpose, is the basis of the profile. Risk lives in what is possible, not in what is planned.
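
One way to keep the action surface honest is to generate it from the tool configuration rather than type it by hand. The sketch below assumes a hypothetical manifest format (tool name mapped to operation and target pairs); the tool names, operations, and the `Action` type are illustrative, not taken from any particular agent framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    tool: str       # tool or system the agent can call
    operation: str  # API method or record operation
    target: str     # record type modified or external party contacted


# Hypothetical manifest: tool name -> list of (operation, target) pairs.
TOOL_MANIFEST = {
    "payments": [("refund", "customer_card")],
    "crm": [("read", "customer_record"), ("update", "customer_record")],
    "email": [("send", "external_customer")],
}


def enumerate_action_surface(manifest):
    """Flatten a tool manifest into one sorted list of possible actions."""
    actions = [
        Action(tool, operation, target)
        for tool, operations in manifest.items()
        for operation, target in operations
    ]
    return sorted(actions, key=lambda a: (a.tool, a.operation, a.target))


surface = enumerate_action_surface(TOOL_MANIFEST)
for action in surface:
    print(f"{action.tool}.{action.operation} -> {action.target}")
```

Generating the list from whatever actually configures the agent means the profile describes what is possible, including actions nobody planned for.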

2. Mark reversibility and blast radius

For each action, note whether it can be undone and what the worst plausible outcome is. Sending an internal Slack message and initiating a payment can sit in the same tool list with wildly different consequences. Irreversible plus external plus high-value is the combination that deserves the most attention.
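
The ranking described above can be sketched as a simple score that counts how many of the three aggravating properties an action has. The action names, flags, and the equal weighting are assumptions made for illustration; a real profile may weigh them differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionRisk:
    name: str
    reversible: bool
    external: bool    # reaches a party outside the organisation
    high_value: bool  # moves money or sensitive data
    worst_case: str   # worst plausible outcome, in plain words

    @property
    def attention_score(self) -> int:
        # One point each for irreversible, external, and high-value.
        return int(not self.reversible) + int(self.external) + int(self.high_value)


# Hypothetical entries; the flags are judgment calls made with the operator.
ACTIONS = [
    ActionRisk("slack.post_internal", True, False, False, "noise in an internal channel"),
    ActionRisk("payments.initiate", False, True, True, "funds sent to the wrong payee"),
    ActionRisk("email.send_external", False, True, False, "message reaches the wrong customer"),
]

ranked = sorted(ACTIONS, key=lambda a: a.attention_score, reverse=True)
for action in ranked:
    print(action.attention_score, action.name, "-", action.worst_case)
```

Actions that score on all three properties are the ones to discuss first when choosing controls.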

3. Name the failure modes that matter

Write down the three to five specific ways this agent harms you: wrong-recipient communication, over-refund, data exposure to the model provider, runaway spend, prompt injection steering tool calls. Generic risks ("hallucination") produce generic controls; specific failure modes produce policies you can actually enforce.
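
A small check can flag generic entries before they reach the profile. This sketch assumes that a failure mode counts as specific only if it names at least one action from the action surface; that rule, the field names, and the sample entries are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FailureMode:
    name: str
    actions: Tuple[str, ...]  # actions on the surface that can cause it
    harm: str


# Hypothetical entries; the last one is deliberately generic.
FAILURE_MODES = [
    FailureMode("wrong-recipient communication", ("email.send",), "customer data sent to the wrong party"),
    FailureMode("over-refund", ("payments.refund",), "refund exceeds the original charge"),
    FailureMode("runaway spend", ("payments.refund",), "unbounded cost from a retry loop"),
    FailureMode("hallucination", (), "unspecified"),
]


def lint_failure_modes(modes, min_count=3, max_count=5):
    """Return a list of problems; an empty list means the entries pass."""
    problems = []
    if not min_count <= len(modes) <= max_count:
        problems.append(f"expected {min_count} to {max_count} failure modes, found {len(modes)}")
    for mode in modes:
        if not mode.actions:
            problems.append(f"'{mode.name}' names no action, so no control can be enforced against it")
    return problems


problems = lint_failure_modes(FAILURE_MODES)
for problem in problems:
    print(problem)
```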

4. Attach a control to each failure mode

For each failure mode, state the bounding control: spend and rate limits, allowlists, approval gates on defined actions, input sanitisation at the tool boundary, output checks before external sends. A failure mode with no control is a decision to accept risk. That is fine, but write it down too, with the name of the owner who accepted it.
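
The rule that every failure mode carries either a control or a named acceptance can be checked mechanically. In this sketch the `Coverage` record, the control descriptions, and the owner name are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Coverage:
    failure_mode: str
    control: Optional[str] = None      # bounding control, if one exists
    accepted_by: Optional[str] = None  # named owner, if the risk is accepted


# Hypothetical profile entries.
PROFILE = [
    Coverage("over-refund", control="per-transaction refund limit with an approval gate above it"),
    Coverage("wrong-recipient communication", control="recipient allowlist at the email tool boundary"),
    Coverage("data exposure to the model provider", accepted_by="j.doe"),
    Coverage("runaway spend"),
]


def unresolved(profile):
    """Failure modes with neither a control nor a named acceptance."""
    return [
        entry.failure_mode
        for entry in profile
        if entry.control is None and entry.accepted_by is None
    ]


gaps = unresolved(PROFILE)
print(gaps)
```

An empty result does not show that the controls work, only that no failure mode has been left without a decision.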

5. Set the escalation and kill path

Name who gets paged when the agent acts outside its profile, how the agent is paused (and how fast the pause takes effect), and what happens to in-flight tasks when it is paused. An untested kill switch is a hypothesis, so test it.
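
A kill path is easier to test when the pause check and the in-flight policy are explicit in code. The sketch below is a single-process illustration with hypothetical names; it assumes the policy "finish the task already running, park everything still queued", which is one choice among several.

```python
from collections import deque


class KillSwitch:
    """Minimal pause flag; a real one would live in shared, durable storage."""

    def __init__(self):
        self._paused = False

    def pause(self):
        self._paused = True

    @property
    def paused(self):
        return self._paused


def run_agent(tasks, switch, execute):
    """Run tasks in order, checking the switch before each one starts.

    Returns (completed, parked): tasks that ran and tasks left in the queue.
    """
    queue = deque(tasks)
    completed = []
    while queue and not switch.paused:
        task = queue.popleft()
        execute(task)  # the task that has started is allowed to finish
        completed.append(task)
    return completed, list(queue)


# Test of the kill path: the pause arrives while task "b" is running.
switch = KillSwitch()


def execute(task):
    if task == "b":
        switch.pause()


completed, parked = run_agent(["a", "b", "c"], switch, execute)
print(completed, parked)
```

Here the pause takes effect at the next task boundary, so the time to stop is bounded by the longest single task. If that is too slow for the agent's riskiest action, the check has to move inside the task.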

6. Put a review trigger on it

The profile is stale the moment the agent's model, prompt, or tool list changes. Tie review to those events rather than to the calendar: any change to the model, the prompt, or the action surface reopens the profile. Keep a quarterly review as a backstop for the drift that event triggers miss.
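
An event trigger can be as simple as a fingerprint of the three things that invalidate the profile, compared against the fingerprint recorded at the last review. The function names, the sample values, and the 90-day backstop below are assumptions for illustration.

```python
import hashlib
import json
from datetime import date, timedelta


def fingerprint(model, prompt, tools):
    """Stable hash of the model, prompt, and tool list (tool order is ignored)."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "tools": sorted(tools)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def review_due(reviewed_fp, current_fp, last_review, today, max_age_days=90):
    """Return the reason a review is due, or None if the profile is current."""
    if reviewed_fp != current_fp:
        return "event: model, prompt, or tool list changed"
    if today - last_review > timedelta(days=max_age_days):
        return "calendar: backstop review overdue"
    return None


# Hypothetical values: the agent gained a tool after its last review.
reviewed = fingerprint("model-v1", "You are a support agent.", ["crm.read", "email.send"])
current = fingerprint("model-v1", "You are a support agent.", ["crm.read", "email.send", "payments.refund"])

print(review_due(reviewed, current, date(2024, 1, 1), date(2024, 1, 15)))
```

Running this check in the deployment pipeline means a changed tool list blocks or flags the release instead of waiting for the next scheduled review.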

Common pitfalls

Writing the profile for the agent you intend instead of the action surface that exists
Twenty-page profiles nobody updates; one page that is current beats a binder that is not
Controls that exist in the document but not in any enforcement system
No named owner for accepted risks
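
The third pitfall, controls that exist only on paper, can be caught by diffing the controls named in the profile against the policies the enforcement system actually loads. Both sets below are hypothetical identifiers; in practice one would come from the profile document and the other from the runtime policy configuration.

```python
# Hypothetical control identifiers from the profile document.
DOCUMENTED_CONTROLS = {
    "refund_limit",
    "recipient_allowlist",
    "daily_spend_cap",
    "output_check_external_send",
}

# Hypothetical policy identifiers loaded by the enforcement system.
ENFORCED_POLICIES = {
    "refund_limit",
    "recipient_allowlist",
    "rate_limit_crm",
}


def control_drift(documented, enforced):
    """Return (paper_only, undocumented) as sorted lists."""
    paper_only = sorted(documented - enforced)    # promised but not enforced
    undocumented = sorted(enforced - documented)  # enforced but not in the profile
    return paper_only, undocumented


paper_only, undocumented = control_drift(DOCUMENTED_CONTROLS, ENFORCED_POLICIES)
print("paper only:", paper_only)
print("undocumented:", undocumented)
```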
