How to write a risk profile for an AI agent

Goal

Produce a risk profile for one agent: a short, structured document that states what the agent can do, what could go wrong, and which controls bound the damage — concrete enough to drive runtime policy, short enough to stay current.

Before you start

  • The agent is in your registry with an owner and a one-sentence purpose
  • You have a list of the tools and systems the agent can call
  • Whoever operates the agent is in the room when you write the profile

Steps

  1. Enumerate the action surface

    List every action the agent can take, tool by tool: each API it can call, each record type it can modify, each external party it can contact. This list, not the agent's intended purpose, is the basis of the profile. Risk lives in what is possible, not in what is planned.
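
    One way to keep this list honest is to generate it from the tool configuration rather than write it from memory. The sketch below assumes a hypothetical tool manifest (`TOOL_MANIFEST`) for an illustrative support agent; the tool names, operations, and targets are invented for the example and would come from your own registry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    tool: str        # tool or system the agent calls
    operation: str   # API call, record modification, or outbound contact
    target: str      # what is touched: a record type, an external party, ...


# Hypothetical manifest (assumption): tool name -> (operation, target) pairs.
TOOL_MANIFEST = {
    "crm": [("read", "customer_record"), ("update", "customer_record")],
    "payments": [("create", "refund")],
    "email": [("send", "external_customer")],
}


def enumerate_action_surface(manifest):
    """Flatten a tool manifest into one row per possible action."""
    return [
        Action(tool, operation, target)
        for tool, operations in sorted(manifest.items())
        for operation, target in operations
    ]


surface = enumerate_action_surface(TOOL_MANIFEST)
for action in surface:
    print(f"{action.tool}: {action.operation} {action.target}")
```

    Each row is something the agent can do, whether or not anyone intends it to.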

  2. Mark reversibility and blast radius

    For each action, note whether it can be undone and what the worst plausible outcome is. Sending an internal Slack message and initiating a payment can sit in the same tool list with wildly different consequences. Irreversible plus external plus high-value is the combination that deserves the most attention.
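
    A minimal sketch of how those marks could be recorded and ranked. The three boolean flags and the simple count used as a score are assumptions for illustration, not a standard scoring scheme, and the action names and ratings are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RatedAction:
    name: str
    reversible: bool
    external: bool     # reaches a party outside the organisation
    high_value: bool   # worst plausible outcome is costly
    worst_case: str


def attention_score(action):
    """Count how many of the three aggravating properties apply (0 to 3)."""
    return (not action.reversible) + action.external + action.high_value


# Illustrative ratings only (assumption); rate your own actions with the operator.
ACTIONS = [
    RatedAction("slack.post_internal", True, False, False, "noise in a channel"),
    RatedAction("payments.initiate", False, True, True, "money sent to the wrong account"),
    RatedAction("crm.update_record", True, False, True, "corrupted customer data"),
]

ranked = sorted(ACTIONS, key=attention_score, reverse=True)
for action in ranked:
    print(attention_score(action), action.name, "-", action.worst_case)
```

    Actions that score on all three properties go to the top of the list and get the tightest controls.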

  3. Name the failure modes that matter

    Write down the three to five specific ways this agent harms you: wrong-recipient communication, over-refund, data exposure to the model provider, runaway spend, prompt injection steering tool calls. Generic risks ("hallucination") produce generic controls; specific failure modes produce policies you can actually enforce.

  4. Attach a control to each failure mode

    For each failure mode, state the bounding control: spend and rate limits, allowlists, approval gates on defined actions, input sanitisation at the tool boundary, output checks before external sends. A failure mode with no control is a decision to accept risk — fine, but write that down too, with a name next to it.
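
    The rule that every failure mode has either a control or a named risk owner can be checked mechanically. The sketch below is one possible shape for that check; the field names, the example controls, and the owner `jane.doe` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailureMode:
    name: str
    control: Optional[str] = None      # bounding control, if any
    accepted_by: Optional[str] = None  # named owner if the risk is accepted


def unresolved(failure_modes):
    """Return failure modes with neither a control nor a named risk owner."""
    return [
        fm.name
        for fm in failure_modes
        if fm.control is None and fm.accepted_by is None
    ]


# Illustrative profile entries (assumption).
PROFILE = [
    FailureMode("over-refund", control="refund limit per transaction and per day"),
    FailureMode("wrong-recipient communication", control="recipient allowlist"),
    FailureMode("runaway spend", accepted_by="jane.doe"),
    FailureMode("prompt injection steering tool calls"),
]

gaps = unresolved(PROFILE)
print(gaps)
```

    Anything in `gaps` is a failure mode that nobody has decided about yet, which is different from one that someone has chosen to accept.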

  5. Set the escalation and kill path

    Name who gets paged when the agent acts outside its profile, how the agent is paused (and how fast that takes effect), and what happens to in-flight tasks when it is. An untested kill switch is a hypothesis, so test it.
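
    A minimal sketch of a pause path and a test for it, assuming the agent runtime checks a shared flag before starting each task. `PauseSwitch` and `run_tasks` are hypothetical names, and holding unstarted tasks for review is only one possible policy for in-flight work.

```python
import threading


class PauseSwitch:
    """Hypothetical pause flag the agent runtime checks before each task."""

    def __init__(self):
        self._paused = threading.Event()

    def pause(self):
        self._paused.set()

    def is_paused(self):
        return self._paused.is_set()


def run_tasks(tasks, switch):
    """Run tasks in order; once paused, hold remaining tasks instead of running them."""
    completed, held = [], []
    for name, task in tasks:
        if switch.is_paused():
            held.append(name)  # assumed policy: hold for human review
            continue
        task()
        completed.append(name)
    return completed, held


switch = PauseSwitch()
tasks = [
    ("send_receipt", lambda: None),
    ("issue_refund", switch.pause),  # simulates an operator pausing mid-run
    ("email_customer", lambda: None),
]
completed, held = run_tasks(tasks, switch)
print("completed:", completed, "held:", held)
```

    In this sketch the pause takes effect at the next task boundary, so a task already running finishes. Whether that delay is acceptable is exactly what the profile should state.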

  6. Put a review trigger on it

    The profile is stale the moment the agent's model, prompt, or tool list changes. Tie review primarily to those events rather than to the calendar: any change to the model, prompt, or action surface reopens the profile. Keep a quarterly review as a backstop for the drift that event triggers miss.
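
    One way to implement an event trigger is to store a fingerprint of the model, prompt, and tool list with the profile and compare it on every deploy. The sketch below assumes those three inputs are available as strings, and treats "quarterly" as 90 days. The function names, model identifiers, and tool names are hypothetical.

```python
import hashlib
import json
from datetime import date, timedelta


def fingerprint(model, prompt, tools):
    """Hash the three inputs whose change makes the profile stale."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "tools": sorted(tools)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def review_due(reviewed_fp, reviewed_on, current_fp, today,
               max_age=timedelta(days=90)):
    """Return the reason a review is due, or None if the profile is current."""
    if current_fp != reviewed_fp:
        return "event: model, prompt, or tool list changed"
    if today - reviewed_on > max_age:
        return "calendar: quarterly backstop"
    return None


reviewed = fingerprint("model-v1", "You are a support agent.", ["crm", "email"])
changed = fingerprint("model-v1", "You are a support agent.", ["crm", "email", "payments"])

print(review_due(reviewed, date(2024, 1, 1), reviewed, date(2024, 2, 1)))
print(review_due(reviewed, date(2024, 1, 1), changed, date(2024, 2, 1)))
print(review_due(reviewed, date(2024, 1, 1), reviewed, date(2024, 6, 1)))
```

    Sorting the tool list before hashing means a reordering does not reopen the profile, while adding or removing a tool does.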

Common pitfalls

  • Writing the profile for the agent you intend instead of the action surface that exists
  • Twenty-page profiles nobody updates; one page that is current beats a binder that is not
  • Controls that exist in the document but not in any enforcement system
  • No named owner for accepted risks
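
A control is only real if something checks it at the point where the agent calls a tool. The sketch below shows one possible enforcement check covering an allowlist, a spend limit, and an approval gate. The `Policy` fields, the `authorize` function, and the example values are assumptions for illustration, not a specific product's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    allowed_recipients: frozenset  # allowlist for outbound contact
    max_amount: float              # per-call spend limit
    approval_required: frozenset   # actions gated on human approval


def authorize(policy, action, *, recipient=None, amount=0.0):
    """Return 'deny', 'needs_approval', or 'allow' for a proposed tool call."""
    if recipient is not None and recipient not in policy.allowed_recipients:
        return "deny"
    if amount > policy.max_amount:
        return "deny"
    if action in policy.approval_required:
        return "needs_approval"
    return "allow"


# Illustrative policy values (assumption).
policy = Policy(
    allowed_recipients=frozenset({"billing@example.com"}),
    max_amount=100.0,
    approval_required=frozenset({"payments.refund"}),
)

print(authorize(policy, "email.send", recipient="billing@example.com"))
print(authorize(policy, "email.send", recipient="stranger@example.org"))
print(authorize(policy, "payments.refund", amount=50.0))
print(authorize(policy, "payments.refund", amount=500.0))
```

If a control in the profile has no counterpart in a check like this, or in whatever system actually sits between the agent and its tools, it exists only on paper.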
