The Agent Operational Maturity Model

Every organisation running AI agents is on the same curve — from a few people experimenting in private to agents governed and controlled at runtime. Five stages, what each looks like, and where real companies sit today.

Immature vs. mature, measured

Six control surfaces decide whether an agent can safely run in production.

Control surface: Inventory
Immature (Stage 01–02): No list. Nobody can say which agents exist, what they touch, or where they run.
Maturing (Stage 03): A partial register. The important agents are tracked, the long tail is not.
Mature (Stage 04–05): A complete, current registry. Every agent has a purpose, an owner, and a defined scope of access.

Control surface: Ownership
Immature (Stage 01–02): Personal accounts. No named human is accountable for any agent’s behaviour.
Maturing (Stage 03): Owners assigned to some agents, inconsistently, with gaps at handover.
Mature (Stage 04–05): Every agent has a named owner accountable for behaviour, change, incident response, and risk acceptance.

Control surface: Runtime visibility
Immature (Stage 01–02): None. What an agent does between input and output is invisible.
Maturing (Stage 03): Some logging, reviewed after the fact when someone goes looking.
Mature (Stage 04–05): Every tool call, data access, and decision is logged, attributable, and visible in real time.

Control surface: Access control
Immature (Stage 01–02): Personal tokens with broad scopes. Agent identity is indistinguishable from a user’s.
Maturing (Stage 03): Scopes assigned, but static and hard to change without touching code.
Mature (Stage 04–05): Least-privilege scopes granted, narrowed, and revoked without code changes. Agent identity is distinct from user identity.

Control surface: Evaluations
Immature (Stage 01–02): None, or a single check before launch that is never repeated.
Maturing (Stage 03): Ad hoc pre-deployment testing, with little visibility once an agent is live.
Mature (Stage 04–05): Quality, safety, and policy compliance tested continuously in production. Drift and regressions caught automatically.

Control surface: Governance
Immature (Stage 01–02): No policy, no audit path, no way to stop an agent cleanly.
Maturing (Stage 03): Reviews and policy exist, but enforcement drifts behind real behaviour.
Mature (Stage 04–05): Runtime enforcement: policy that blocks, scopes, or escalates the instant an agent acts.

The five stages in detail

Each stage is defined by its primary risk and the single control that moves you on. You do not skip stages — you close the gap in front of you.

01 Experimental

Isolated agent use by individuals.

A handful of engineers and analysts run agents from personal accounts to save themselves time. Nothing is registered, reviewed, or shared. Ask "which agents are running here?" and nobody can answer.

What makes it different

Nothing is written down. Capability lives in individual heads and personal API keys.

Primary risk

No inventory, personal tokens, no review. Agents are shadow infrastructure from day one.

Next control to fix

Start an inventory — even a spreadsheet. You cannot govern what you cannot list.
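A first inventory can be very small. The sketch below is one possible shape, not a prescribed schema: the `AgentRecord` fields and the `inventory_gaps` helper are hypothetical names chosen to mirror the questions an inventory has to answer (which agents exist, what they touch, where they run, who owns them).

```python
from dataclasses import dataclass, field


@dataclass
class AgentRecord:
    """One row in a hypothetical agent inventory (field names are illustrative)."""
    name: str
    purpose: str
    owner: str = ""                                   # named human; empty means unowned
    systems: list[str] = field(default_factory=list)  # what the agent touches
    runs_on: str = ""                                 # where the agent runs


def inventory_gaps(records: list[AgentRecord]) -> dict[str, list[str]]:
    """Return, per agent, the inventory fields that are still blank."""
    gaps: dict[str, list[str]] = {}
    for rec in records:
        missing = [
            label
            for label, value in (
                ("owner", rec.owner),
                ("systems", rec.systems),
                ("runs_on", rec.runs_on),
            )
            if not value
        ]
        if missing:
            gaps[rec.name] = missing
    return gaps


# Example data is invented for illustration.
inventory = [
    AgentRecord("invoice-triage", "Route supplier invoices", owner="j.doe",
                systems=["erp"], runs_on="laptop"),
    AgentRecord("lead-summariser", "Summarise inbound leads"),
]
print(inventory_gaps(inventory))
```

A spreadsheet with the same columns does the same job. The point is that blank cells become visible.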

02 Shared

Agents reused across teams.

One useful agent becomes four. It spreads from the team that built it into marketing, sales, support, and finance. People copy prompts and tokens; everyone has a slightly different version.

What makes it different

Agents are now organisational tools, but ownership is diffuse and identity is borrowed from whoever set them up.

Primary risk

Fragmented ownership and inconsistent standards. No single team is accountable for any given agent.

Next control to fix

Assign named owners. Move agents off personal tokens and onto scoped, revocable identities.
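As a rough illustration of what "scoped, revocable" can mean, the sketch below models an agent identity that is separate from any user account, carries a named owner, and can be narrowed or revoked as data rather than code. `AgentIdentity` and the scope strings are hypothetical; a real deployment would sit on an identity provider rather than an in-memory object.

```python
from dataclasses import dataclass, field


@dataclass
class AgentIdentity:
    """A hypothetical agent identity, separate from any user's account."""
    agent_id: str
    owner: str                                    # named human accountable for the agent
    scopes: set[str] = field(default_factory=set)
    revoked: bool = False

    def can(self, scope: str) -> bool:
        """True only if the identity is live and holds the requested scope."""
        return not self.revoked and scope in self.scopes

    def narrow(self, scope: str) -> None:
        """Drop one scope without touching the agent's code."""
        self.scopes.discard(scope)

    def revoke(self) -> None:
        """Disable the identity entirely."""
        self.revoked = True


ident = AgentIdentity("support-bot", owner="a.khan", scopes={"crm:read", "crm:write"})
ident.narrow("crm:write")
print(ident.can("crm:read"), ident.can("crm:write"))
ident.revoke()
print(ident.can("crm:read"))
```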

03 Operational

Agents touch systems, APIs, and data.

Agents are wired into the CRM, the data warehouse, file storage, and internal APIs. They move real work and affect real records — but there is no way to see or intervene while they do it.

What makes it different

Agents have left the sandbox. They read and write production systems, so their mistakes are now production incidents.

Primary risk

Workflow impact without runtime intervention. When something goes wrong, you find out afterwards, if at all.

Next control to fix

Add runtime visibility. Log every tool call, data access, and decision — attributable and in real time.
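One way to make tool calls attributable is to wrap each tool so that every invocation is recorded against the agent that made it. The sketch below assumes an in-memory list as the log sink and a hypothetical `update_record` tool; a real system would stream entries to a log pipeline so they are visible as they happen.

```python
import functools
import json
import time

AUDIT_LOG: list[dict] = []  # stand-in for a real log sink (assumption)


def logged_tool(agent_id: str):
    """Wrap a tool so each call is recorded with the agent that made it."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            entry = {
                "ts": time.time(),
                "agent_id": agent_id,
                "tool": fn.__name__,
                "args": args,
                "kwargs": kwargs,
            }
            try:
                result = fn(*args, **kwargs)
                entry["status"] = "ok"
                return result
            except Exception as exc:
                entry["status"] = f"error: {exc}"
                raise
            finally:
                AUDIT_LOG.append(entry)  # recorded whether the call succeeds or fails
        return wrapper
    return decorator


@logged_tool(agent_id="crm-updater")
def update_record(record_id: str, status: str) -> str:
    """Hypothetical tool; a real one would call the CRM."""
    return f"{record_id} -> {status}"


update_record("acct-42", status="closed")
print(json.dumps(AUDIT_LOG[-1], default=str))
```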

04 Governed

Reviews, owners, audits, and evaluations exist.

There is a registry, named owners, change reviews, and evaluations. The controls exist on paper and in process. The hard part is keeping them complete as the agent estate keeps growing.

What makes it different

The organisation can describe how every agent should behave — the remaining gap is between the policy and the running system.

Primary risk

Coverage gaps and control drift. Policies exist, but enforcement lags behind what agents are actually doing.

Next control to fix

Add runtime enforcement — policies that block, scope, or escalate at the moment of action, not after the fact.
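The difference between a written policy and runtime enforcement is that the check runs before the action does. The sketch below shows one possible shape for such a check, returning one of the three outcomes named above (block, scope, escalate) or an allow. The tool names, the record limit, and the `enforce` function itself are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    SCOPE = "scope"        # allow, but with reduced reach
    ESCALATE = "escalate"  # pause and ask a human


@dataclass
class Action:
    agent_id: str
    tool: str
    record_count: int


# Hypothetical policy values, not taken from any real policy.
BLOCKED_TOOLS = {"delete_database"}
HUMAN_APPROVAL_TOOLS = {"issue_refund"}
MAX_RECORDS = 100


def enforce(action: Action) -> tuple[Decision, Action]:
    """Decide on an action before it runs; may return a narrowed copy."""
    if action.tool in BLOCKED_TOOLS:
        return Decision.BLOCK, action
    if action.tool in HUMAN_APPROVAL_TOOLS:
        return Decision.ESCALATE, action
    if action.record_count > MAX_RECORDS:
        narrowed = Action(action.agent_id, action.tool, MAX_RECORDS)
        return Decision.SCOPE, narrowed
    return Decision.ALLOW, action


decision, effective = enforce(Action("crm-updater", "update_records", 5000))
print(decision.value, effective.record_count)
```

Because the limits live in data rather than in the agent's code, they can be tightened without a redeploy.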

05 Production-Ready

Agents are monitored and controlled at runtime.

Every agent is inventoried, owned, observable, access-controlled, and continuously evaluated. Policies are enforced as actions happen. Residual risk is measured, accepted deliberately, and reviewed.

What makes it different

Agents are treated like any other piece of production infrastructure: governed at runtime, not trusted on faith.

Primary risk

Residual risk is measured and managed — the failure modes are known, bounded, and rehearsed rather than discovered.

Next control to fix

Maintain. Quarterly evaluations, drift checks, and incident postmortems keep the estate from sliding back.
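A drift check can be as simple as comparing the current evaluation pass rate with a stored baseline. The sketch below is a minimal version under stated assumptions: evaluation results are reduced to pass/fail booleans, and the 5-point tolerance is an arbitrary illustrative threshold, not a recommended value.

```python
def pass_rate(results: list[bool]) -> float:
    """Share of evaluation cases that passed (0.0 for an empty run)."""
    return sum(results) / len(results) if results else 0.0


def has_drifted(baseline: list[bool], current: list[bool],
                tolerance: float = 0.05) -> bool:
    """Flag drift when the current pass rate falls more than `tolerance` below baseline."""
    return pass_rate(baseline) - pass_rate(current) > tolerance


# Invented example runs: 19 of 20 passing at baseline, 17 of 20 now.
baseline_run = [True] * 19 + [False]
current_run = [True] * 17 + [False] * 3
print(has_drifted(baseline_run, current_run))
```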


Real organisations on the curve

You can read an organisation’s maturity from the roles it hires. Below, companies from our live Jobs Index (updated 16 June 2026) placed by the most advanced agentic role they are staffing.

04 → 05 Governed → Production-Ready

Hiring agent-ops, evaluations, AI-governance, and AI-security roles — the controls that keep agents safe at scale.

Frontier · AI-native
Anthropic, Databricks, OpenAI, Adobe, NVIDIA, LangChain, Scale AI, Snowflake
Enterprise
Citi, Capital One, Netflix, Autodesk, Accenture, General Motors, Walmart, Booz Allen

03 Operational

Hiring agent engineers, forward-deployed engineers, and solutions architects — shipping agents as production software.

Frontier · AI-native
Salesforce, Palantir, Okta, Notion, Decagon, Sierra, ElevenLabs, GitLab
Enterprise
Disney, Visa, Cigna, Pfizer, HP, Travelers, Sierra Nevada Corp, Nike

01 → 02 Experimental → Shared

Hiring foundational AI engineers and AI product managers — the first agent builds, before operational controls.

Frontier · AI-native
Layer Health, World Labs, EliseAI, Letta, Generalist AI, The Bot Company, Factory, Typeface
Enterprise
AeroVironment, Wolfspeed, Regions Bank, Tokyo Electron

Placement reflects hiring signal, which is the strongest public evidence of maturity available, not a private audit. See the full method on the Jobs Index.

What Stage 05 actually requires

A production-ready agent has five properties. Without them you cannot measure trust — or manage risk.

01 Inventoried

Registered with a purpose, an owner, and a defined scope of access.

02 Owned

A named human is accountable for its behaviour, change management, and incident response.

03 Observable

Every action it takes is logged, attributable, and visible in real time.

04 Access-controlled

Least-privilege permissions, granted and revoked without code changes.

05 Continuously evaluated

Quality, safety, and compliance tested in production, not just before launch.

The maturity model, answered directly.

Short answers for teams placing themselves on the agent operational maturity curve.

What is the Agent Operational Maturity Model?

It is a five-stage model that maps how ready an organisation is to run AI agents as production infrastructure — from Experimental (isolated individual use) through Shared, Operational, and Governed, to Production-Ready (agents monitored and controlled at runtime). Each stage is defined by its primary risk and the single control that moves you to the next one.

What does agent operational maturity mean?

Agent operational maturity describes how ready an organisation is to run AI agents as operational infrastructure. It covers inventory, ownership, runtime visibility, access control, evaluations, auditability, policy enforcement, incident response, and compliance readiness.

What separates an immature organisation from a mature one?

An immature organisation cannot list its agents, has no named owners, no runtime visibility, broad personal-token access, no evaluations, and no enforcement. A mature organisation has a complete registry, accountable owners, real-time observability, least-privilege identity, continuous evaluation, and runtime policy enforcement. The difference is whether agents are treated as shadow tools or as governed production infrastructure.

How do I assess my organisation’s agent maturity?

Measure six control surfaces — inventory, ownership, runtime visibility, access control, evaluations, and governance — against the five-stage ladder. Most enterprises sit between Stage 02 (Shared) and Stage 03 (Operational). The free assessment scores you in about seven minutes and names the highest-risk gap to close next.
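For a rough self-check before taking the assessment, the six surfaces can be rated against the three bands used in the comparison above. The sketch below is not the assessment's scoring method, which this page does not describe. It assumes the lowest-rated surface is the gap to close first, and the band names, numeric ranks, and `weakest_surface` helper are all hypothetical.

```python
SURFACES = ["inventory", "ownership", "runtime visibility",
            "access control", "evaluations", "governance"]

# Bands follow the three maturity columns; the numeric ranks are an assumption.
BAND_RANK = {"immature": 0, "maturing": 1, "mature": 2}
BAND_STAGES = {"immature": "Stage 01-02", "maturing": "Stage 03", "mature": "Stage 04-05"}


def weakest_surface(ratings: dict[str, str]) -> tuple[str, str]:
    """Return the lowest-rated surface and the stage band it corresponds to."""
    missing = [s for s in SURFACES if s not in ratings]
    if missing:
        raise ValueError(f"unrated surfaces: {missing}")
    # On ties, the surface listed first in SURFACES wins.
    surface = min(SURFACES, key=lambda s: BAND_RANK[ratings[s]])
    return surface, BAND_STAGES[ratings[surface]]


# Invented self-ratings for illustration.
ratings = {
    "inventory": "mature",
    "ownership": "maturing",
    "runtime visibility": "immature",
    "access control": "maturing",
    "evaluations": "maturing",
    "governance": "maturing",
}
print(weakest_surface(ratings))
```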

What does production-ready mean for an AI agent?

A production-ready AI agent is inventoried, owned, observable at runtime, access-controlled, continuously evaluated, and governed by policy that is enforced at the moment of action.

Where does your organisation sit?

Take the free Agent Operational Maturity Assessment — seven minutes, no email required to see your stage and the highest-risk gap to close next.

Take the assessment →