The Agent Operational Maturity Model

01 Experimental: Isolated agent use by individuals.
02 Shared: Agents reused across teams.
03 Operational: Agents touch systems, APIs, and data.
04 Governed: Reviews, owners, audits, and evaluations exist.
05 Production-Ready: Agents are monitored and controlled at runtime.

Most enterprises sit between Stage 02 and Stage 03. Production-readiness lives at Stage 05.

What changes as you climb

Immature vs. mature, measured

Six control surfaces decide whether an agent can safely run in production.

Control surface	Immature (Stage 01–02)	Maturing (Stage 03)	Mature (Stage 04–05)
Inventory	No list. Nobody can say which agents exist, what they touch, or where they run.	A partial register — the important agents are tracked, the long tail is not.	A complete, current registry. Every agent has a purpose, an owner, and a defined scope of access.
Ownership	Personal accounts. No named human is accountable for any agent’s behaviour.	Owners assigned to some agents, inconsistently, with gaps at handover.	Every agent has a named owner accountable for behaviour, change, incident response, and risk acceptance.
Runtime visibility	None. What an agent does between input and output is invisible.	Some logging, reviewed after the fact when someone goes looking.	Every tool call, data access, and decision is logged, attributable, and visible in real time.
Access control	Personal tokens with broad scopes. Agent identity is indistinguishable from a user’s.	Scopes assigned, but static and hard to change without touching code.	Least-privilege scopes granted, narrowed, and revoked without code changes. Agent identity is distinct from user identity.
Evaluations	None, or a single check before launch that is never repeated.	Ad hoc pre-deployment testing, with little visibility once an agent is live.	Quality, safety, and policy compliance tested continuously in production. Drift and regressions caught automatically.
Governance	No policy, no audit path, no way to stop an agent cleanly.	Reviews and policy exist, but enforcement drifts behind real behaviour.	Runtime enforcement — policy that blocks, scopes, or escalates the instant an agent acts.

Stage by stage

The five stages in detail

Each stage is defined by its primary risk and the single control that moves you on. You do not skip stages — you close the gap in front of you.

01 Experimental: Isolated agent use by individuals.

A handful of engineers and analysts run agents from personal accounts to save themselves time. Nothing is registered, reviewed, or shared. Ask "which agents are running here?" and nobody can answer.

What makes it different

Nothing is written down. Capability lives in individual heads and personal API keys.

Primary risk

No inventory, personal tokens, no review. Agents are shadow infrastructure from day one.

Next control to fix

Start an inventory — even a spreadsheet. You cannot govern what you cannot list.
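
A first inventory can be very small. The sketch below is one possible shape for it, not a prescribed schema: the `AgentRecord` fields mirror the purpose, owner, and scope of access that the model asks for, while the class name, the helper `missing_fields`, and the sample agents are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRecord:
    """One row of a hypothetical agent inventory."""
    name: str
    purpose: str
    owner: str  # a named human, not a team alias
    scopes: list[str] = field(default_factory=list)  # systems or data the agent may touch


def missing_fields(record: AgentRecord) -> list[str]:
    """Return the inventory fields that are still empty for this agent."""
    gaps = []
    if not record.purpose.strip():
        gaps.append("purpose")
    if not record.owner.strip():
        gaps.append("owner")
    if not record.scopes:
        gaps.append("scopes")
    return gaps


# Illustrative entries only; the agents and owners are made up.
inventory = [
    AgentRecord("invoice-triage", "Route incoming invoices", "j.doe", ["erp:read"]),
    AgentRecord("lead-summariser", "Summarise CRM leads", "", []),
]

# Agents whose records are incomplete, with the fields still to fill in.
incomplete = {r.name: missing_fields(r) for r in inventory if missing_fields(r)}
```

Listing the gaps per agent turns the inventory into a to-do list: each empty owner or scope is a concrete item to close before moving on.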


02 Shared: Agents reused across teams.

One useful agent becomes four. It spreads from the team that built it into marketing, sales, support, and finance. People copy prompts and tokens; everyone has a slightly different version.

What makes it different

Agents are now organisational tools, but ownership is diffuse and identity is borrowed from whoever set them up.

Primary risk

Fragmented ownership and inconsistent standards. No single team is accountable for any given agent.

Next control to fix

Assign named owners. Move agents off personal tokens and onto scoped, revocable identities.
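
To make "scoped, revocable identities" concrete, here is a minimal in-memory sketch. It assumes a hypothetical `AgentIdentity` class and made-up scope strings such as `crm:read`; a real deployment would back this with an identity provider rather than a Python object.

```python
class AgentIdentity:
    """Hypothetical agent identity, kept separate from any user's account."""

    def __init__(self, agent_id: str, owner: str):
        self.agent_id = agent_id
        self.owner = owner  # the named human accountable for this agent
        self._scopes: set[str] = set()

    def grant(self, scope: str) -> None:
        """Add a single scope to this agent."""
        self._scopes.add(scope)

    def revoke(self, scope: str) -> None:
        """Remove a scope; a no-op if the agent never held it."""
        self._scopes.discard(scope)

    def can(self, scope: str) -> bool:
        """Check whether the agent currently holds a scope."""
        return scope in self._scopes


identity = AgentIdentity("support-agent", owner="a.khan")
identity.grant("crm:read")
identity.grant("crm:write")
identity.revoke("crm:write")  # narrowed without touching the agent's code
```

The point of the design is that access lives in data attached to the agent, so narrowing or revoking it does not require changing or redeploying the agent itself.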


03 Operational: Agents touch systems, APIs, and data.

Agents are wired into the CRM, the data warehouse, file storage, and internal APIs. They move real work and affect real records — but there is no way to see or intervene while they do it.

What makes it different

Agents have left the sandbox. They read and write production systems, so their mistakes are now production incidents.

Primary risk

Workflow impact without runtime intervention. When something goes wrong, you find out afterwards, if at all.

Next control to fix

Add runtime visibility. Log every tool call, data access, and decision — attributable and in real time.
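
One lightweight way to get attributable logging is to wrap each tool an agent can call. The sketch below assumes a hypothetical `logged_tool` decorator and an in-memory `AUDIT_LOG` list standing in for a real log sink; the tool `update_crm_record` is made up for illustration.

```python
import functools
import json
import time

AUDIT_LOG: list[dict] = []  # stand-in for a real log sink


def logged_tool(agent_id: str):
    """Decorator that records every call to a tool, attributed to an agent."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            entry = {
                "ts": time.time(),
                "agent_id": agent_id,
                "tool": fn.__name__,
                "args": json.dumps({"args": args, "kwargs": kwargs}, default=str),
            }
            try:
                result = fn(*args, **kwargs)
                entry["status"] = "ok"
                return result
            except Exception as exc:
                entry["status"] = f"error: {exc}"
                raise
            finally:
                # Runs on success and on failure, so no call goes unrecorded.
                AUDIT_LOG.append(entry)
        return wrapper
    return decorator


@logged_tool(agent_id="support-agent")
def update_crm_record(record_id: str, status: str) -> str:
    """Hypothetical tool that would write to a CRM."""
    return f"{record_id} -> {status}"


update_crm_record("ACME-42", status="closed")
```

Because the entry is appended in `finally`, failed calls are logged alongside successful ones, which is what makes the record useful during an incident.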


04 Governed: Reviews, owners, audits, and evaluations exist.

There is a registry, named owners, change reviews, and evaluations. The controls exist on paper and in process. The hard part is keeping them complete as the agent estate keeps growing.

What makes it different

The organisation can describe how every agent should behave — the remaining gap is between the policy and the running system.

Primary risk

Coverage gaps and control drift. Policies exist, but enforcement lags behind what agents are actually doing.

Next control to fix

Add runtime enforcement — policies that block, scope, or escalate at the moment of action, not after the fact.
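
Enforcement "at the moment of action" can be pictured as a check that runs before each tool call. The sketch below is illustrative only: the `enforce` function, the `AGENT_SCOPES` and `NEEDS_APPROVAL` tables, and the action names are assumptions, and it covers allowing, blocking, and escalating but leaves out dynamic narrowing of scopes.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    ESCALATE = "escalate"


# Hypothetical policy data: the scopes each agent holds,
# and the actions that always need a human decision.
AGENT_SCOPES = {"finance-agent": {"ledger:read", "payments:create"}}
NEEDS_APPROVAL = {"payments:create"}


def enforce(agent_id: str, action: str) -> Decision:
    """Decide before the tool call runs, not after the fact."""
    if action not in AGENT_SCOPES.get(agent_id, set()):
        return Decision.BLOCK  # unknown agent or out-of-scope action
    if action in NEEDS_APPROVAL:
        return Decision.ESCALATE  # in scope, but a human must approve
    return Decision.ALLOW
```

For example, `enforce("finance-agent", "payments:create")` returns `Decision.ESCALATE`, while any action from an agent missing from `AGENT_SCOPES` is blocked by default.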


05 Production-Ready: Agents are monitored and controlled at runtime.

Every agent is inventoried, owned, observable, access-controlled, and continuously evaluated. Policies are enforced as actions happen. Residual risk is measured, accepted deliberately, and reviewed.

What makes it different

Agents are treated like any other piece of production infrastructure: governed at runtime, not trusted on faith.

Primary risk

Residual risk is measured and managed — the failure modes are known, bounded, and rehearsed rather than discovered.

Next control to fix

Maintain. Quarterly evaluations, drift checks, and incident postmortems keep the estate from sliding back.
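
A drift check can be as simple as comparing the current evaluation pass rate with a stored baseline. The sketch below assumes evaluations produce pass/fail results and uses a hypothetical 5-point tolerance; the function names and the threshold are illustrative, not part of the model.

```python
def pass_rate(results: list[bool]) -> float:
    """Fraction of evaluation cases that passed (0.0 for an empty run)."""
    return sum(results) / len(results) if results else 0.0


def has_drifted(baseline: list[bool], current: list[bool], tolerance: float = 0.05) -> bool:
    """Flag drift when the pass rate drops by more than `tolerance` from baseline."""
    return pass_rate(baseline) - pass_rate(current) > tolerance


# Illustrative data: 19 of 20 cases passed at launch, 16 of 20 pass now.
baseline_run = [True] * 19 + [False]
current_run = [True] * 16 + [False] * 4
drifted = has_drifted(baseline_run, current_run)
```

Only regressions are flagged here; an improvement over baseline does not count as drift in this sketch.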


Where real companies sit

Real organisations on the curve

You can read an organisation’s maturity from the roles it hires. Below, companies from our live Jobs Index (updated 16 June 2026) placed by the most advanced agentic role they are staffing.

04 → 05 Governed → Production-Ready

Hiring agent-ops, evaluations, AI-governance, and AI-security roles — the controls that keep agents safe at scale.

Frontier · AI-native

Anthropic, Databricks, OpenAI, Adobe, NVIDIA, LangChain, Scale AI, Snowflake

Enterprise

Citi, Capital One, Netflix, Autodesk, Accenture, General Motors, Walmart, Booz Allen

03 Operational

Hiring agent engineers, forward-deployed engineers, and solutions architects — shipping agents as production software.

Frontier · AI-native

Salesforce, Palantir, Okta, Notion, Decagon, Sierra, ElevenLabs, GitLab

Enterprise

Disney, Visa, Cigna, Pfizer, HP, Travelers, Sierra Nevada Corp, Nike

01 → 02 Experimental → Shared

Hiring foundational AI engineers and AI product managers — the first agent builds, before operational controls.

Frontier · AI-native

Layer Health, World Labs, EliseAI, Letta, Generalist AI, The Bot Company, Factory, Typeface

Enterprise

AeroVironment, Wolfspeed, Regions Bank, Tokyo Electron

Placement reflects hiring signal, not a private audit — the strongest public evidence of maturity available. See the full method on the Jobs Index.

The destination

What Stage 05 actually requires

A production-ready agent has five properties. Without them you cannot measure trust — or manage risk.

Inventoried

Registered with a purpose, an owner, and a defined scope of access.

Owned

A named human is accountable for its behaviour, change management, and incident response.

Observable

Every action it takes is logged, attributable, and visible in real time.

Access-controlled

Least-privilege permissions, granted and revoked without code changes.

Continuously evaluated

Quality, safety, and compliance tested in production — not just before launch.

Common questions

The maturity model, answered directly.

Short answers for teams placing themselves on the agent operational maturity curve.

What is the Agent Operational Maturity Model?

It is a five-stage model that maps how ready an organisation is to run AI agents as production infrastructure — from Experimental (isolated individual use) through Shared, Operational, and Governed, to Production-Ready (agents monitored and controlled at runtime). Each stage is defined by its primary risk and the single control that moves you to the next one.

What does agent operational maturity mean?

Agent operational maturity describes how ready an organisation is to run AI agents as operational infrastructure. It covers inventory, ownership, runtime visibility, access control, evaluations, auditability, policy enforcement, incident response, and compliance readiness.

What separates an immature organisation from a mature one?

An immature organisation cannot list its agents, has no named owners, no runtime visibility, broad personal-token access, no evaluations, and no enforcement. A mature organisation has a complete registry, accountable owners, real-time observability, least-privilege identity, continuous evaluation, and runtime policy enforcement. The difference is whether agents are treated as shadow tools or as governed production infrastructure.

How do I assess my organisation’s agent maturity?

Measure six control surfaces — inventory, ownership, runtime visibility, access control, evaluations, and governance — against the five-stage ladder. Most enterprises sit between Stage 02 (Shared) and Stage 03 (Operational). The free assessment scores you in about seven minutes and names the highest-risk gap to close next.
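
As a rough illustration of scoring against the ladder, the sketch below rates each of the six control surfaces from 1 to 5 and reports the weakest one. This is not the method used by the assessment mentioned above: the `assess` function, the rule that overall maturity equals the lowest surface score, and the sample scores are all assumptions.

```python
SURFACES = [
    "inventory",
    "ownership",
    "runtime visibility",
    "access control",
    "evaluations",
    "governance",
]


def assess(scores: dict[str, int]) -> tuple[int, str]:
    """Return (overall stage, weakest surface) from per-surface stage scores 1-5.

    Assumption: an organisation is only as mature as its weakest control
    surface, so the overall stage is the minimum score. Ties go to the
    surface listed first in SURFACES.
    """
    missing = [s for s in SURFACES if s not in scores]
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")
    weakest = min(SURFACES, key=lambda s: scores[s])
    return scores[weakest], weakest


# Made-up self-assessment for illustration.
example_scores = {
    "inventory": 3,
    "ownership": 2,
    "runtime visibility": 1,
    "access control": 2,
    "evaluations": 1,
    "governance": 2,
}
stage, gap = assess(example_scores)
```

Taking the minimum rather than an average keeps one strong surface from hiding a missing one, which matches the idea of closing the gap in front of you.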

What does production-ready mean for an AI agent?

A production-ready AI agent is inventoried, owned, observable at runtime, access-controlled, continuously evaluated, and governed by policy that is enforced at the moment of action.
