AI coding agents — Agentic Ready

What coding agents can and cannot do

Coding agents can read existing code and map its structure, generate new code from a description of the intended behavior, apply edits across multiple files, run tests and interpret the results, fix failures reported by tests or linters, and create commits. They are limited by the quality of their context: an agent that cannot read all the relevant files cannot reliably produce correct cross-file edits. They are also limited by task complexity: agents handle well-scoped tasks (fix this failing test, implement this function signature) more reliably than open-ended ones (refactor the entire authentication system). Code review and architectural decisions still require human judgment.

How they differ from AI code completion

AI code completion tools suggest the next token, line, or block as a developer types. A coding agent plans and executes a sequence of actions to complete a goal without step-by-step human direction. The developer using a completion tool remains in the driver's seat at every keystroke; the developer using a coding agent delegates a task and reviews the result. This changes the nature of the workflow: the agent needs a clear goal, access to the relevant files, and a way to run and verify its output — not just a prompt for the next line.
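The delegate-and-verify workflow described above can be sketched as a small loop. This is a minimal illustration, not how any particular agent is implemented: `AgentRun`, `run_agent`, and the three callbacks (`propose_edit`, `apply_edit`, `verify`) are hypothetical names, and the "workspace" is a toy dictionary standing in for a repository.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class AgentRun:
    """Record of one delegated task (hypothetical structure)."""
    goal: str
    attempts: int = 0
    succeeded: bool = False
    log: List[str] = field(default_factory=list)


def run_agent(
    goal: str,
    propose_edit: Callable[[str, str], object],
    apply_edit: Callable[[object], None],
    verify: Callable[[], Tuple[bool, str]],
    max_attempts: int = 3,
) -> AgentRun:
    """Propose an edit, apply it, verify it, and retry with feedback on failure."""
    run = AgentRun(goal=goal)
    feedback = ""
    while run.attempts < max_attempts:
        run.attempts += 1
        edit = propose_edit(goal, feedback)   # plan the next change
        apply_edit(edit)                      # act on the workspace
        ok, feedback = verify()               # e.g. run the test suite
        run.log.append(f"attempt {run.attempts}: {'pass' if ok else 'fail'}")
        if ok:
            run.succeeded = True
            break
    return run


# Toy stand-ins: the "repository" is one number and the "test" wants it >= 2.
workspace: Dict[str, int] = {"value": 0}


def propose_edit(goal: str, feedback: str) -> int:
    return workspace["value"] + 1


def apply_edit(edit: object) -> None:
    workspace["value"] = int(edit)


def verify() -> Tuple[bool, str]:
    ok = workspace["value"] >= 2
    message = "" if ok else f"value is {workspace['value']}, expected at least 2"
    return ok, message


result = run_agent("raise value to 2", propose_edit, apply_edit, verify)
print(result.log)
```

The attempt cap is the design choice worth noting: without it, an agent that never satisfies the verifier would loop indefinitely, so the sketch hands control back to the developer after a fixed number of tries.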

Risk and oversight considerations

Coding agents with write access to a repository and the ability to run arbitrary commands can cause significant damage if they make errors or are misdirected. Common risks include overwriting files the agent was not intended to modify, introducing security vulnerabilities in generated code, making changes across so many files that they are difficult to review, and executing commands with unintended side effects. Mitigation approaches include running agents in sandboxed environments, limiting the scope of file system access, requiring human review of agent-generated commits before merge, and granting only the minimum permissions the task needs.
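One of the mitigations above, limiting the scope of file system access, can be illustrated with a path check that runs before any write. This is a sketch under stated assumptions: `is_within_scope` and the example paths are hypothetical, and a check like this complements a sandbox rather than replacing one.

```python
from pathlib import Path


def is_within_scope(candidate: str, allowed_root: str) -> bool:
    """Return True only if `candidate` resolves to a location inside `allowed_root`.

    Resolving both paths first normalizes `..` segments and symlinks, so a
    path such as "../secrets.txt" cannot escape the allowed directory.
    """
    root = Path(allowed_root).resolve()
    # Joining an absolute candidate replaces the root, so absolute paths
    # outside the root are rejected by the containment test below.
    target = (root / candidate).resolve()
    return target == root or root in target.parents


print(is_within_scope("src/app.py", "/tmp/repo"))
print(is_within_scope("../secrets.txt", "/tmp/repo"))
print(is_within_scope("/etc/passwd", "/tmp/repo"))
```

A wrapper around the agent's write operation could call this check and refuse any path for which it returns False.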

AI coding agents — FAQ

Can coding agents work on large, unfamiliar codebases?

With limitations. Coding agents perform best when they have relevant context — the files, functions, and dependencies related to the task. On large codebases, retrieval and context management become important: the agent needs a way to find the relevant code before it can work on it effectively. Agents that can search a codebase and retrieve relevant sections before acting generally outperform those that work only on files explicitly provided.
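To make the retrieval step concrete, here is a deliberately simple sketch that ranks files by how often words from the task description appear in them. It assumes plain keyword overlap, which is only one possible approach; `rank_files`, the sample file contents, and the length cutoff for task words are all illustrative assumptions.

```python
import re
from typing import Dict, List, Tuple


def rank_files(task: str, files: Dict[str, str], top_k: int = 3) -> List[Tuple[str, int]]:
    """Score each file by how many of its words also appear in the task text."""
    # Keep task words longer than three letters to skip filler like "fix" or "in".
    terms = {t for t in re.findall(r"[a-z]+", task.lower()) if len(t) > 3}
    scored = []
    for path, text in files.items():
        words = re.findall(r"[a-z]+", text.lower())
        score = sum(1 for w in words if w in terms)
        if score > 0:
            scored.append((path, score))
    # Highest score first; ties are broken alphabetically by path.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]


sample_files = {
    "auth/login.py": "def login(user, password): check_password(password)",
    "auth/reset.py": "def reset_password(user): pass",
    "billing/invoice.py": "def total(invoice): return sum(invoice.lines)",
}

ranked = rank_files("fix password check in login", sample_files)
print(ranked)
```

In this toy run the billing file scores zero and is dropped, so the agent would start from the two authentication files.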

How do I verify the output of a coding agent before merging?

The same way you would verify any code change: read the diff, run the test suite, check that the change does what the task description asked for, and review any new dependencies or configuration changes. Coding agents sometimes produce code that passes tests but introduces subtle correctness or security issues. Automated testing catches some of these problems, but it does not substitute for human review of the agent's changes before they enter production.
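Part of this checklist, spotting dependency and configuration changes, can be partly automated. The sketch below flags changed files whose names match a list of patterns; `REVIEW_PATTERNS` and `needs_extra_review` are hypothetical, and the pattern list is an assumption that should be adapted to the project at hand.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Iterable, List

# Assumed examples of dependency and configuration file names.
REVIEW_PATTERNS = (
    "requirements*.txt",
    "pyproject.toml",
    "package.json",
    "*.lock",
    "Dockerfile",
    "*.yml",
    "*.yaml",
)


def needs_extra_review(changed_paths: Iterable[str]) -> List[str]:
    """Return the changed paths whose file name matches a review pattern."""
    flagged = []
    for path in changed_paths:
        name = PurePosixPath(path).name  # match on the file name only
        if any(fnmatchcase(name, pattern) for pattern in REVIEW_PATTERNS):
            flagged.append(path)
    return flagged


changed = ["src/app.py", "requirements.txt", "deploy/config.yaml", "poetry.lock"]
flagged = needs_extra_review(changed)
print(flagged)
```

The list of changed paths could come from `git diff --name-only` against the target branch. A flag from this check only points the reviewer at files to read closely; it does not judge whether the change is safe.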