The perceive-reason-act loop
At each step, the agent's context window holds the original task, the results of previous steps, and any relevant memory. The language model reads this context and produces a decision: call a tool, produce an output, ask a clarifying question, or declare the task complete. The agent framework executes that decision and appends the result to the context, and the loop runs again. The loop continues until the model emits a stop signal or an external limit, such as a maximum step count, is reached. The model itself does not execute actions; it decides what to execute, and the framework executes it.
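The loop described above can be sketched in a few lines of Python. This is a minimal illustration rather than any particular framework's implementation: the decision format (a dict with an "action" key), the names run_agent and scripted_decide, and the max_steps limit are all assumptions made for the example, and scripted_decide stands in for a real language model call.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Context = List[Dict[str, Any]]

def run_agent(
    task: str,
    decide: Callable[[Context], Dict[str, Any]],  # stand-in for the language model
    tools: Dict[str, Callable[..., Any]],
    max_steps: int = 10,                          # external limit (assumed)
) -> Tuple[Optional[str], Context]:
    """Run the loop until the model finishes or the step limit is reached."""
    context: Context = [{"role": "task", "content": task}]
    for _ in range(max_steps):
        decision = decide(context)                # the model only decides
        if decision["action"] == "finish":        # stop signal
            return decision["output"], context
        # The framework, not the model, executes the chosen tool.
        result = tools[decision["name"]](**decision["args"])
        context.append(
            {"role": "tool_result", "name": decision["name"], "content": result}
        )
    return None, context                          # limit reached without an answer

def scripted_decide(context: Context) -> Dict[str, Any]:
    """Hypothetical model: call 'add' once, then report the result."""
    results = [entry for entry in context if entry["role"] == "tool_result"]
    if not results:
        return {"action": "tool", "name": "add", "args": {"a": 2, "b": 3}}
    return {"action": "finish", "output": f"The sum is {results[-1]['content']}"}

answer, trace = run_agent("Add 2 and 3", scripted_decide, {"add": lambda a, b: a + b})
print(answer)
```

Note that the tool result is appended to the context before the next call to decide, which is how the model sees the effect of its previous decision.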
Tools and the environment
Tools are what make agents capable of affecting the world outside the model. A tool is a function with a name, a description, and a parameter schema, which the model reads to learn what the tool does and how to call it. Tools can be anything the framework exposes: web search, code execution, database queries, file reads and writes, API calls, or invocations of other agents. The set of tools available to an agent defines the scope of what it can do. Limiting that set is the primary mechanism for limiting an agent's blast radius.
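As a rough sketch of these ideas, the example below models a tool as a small record and restricts an agent by handing it only a subset of the registry. The Tool class, the simplified parameter schema (parameter name mapped to a type name), and the two example tools are hypothetical; real frameworks typically use a richer schema format.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

@dataclass
class Tool:
    name: str
    description: str              # what the model reads to pick a tool
    parameters: Dict[str, str]    # simplified schema: parameter name -> type name
    func: Callable[..., Any]      # what the framework actually runs

def describe_tools(tools: Dict[str, Tool]) -> List[Dict[str, Any]]:
    """Return the metadata a model would be shown (never the function itself)."""
    return [
        {"name": t.name, "description": t.description, "parameters": t.parameters}
        for t in tools.values()
    ]

def call_tool(tools: Dict[str, Tool], name: str, args: Dict[str, Any]) -> Any:
    """Execute a tool call, rejecting tools outside the agent's allowed set."""
    if name not in tools:
        raise PermissionError(f"tool not available to this agent: {name}")
    tool = tools[name]
    if set(args) != set(tool.parameters):
        raise ValueError(f"arguments do not match schema for {name}")
    return tool.func(**args)

# Hypothetical registry of everything the framework could expose.
ALL_TOOLS = {
    "word_count": Tool(
        "word_count", "Count the words in a text.",
        {"text": "string"}, lambda text: len(text.split()),
    ),
    "delete_record": Tool(
        "delete_record", "Delete a record by id.",
        {"record_id": "string"}, lambda record_id: f"deleted {record_id}",
    ),
}

# A restricted agent only receives the harmless tool.
READ_ONLY_TOOLS = {name: ALL_TOOLS[name] for name in ["word_count"]}

print(call_tool(READ_ONLY_TOOLS, "word_count", {"text": "limit the tool set"}))
```

With READ_ONLY_TOOLS, a call to delete_record raises PermissionError regardless of what the model requests, which is the sense in which the tool set bounds what the agent can do.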
Memory and state
Agents maintain state in several ways. In-context memory is whatever fits in the current context window — sufficient for short tasks but expensive and limited by window size. External memory uses a retrieval step to load relevant past information into context at each step. Long-term state is stored in databases or files that persist across sessions. Different task types need different memory strategies: a coding agent completing a single task needs little beyond in-context memory; a personal assistant operating across weeks needs durable external memory with retrieval.
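The three kinds of state can be illustrated with a toy example. Everything here is an assumption made for illustration: the ExternalMemory class, the word-overlap scoring (real systems more often use embedding similarity), the JSON file used for persistence, and the build_context helper that combines retrieved notes with the most recent steps.

```python
import json
from pathlib import Path
from typing import List, Optional

class ExternalMemory:
    """Toy external memory: notes retrieved by word overlap with the query."""

    def __init__(self, path: Optional[str] = None):
        # If a path is given, notes persist across sessions (long-term state).
        self.path = Path(path) if path else None
        self.notes: List[str] = []
        if self.path and self.path.exists():
            self.notes = json.loads(self.path.read_text())

    def add(self, note: str) -> None:
        self.notes.append(note)
        if self.path:
            self.path.write_text(json.dumps(self.notes))

    def retrieve(self, query: str, k: int = 2) -> List[str]:
        """Return up to k notes sharing at least one word with the query."""
        query_words = set(query.lower().split())
        scored = [(len(query_words & set(n.lower().split())), n) for n in self.notes]
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        return [note for score, note in ranked if score > 0][:k]

def build_context(task: str, steps: List[str], memory: ExternalMemory,
                  window: int = 3) -> List[str]:
    """In-context memory: the task, retrieved notes, and the last few steps."""
    return [task] + memory.retrieve(task) + steps[-window:]

memory = ExternalMemory()
memory.add("user prefers metric units")
memory.add("project deadline is friday")
memory.add("user lives in berlin")

context = build_context(
    "what units does the user prefer",
    ["step 1", "step 2", "step 3", "step 4"],
    memory,
    window=2,
)
print(context)
```

The window parameter reflects the cost of in-context memory: older steps drop out of the context, while retrieval brings back only the stored notes that appear relevant to the current task.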