LangGraph deployment

LangGraph deployment refers to the infrastructure and operational practices for running LangGraph agent workflows in production. It covers managed deployment on the LangGraph Platform, containerized self-hosted options, state persistence configuration, streaming output setup, and the monitoring needed to keep production agents reliable.

Deployment options for LangGraph workflows

LangGraph workflows can be deployed through several paths. The LangGraph Platform is a managed service that handles the infrastructure for running, scaling, and monitoring LangGraph applications: it provides an API server, a database for state persistence, and a deployment pipeline. Self-hosted deployment runs the application code in a container or on a server, alongside a compatible database for state storage; this option gives more control over infrastructure and data residency at the cost of operational responsibility. Workflows that fit a request-response model and do not need long-lived state can also be served by a lightweight deployment built on a standard Python web framework.
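As a rough illustration of the lightweight request-response path, the sketch below exposes a workflow behind a plain WSGI endpoint using only the standard library. The `run_workflow` function is a hypothetical stand-in for invoking a compiled workflow, and the JSON request shape is an assumption made for this example, not something LangGraph prescribes.

```python
import json


def run_workflow(payload):
    # Hypothetical placeholder: a real deployment would invoke the compiled
    # workflow here and return its final state.
    return {"answer": payload.get("question", "").upper()}


def app(environ, start_response):
    """WSGI app: accept a JSON object via POST, return the workflow result as JSON."""
    if environ.get("REQUEST_METHOD") != "POST":
        start_response("405 Method Not Allowed", [("Allow", "POST")])
        return [b""]

    length = int(environ.get("CONTENT_LENGTH") or 0)
    raw = environ["wsgi.input"].read(length)
    try:
        payload = json.loads(raw or b"{}")
        if not isinstance(payload, dict):
            raise ValueError("body must be a JSON object")
    except ValueError:  # also covers json.JSONDecodeError
        start_response("400 Bad Request", [("Content-Type", "application/json")])
        return [b'{"error": "invalid JSON body"}']

    body = json.dumps(run_workflow(payload)).encode("utf-8")
    start_response(
        "200 OK",
        [("Content-Type", "application/json"), ("Content-Length", str(len(body)))],
    )
    return [body]
```

For local experimentation the app can be served with `wsgiref.simple_server.make_server("127.0.0.1", 8000, app).serve_forever()`. Because each request runs to completion and nothing is kept between calls, this shape only suits workflows that do not need to pause, resume, or survive restarts.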

State persistence in production

LangGraph's checkpointing system requires a persistent state store in production. The in-memory saver is appropriate for development and testing but loses all workflow state when the process restarts. Production deployments configure an external store (a relational database, a key-value store, or the LangGraph Platform's managed storage) so that interrupted workflows can resume, human-in-the-loop checkpoints persist between interactions, and long-running workflows survive process restarts or scaling events. The state schema must be designed with persistence in mind: every state value needs to be serializable to the storage backend, and checkpoints written under an old schema need a migration path when the schema changes between deployments.
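The two schema requirements, serializable values and managed migrations, can be sketched independently of any particular storage backend. The example below assumes JSON as the storage encoding and a `schema_version` field inside the state; both are illustrative choices, and the helper names (`ensure_serializable`, `load_state`, `MIGRATIONS`) are hypothetical rather than part of LangGraph.

```python
import json

# Hypothetical schema version for this sketch; bump it when the state shape changes.
CURRENT_VERSION = 2


def ensure_serializable(state):
    """Return the JSON encoding of state, or raise naming the offending keys."""
    bad = []
    for key, value in state.items():
        try:
            json.dumps(value)
        except (TypeError, ValueError):
            bad.append(key)
    if bad:
        raise TypeError(f"non-serializable state keys: {sorted(bad)}")
    return json.dumps(state)


def _v1_to_v2(state):
    # Example change: v1 stored one "message" string, v2 stores a "messages" list.
    migrated = {k: v for k, v in state.items() if k != "message"}
    migrated["messages"] = [state["message"]] if "message" in state else []
    migrated["schema_version"] = 2
    return migrated


# Maps a stored version to the function that upgrades it by one step.
MIGRATIONS = {1: _v1_to_v2}


def load_state(raw):
    """Decode a stored state and upgrade it step by step to CURRENT_VERSION."""
    state = json.loads(raw)
    version = state.get("schema_version", 1)  # unversioned records are treated as v1
    while version < CURRENT_VERSION:
        state = MIGRATIONS[version](state)
        version = state["schema_version"]
    return state
```

Running `ensure_serializable` in tests catches values such as open connections or client objects before they reach the store, and upgrading on read lets workflows checkpointed by an older deployment resume under the new schema.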

Streaming, scaling, and monitoring

LangGraph workflows support streaming intermediate outputs during execution, which allows clients to observe agent progress in real time rather than waiting for final results. Streaming in production has to work at both ends of the connection: the server must send partial output as it is produced instead of buffering it, and the client must consume it incrementally and cope with dropped connections. Scaling LangGraph deployments involves managing concurrency. If state is fully externalized to a shared store, any worker can pick up any execution and the deployment scales statelessly; if some state lives in process memory, requests for a given execution must be routed to the worker that holds it. Monitoring production LangGraph deployments requires capturing workflow-level metrics (execution duration, step counts, error rates by node, checkpoint frequency) in addition to standard infrastructure metrics.
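To make the streaming and metrics ideas concrete, here is a minimal framework-agnostic sketch. It runs a linear list of node functions, yields one event per completed step, and records step counts, per-node duration, and per-node errors. The `WorkflowMetrics` class, the `stream_workflow` runner, and the choice of Server-Sent Events framing are assumptions made for illustration; this is not LangGraph's own streaming interface.

```python
import json
import time
from collections import defaultdict


class WorkflowMetrics:
    """Hypothetical in-process collector for workflow-level metrics."""

    def __init__(self):
        self.step_count = 0
        self.node_seconds = defaultdict(float)
        self.node_errors = defaultdict(int)


def stream_workflow(nodes, state, metrics):
    """Run nodes in order, yielding one event per completed step.

    nodes is a list of (name, function) pairs; each function receives the
    current state dict and returns a dict of updates to merge into it.
    """
    for name, fn in nodes:
        started = time.perf_counter()
        try:
            update = fn(state)
        except Exception:
            metrics.node_errors[name] += 1
            raise
        finally:
            # Duration is recorded for failed steps as well as successful ones.
            metrics.node_seconds[name] += time.perf_counter() - started
        state = {**state, **update}
        metrics.step_count += 1
        yield {"node": name, "update": update}


def to_sse(event):
    """Format one event as a Server-Sent Events frame."""
    return f"data: {json.dumps(event)}\n\n"
```

A server handler would iterate over `stream_workflow(...)` and write each `to_sse(event)` frame to the response as soon as it is yielded, while the collected metrics are exported to whatever monitoring system the deployment already uses.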

LangGraph deployment — FAQ

How do I handle LangGraph workflow failures in production?

Failed workflows in production need defined recovery paths. For transient failures — temporary API unavailability, rate limit errors, network timeouts — implementing retry logic with exponential backoff at the node level handles most cases without interrupting the workflow. For persistent failures that cannot be retried automatically, the workflow should checkpoint the state at the last successful step so that a human can review and resume rather than restarting from scratch. Monitoring that alerts on workflow failure rates and specific node error rates enables proactive response before failures accumulate.
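Node-level retry with exponential backoff can be sketched as a decorator around a node function. The `TransientError` class, the `with_backoff` name, and the default delays are hypothetical choices for this example; LangGraph also offers its own retry policy configuration for nodes, which is worth checking in the current documentation before hand-rolling something like this.

```python
import random
import time
from functools import wraps


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, rate limits)."""


def with_backoff(max_attempts=4, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Retry a node function on TransientError with exponential backoff and jitter."""

    def decorator(node_fn):
        @wraps(node_fn)
        def wrapper(state):
            for attempt in range(1, max_attempts + 1):
                try:
                    return node_fn(state)
                except TransientError:
                    if attempt == max_attempts:
                        # Retries exhausted: surface the error so the workflow
                        # can stop at its last checkpoint for human review.
                        raise
                    # Delay doubles on each attempt, capped at max_delay.
                    delay = min(max_delay, base_delay * 2 ** (attempt - 1))
                    # Jitter spreads out retries from concurrent executions.
                    sleep(delay + random.uniform(0, delay / 2))

        return wrapper

    return decorator
```

Only errors classified as transient are retried; anything else propagates immediately, which keeps persistent failures from being masked by repeated attempts. The `sleep` parameter exists so that tests can substitute a stub instead of actually waiting.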

What are the main differences between the LangGraph Platform and self-hosting?

The LangGraph Platform provides managed infrastructure — the server, database, scaling, and monitoring are handled for you in exchange for using a hosted service. Self-hosting gives you control over the infrastructure, data residency, and costs but requires you to provision, configure, and operate the components yourself. The right choice depends on your team's infrastructure capacity, data handling requirements, and cost profile. Organizations with strict data residency requirements or existing infrastructure investment typically self-host; teams that want to focus on the application layer without infrastructure overhead use the platform.