Types of memory in LangChain

LangChain provides several memory types for different persistence and retrieval needs. Buffer memory stores the full conversation history and includes it verbatim in each prompt, so the prompt grows with every turn. Buffer window memory keeps only the most recent N turns and discards the oldest turn once the window is full. Summary memory uses a language model to maintain a running summary of the conversation, which costs fewer tokens than verbatim history but loses detail. Vector store memory embeds conversation turns and retrieves the past exchanges most relevant to each new input, which allows selective recall from long histories. Entity memory tracks named entities mentioned in the conversation and maintains a structured record of facts about them.
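To make the difference between a full buffer and a window concrete, here is a minimal sketch of the sliding-window idea in plain Python. It is not LangChain's implementation: the class name `WindowBuffer`, its methods, and the `Human:`/`AI:` prompt format are assumptions chosen for illustration.

```python
from collections import deque


class WindowBuffer:
    """Hypothetical sliding-window memory (illustrative, not a LangChain class)."""

    def __init__(self, max_turns: int) -> None:
        # A deque with maxlen drops its oldest item automatically when full.
        self.turns: deque[tuple[str, str]] = deque(maxlen=max_turns)

    def add_turn(self, user: str, assistant: str) -> None:
        """Record one completed exchange."""
        self.turns.append((user, assistant))

    def as_prompt(self) -> str:
        """Render the retained turns verbatim, oldest first."""
        lines = []
        for user, assistant in self.turns:
            lines.append(f"Human: {user}")
            lines.append(f"AI: {assistant}")
        return "\n".join(lines)


memory = WindowBuffer(max_turns=2)
memory.add_turn("Hi, I'm Ana.", "Hello Ana!")
memory.add_turn("I like hiking.", "Noted.")
memory.add_turn("What's the weather?", "I can't check that.")

# Only the last two turns remain; the introduction has been discarded.
print(memory.as_prompt())
```

Setting `max_turns` to a very large value approximates plain buffer memory. The example also shows the cost of a window: the user's name is no longer in the prompt after the third turn.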

Session persistence and multi-session memory

In-process memory types in LangChain hold their state in the running process, so that state is lost when the process ends. Applications that need memory to persist across sessions (a personal assistant that remembers past conversations, a customer service agent that recalls previous cases) must back it with an external store: a database, a file system, or a vector store that persists between runs. LangChain's memory interfaces are designed so that in-process and persistent backends can be swapped, though configuring the backend and defining the data schema remain the developer's responsibility.
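As a rough illustration of backing memory with an external store, the sketch below persists messages in SQLite using only the Python standard library. The class `SQLiteHistory`, the table schema, and the session ID `user-42` are hypothetical; this is one possible layout, not LangChain's own persistence API.

```python
import os
import sqlite3
import tempfile


class SQLiteHistory:
    """Hypothetical persistent message store keyed by session ID."""

    def __init__(self, path: str) -> None:
        self.conn = sqlite3.connect(path)
        # The schema is an assumption; a real application would design its own.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "session_id TEXT NOT NULL, "
            "role TEXT NOT NULL, "
            "content TEXT NOT NULL)"
        )
        self.conn.commit()

    def add(self, session_id: str, role: str, content: str) -> None:
        """Append one message and commit it to disk."""
        self.conn.execute(
            "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
            (session_id, role, content),
        )
        self.conn.commit()

    def load(self, session_id: str) -> list[tuple[str, str]]:
        """Return (role, content) pairs for a session in insertion order."""
        cursor = self.conn.execute(
            "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
            (session_id,),
        )
        return cursor.fetchall()

    def close(self) -> None:
        self.conn.close()


db_path = os.path.join(tempfile.mkdtemp(), "history.db")

# First "session": write two messages, then close the connection.
first = SQLiteHistory(db_path)
first.add("user-42", "human", "My order number is 1187.")
first.add("user-42", "ai", "Thanks, I have noted order 1187.")
first.close()

# A fresh object stands in for a later process reopening the same file.
second = SQLiteHistory(db_path)
restored = second.load("user-42")
second.close()
print(restored)
```

Keying rows by session ID lets one database serve many users. What to store per message (timestamps, metadata, token counts) is exactly the schema decision the paragraph above leaves to the developer.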

Practical trade-offs when choosing a memory strategy

The choice of memory type involves trade-offs between completeness, token cost, and retrieval accuracy. Verbatim buffer memory preserves all detail but grows without bound and eventually exceeds the model's context limit. Summary memory is compact but can lose important specifics and requires an additional model call for summarization. Vector retrieval is selective and scalable but requires deciding what to embed and how to score relevance: relevant history may not be retrieved if the retrieval query does not match how that history was stored. For many production applications, a combination works best: recent turns in a buffer for immediate context, vector retrieval for relevant earlier history, and a summary for background state.
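The combined strategy can be sketched as follows. This is a toy under stated assumptions: `HybridMemory` and its methods are hypothetical names, word-overlap (Jaccard) scoring stands in for embedding similarity, and the summary is set by hand where a real system would call a summarization model.

```python
import re
from collections import deque


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query: str, document: str) -> float:
    """Jaccard similarity over word sets; a toy stand-in for embedding similarity."""
    q, d = tokenize(query), tokenize(document)
    union = q | d
    return len(q & d) / len(union) if union else 0.0


class HybridMemory:
    """Hypothetical memory combining a summary, retrieval, and a recent window."""

    def __init__(self, window: int, top_k: int) -> None:
        self.recent: deque[str] = deque(maxlen=window)
        self.archive: list[str] = []  # turns that have left the window
        self.summary = ""             # assumed to be produced by a summarizer
        self.top_k = top_k

    def add_turn(self, text: str) -> None:
        # Move the turn that is about to be evicted into the archive.
        if len(self.recent) == self.recent.maxlen:
            self.archive.append(self.recent[0])
        self.recent.append(text)

    def build_context(self, query: str) -> dict:
        """Collect the three context sources for a new input."""
        scored = [(overlap_score(query, turn), turn) for turn in self.archive]
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        relevant = [turn for score, turn in ranked if score > 0][: self.top_k]
        return {
            "summary": self.summary,
            "relevant": relevant,
            "recent": list(self.recent),
        }


memory = HybridMemory(window=2, top_k=1)
memory.summary = "The user is planning a trip to Lisbon."
for turn in [
    "My budget is 900 euros.",
    "I am allergic to shellfish.",
    "I prefer morning flights.",
    "Book a hotel near the river.",
]:
    memory.add_turn(turn)

context = memory.build_context("What is my budget?")
print(context)
```

Here the budget turn has already left the two-turn window, yet it is recovered because the query shares words with it. A query phrased with different vocabulary would score zero against that turn under this toy metric, which is a simplified version of the mismatch risk described above.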