The 7 Types of Agent Memory: A Technical Guide for AI Engineers
Large language models are stateless by default. Each API call starts fresh. The model forgets your last message as soon as the response returns. That is fine for a single question. It breaks the moment you build an agent.
Agents plan, call tools, and run across many steps. They need to remember. Memory is the infrastructure that fixes this. It turns a stateless model into a system that retains context. That system can learn from experience and act over time.
What is Agent Memory
Memory is any mechanism that carries information across a model’s reasoning. Some of it lives inside the context window. Some of it lives outside, in databases or model weights. Each type stores a different class of information for a different duration.
Memory varies by form and by time. Form means parametric, stored in weights, or non-parametric, stored as text. Time means short-term or long-term. The seven types below map onto these two axes.
The Seven Types of Agent Memory
1. In-Context / Working Memory (Short-Term): This is everything the model can currently see inside its context window. It includes the system prompt, recent messages, tool outputs, and reasoning steps. Think of it as RAM. It is fast and essential, but temporary and size-limited. Every other memory type competes for space here.
2. Semantic Memory (Long-Term): This is a persistent store of facts, preferences, and domain knowledge. It holds entries like “the user prefers Python over JavaScript.” The knowledge is decoupled from when it was learned. It is the agent’s organized encyclopedia about a user or topic.
3. Episodic Memory (Long-Term): This logs specific past events, full conversations, and task runs. It records what worked and what failed. The agent uses it to learn from experience. Systems like Reflexion and ExpeL write verbal post-mortems and store conclusions for future runs.
4. Procedural Memory (Long-Term): This is the agent’s knowledge of how to do things. It covers skills, tool usage patterns, workflows, and behavioral rules. A support agent handling its hundredth password reset doesn’t re-reason the workflow. It executes a learned procedure instead.
5. External / Retrieval Memory (Short-Term + Long-Term): This is information stored outside the model in a vector database. It is pulled into context at inference time using similarity search. This is RAG applied to agent history or documents. Retrieval quality quickly becomes the bottleneck.
6. Parametric Memory (Long-Term): This is knowledge baked directly into the model’s weights during training. It holds language, reasoning patterns, and general world knowledge. The model doesn’t look anything up. It generates from learned associations. The tradeoff is that this memory is frozen at training time.
7. Prospective Memory (Short-Term + Long-Term): This is the agent’s ability to remember future intentions and scheduled goals. It tracks things the agent planned but has not yet executed. It is essential for long-horizon and multi-step planning agents. Without it, an agent forgets its own commitments.
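The point in item 1 that every other memory type competes for space in the context window can be made concrete with a small sketch. It assumes a word count as a stand-in for real token counting and a hypothetical numeric priority per snippet. The `Snippet` class, `count_tokens`, and `pack_context` are illustrative names, not part of any library.

```python
from dataclasses import dataclass

@dataclass
class Snippet:
    source: str    # which memory type produced this text
    text: str
    priority: int  # hypothetical scheme: lower number = more important

def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for real tokens.
    return len(text.split())

def pack_context(snippets: list[Snippet], budget: int) -> list[Snippet]:
    """Greedily keep the highest-priority snippets that fit within the budget."""
    kept, used = [], 0
    for snippet in sorted(snippets, key=lambda s: s.priority):
        cost = count_tokens(snippet.text)
        if used + cost <= budget:
            kept.append(snippet)
            used += cost
    return kept

candidates = [
    Snippet("system_prompt", "You are a helpful cooking assistant.", 0),
    Snippet("semantic", "User is vegetarian.", 1),
    Snippet("episodic", "Last time the user liked a 20-minute meal.", 2),
    Snippet("retrieval", "Recipe doc: " + "long passage " * 50, 3),
]

packed = pack_context(candidates, budget=20)
print([s.source for s in packed])
```

With a budget of 20 words, the long retrieved passage is the one that gets dropped. A real agent would more likely truncate or summarize it than discard it outright, so treat the greedy rule as one possible policy.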
Side-by-Side: How the Seven Compare
The table below maps each type to its timescale, location, and typical implementation.
| Memory type | Timescale | Where it lives | What it stores | Common implementation |
|---|---|---|---|---|
| Working / In-context | Short-term | Context window | Prompt, messages, tool outputs | Native to the LLM |
| Semantic | Long-term | External store | Facts, preferences, domain knowledge | Vector DB or profile schema |
| Episodic | Long-term | External store | Past events, task runs, outcomes | Vector DB plus event logs |
| Procedural | Long-term | Prompt or weights | Skills, workflows, behavioral rules | System prompt or fine-tune |
| Retrieval / External | Both | Vector database | Documents, history chunks | RAG pipeline |
| Parametric | Long-term | Model weights | World knowledge, language, reasoning | Pre-training or fine-tuning |
| Prospective | Both | State store | Future intentions, scheduled goals | Task queue or scheduler |
Use Cases: Which Memory Solves Which Problem
Each type answers a concrete product need. Map the need to the memory.
- A coding assistant inside one session uses working memory. It tracks the open files and recent edits in context. Close the session and that state is gone.
- A personal assistant that remembers you needs semantic memory. It stores “allergic to gluten” and recalls it next week. The fact survives across sessions.
- A research agent that improves over time needs episodic memory. It recalls that risk sections landed well last month. It repeats what worked and avoids what failed.
- A travel-booking agent needs procedural memory. It knows the flow: search flights, compare, reserve, confirm. The sequence is a learned skill, not a fresh plan.
- A documentation chatbot needs retrieval memory. It embeds the docs and pulls relevant chunks per query. The answer stays grounded in retrieved text.
- A long-horizon agent managing a week-long project needs prospective memory. It remembers to send the Friday report. The intention persists until execution.
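The documentation-chatbot case above (embed the docs, pull relevant chunks per query) can be sketched without any external service. This is a minimal illustration, assuming a bag-of-words count vector in place of a real embedding model and a Python list in place of a vector database. `embed`, `cosine`, and `RetrievalMemory` are hypothetical names, and the three documents are invented sample data.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Assumption: word counts stand in for a learned embedding vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class RetrievalMemory:
    """In-memory stand-in for a vector store."""

    def __init__(self):
        self.chunks = []  # list of (text, vector) pairs

    def add(self, text: str) -> None:
        self.chunks.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> list[str]:
        """Return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.chunks, key=lambda c: cosine(q, c[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

docs = RetrievalMemory()
docs.add("To reset your password open settings and choose reset password")
docs.add("Invoices are emailed on the first day of each month")
docs.add("The API rate limit is 100 requests per minute")

top = docs.search("how do I reset my password", k=1)
print(top[0])
```

The retrieved chunk is what gets pasted into working memory before the model answers. Swapping in a real embedding model changes `embed` only, which is why retrieval quality tends to hinge on that one function.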
A Combined Example: All Seven in One Agent
Consider an autonomous market-analysis agent. One task exercises every memory type at once.
Parametric memory supplies the base reasoning and language. Retrieval memory pulls current market data from a vector store. Semantic memory provides the user’s preferred report format. Episodic memory recalls which sources proved reliable before. Procedural memory drives the section order: sizing, then landscape, then risk. Prospective memory schedules the follow-up draft for next week. Working memory assembles all of it into the active context.
Remove any one layer and the agent gets weaker. Each handles a job the others cannot.
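Prospective memory is the least familiar layer in this example, so here is a hedged sketch of how "schedule the follow-up draft for next week" might be held until it is due. It assumes a simple priority queue keyed on due time. `ProspectiveMemory`, `intend`, and `due_now` are hypothetical names, and a production agent would more likely use a task queue or scheduler.

```python
import heapq
from datetime import datetime, timedelta

class ProspectiveMemory:
    """Holds intentions until they are due (illustrative helper, not a library API)."""

    def __init__(self):
        self._heap = []  # entries are (due_time, sequence_number, description)
        self._seq = 0    # tie-breaker so equal due times keep insertion order

    def intend(self, description: str, due: datetime) -> None:
        """Record something the agent has committed to do later."""
        heapq.heappush(self._heap, (due, self._seq, description))
        self._seq += 1

    def due_now(self, now: datetime) -> list[str]:
        """Pop and return every intention whose due time has passed."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            _, _, description = heapq.heappop(self._heap)
            ready.append(description)
        return ready

start = datetime(2025, 1, 6, 9, 0)  # arbitrary start time for the example
memory = ProspectiveMemory()
memory.intend("send follow-up draft", start + timedelta(days=7))
memory.intend("refresh market data", start + timedelta(days=1))

print(memory.due_now(start + timedelta(days=2)))  # only the data refresh is due
print(memory.due_now(start + timedelta(days=8)))  # the follow-up draft is now due
```

The agent would call `due_now` at the start of each run and place the returned items into working memory, so a commitment made in one session surfaces in a later one.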
Implementation: A Minimal Memory Stack
Here is a stripped-down sketch in Python. It shows working, semantic, episodic, and procedural memory as separate stores.
from datetime import datetime

# Semantic memory: durable facts about the user
semantic_memory = {"diet": "vegetarian", "language_pref": "Python"}

# Episodic memory: a log of past events and outcomes
episodic_memory = [
    {"timestamp": datetime.now(),
     "event": "recipe_request",
     "result": "user liked a 20-minute meal"},
]

# Procedural memory: skills the agent can execute
def suggest_recipe(diet):
    return f"a quick {diet} recipe"

procedural_memory = {"suggest_recipe": suggest_recipe}

# Working memory: assembled fresh for each inference call
def build_context(query):
    diet = semantic_memory["diet"]
    last = episodic_memory[-1]["result"]
    skill = procedural_memory["suggest_recipe"]
    return (
        f"Query: {query}\n"
        f"Semantic: user is {diet}\n"
        f"Episodic: last time, {last}\n"
        f"Procedural: returning {skill(diet)}"
    )

print(build_context("suggest dinner"))
In production, the long-term stores move to a vector database. The pattern stays the same. Write to long-term memory, retrieve into working memory, then reason.
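The write, retrieve, reason pattern can be shown against a store that outlives a single function call. The sketch below assumes an in-memory SQLite database as a stand-in for the durable store (the text suggests a vector database in production) and leaves the model call as a placeholder. `write_episode`, `retrieve_episodes`, and `reason` are hypothetical names.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for a durable long-term store.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE episodes (id INTEGER PRIMARY KEY, event TEXT, result TEXT)")

def write_episode(event: str, result: str) -> None:
    """Write step: persist an outcome to long-term memory."""
    db.execute("INSERT INTO episodes (event, result) VALUES (?, ?)", (event, result))
    db.commit()

def retrieve_episodes(event: str, limit: int = 2) -> list[str]:
    """Retrieve step: load the most recent outcomes for this kind of event."""
    rows = db.execute(
        "SELECT result FROM episodes WHERE event = ? ORDER BY id DESC LIMIT ?",
        (event, limit),
    ).fetchall()
    return [row[0] for row in rows]

def reason(query: str, memories: list[str]) -> str:
    """Reason step: placeholder for the model call; here it only builds the prompt."""
    lines = [f"Query: {query}"] + [f"Past outcome: {m}" for m in memories]
    return "\n".join(lines)

write_episode("recipe_request", "user liked a 20-minute meal")
write_episode("recipe_request", "user skipped a recipe with mushrooms")

prompt = reason("suggest dinner", retrieve_episodes("recipe_request"))
print(prompt)
```

Here retrieval is an exact match on the event name with newest-first ordering. A vector database would replace that query with a similarity search, while the three-step shape stays the same.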
How to Layer Them: A Practical Build Order
Do not build all seven at once. Add memory only when a real need justifies the complexity.
- Start with working memory. It ships with the model. Most simple agents need nothing more.
- Add semantic memory when users expect the agent to remember them across sessions. This is the first long-term layer most products require.
- Layer in episodic, procedural, and prospective memory later. Add them only when your agent must plan ahead, learn from failure, and adapt over time.
- Parametric and retrieval memory are often already present. Parametric memory is the base model itself. Retrieval memory arrives the moment you add RAG.
Sources: CoALA framework (Princeton, arXiv:2309.02427); “Memory in the Age of AI Agents” survey (arXiv:2512.13564); “From Human Memory to AI Memory” survey (arXiv:2504.15965); LangChain LangMem, MongoDB, Redis, and Neo4j agent-memory documentation; original concept notes on the seven memory types.
The post The 7 Types of Agent Memory: A Technical Guide for AI Engineers appeared first on MarkTechPost.
