Persistent Memory
Persistent memory gives AI agents the ability to store, search, and retrieve long-term context that survives restarts, spans sessions, and can be shared between agents.
Why Agents Need Persistent Memory
Without persistent memory, every agent session starts from scratch. The agent cannot remember:
- What it learned in previous conversations
- Customer preferences it discovered
- Facts it extracted from data analysis
- Decisions it made and why
HatiData's memory system solves this with a hybrid architecture that combines structured SQL metadata with vector similarity search — giving agents both precise filtering and semantic understanding.
How It Works
```
Agent → store_memory  → HatiData → stored (metadata + vector embedding)
Agent → search_memory → HatiData → ranked results
```
- Structured storage holds the authoritative records: content, metadata, agent identity, timestamps, and access counts
- Built-in vector search stores embeddings for fast approximate nearest-neighbor (ANN) search
- The two are joined by `memory_id` (UUID) for hybrid retrieval
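In sketch form, the join between ANN candidates and structured metadata looks like this. The `ann_candidates` list, `metadata` dict, and `hybrid_search` function are illustrative stand-ins for HatiData's internal vector index and SQL store, not real API surface:

```python
# ANN results from the vector index: (memory_id, similarity_score) pairs.
ann_candidates = [
    ("mem-1", 0.94),
    ("mem-2", 0.81),
]
# Authoritative records, keyed by memory_id (stand-in for _hatidata_agent_memory).
metadata = {
    "mem-1": {"memory_type": "preference", "content": "Prefers email"},
    "mem-2": {"memory_type": "fact", "content": "Revenue grew 15%"},
}

def hybrid_search(candidates, metadata, memory_type=None):
    results = []
    for memory_id, score in candidates:
        record = metadata.get(memory_id)
        if record is None:
            continue                      # vector hit with no metadata row
        if memory_type and record["memory_type"] != memory_type:
            continue                      # structured filter applied after ANN
        results.append({**record, "memory_id": memory_id,
                        "similarity_score": score})
    return sorted(results, key=lambda r: r["similarity_score"], reverse=True)
```

Applying the filter after the ANN pass is what lets the vector index stay simple while the SQL side handles precise predicates.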
Memory Schema
Each memory entry is stored in the `_hatidata_agent_memory` table:
| Column | Type | Description |
|---|---|---|
| `memory_id` | UUID | Primary key, joins with vector embeddings |
| `org_id` | VARCHAR | Organization identifier |
| `agent_id` | VARCHAR | Agent that created the memory |
| `session_id` | VARCHAR | Conversation or task session grouping |
| `memory_type` | VARCHAR | Category: fact, episode, preference, context |
| `content` | TEXT | The memory content (natural language or structured data) |
| `metadata` | JSON | Arbitrary key-value metadata |
| `importance` | FLOAT | Importance score (0.0 to 1.0) |
| `has_embedding` | BOOLEAN | Whether a vector embedding has been generated |
| `access_count` | BIGINT | Number of times this memory has been retrieved |
| `last_accessed` | TIMESTAMP | When this memory was last retrieved |
| `created_at` | TIMESTAMP | When the memory was stored |
| `updated_at` | TIMESTAMP | Last modification timestamp |
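As an illustration, one row of this schema could be modeled as a Python dataclass. The class, its field defaults, and its name are assumptions for illustration, not part of any SDK:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any
import uuid

@dataclass
class MemoryRecord:
    """Illustrative mirror of one _hatidata_agent_memory row."""
    org_id: str
    agent_id: str
    session_id: str
    memory_type: str                  # fact | episode | preference | context
    content: str
    metadata: dict[str, Any] = field(default_factory=dict)
    importance: float = 0.5           # 0.0 to 1.0
    has_embedding: bool = False       # flipped to True once a vector exists
    access_count: int = 0
    memory_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

row = MemoryRecord(
    org_id="acme", agent_id="research-agent", session_id="s-1",
    memory_type="fact", content="Revenue grew 15% in Q4", importance=0.9,
)
```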
Architecture
Embedding Service
HatiData uses a pluggable embedding service to generate vector embeddings. In production, configure a provider that calls your embedding API (OpenAI, Cohere, etc.). A mock provider is included for testing.
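A minimal sketch of such a pluggable provider, assuming a simple `embed(texts) -> vectors` interface. The `EmbeddingProvider` protocol and `MockEmbeddingProvider` class are illustrative names, not the real plugin API:

```python
from typing import Protocol

class EmbeddingProvider(Protocol):
    """Assumed plugin interface: batch of texts in, batch of vectors out."""
    def embed(self, texts: list[str]) -> list[list[float]]: ...

class MockEmbeddingProvider:
    """Deterministic stand-in, similar in spirit to the bundled mock provider."""
    def __init__(self, dim: int = 8):
        self.dim = dim

    def embed(self, texts: list[str]) -> list[list[float]]:
        # Hash-based pseudo-embeddings: stable within a run, semantically useless,
        # but enough to exercise the write and search paths in tests.
        return [[(hash((t, i)) % 1000) / 1000.0 for i in range(self.dim)]
                for t in texts]

provider = MockEmbeddingProvider()
vectors = provider.embed(["customer prefers email"])
```

A production provider would implement the same interface but call out to an embedding API such as OpenAI or Cohere.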
Write Path
Storing a new memory happens in two phases:
- The record is inserted immediately with `has_embedding = false` (fast — returns immediately)
- The content is dispatched asynchronously for embedding generation
- Once the embedding is ready, `has_embedding` is set to `true` and the vector is stored
Embedding requests are processed in configurable batches to minimize API calls. Failures are handled gracefully — memories remain searchable by metadata even if embedding generation fails.
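The batch-and-flush behavior can be sketched as follows. `EmbeddingBatcher` and its methods are hypothetical, and the real pipeline is asynchronous rather than this synchronous simplification:

```python
import time

class EmbeddingBatcher:
    """Illustrative batcher: flush when the batch fills or a deadline passes."""
    def __init__(self, provider, batch_size=32, flush_interval=1.0):
        self.provider = provider
        self.batch_size = batch_size
        self.flush_interval = flush_interval
        self.pending = []          # (memory_id, content) awaiting embedding
        self.deadline = None

    def submit(self, memory_id, content):
        if not self.pending:
            self.deadline = time.monotonic() + self.flush_interval
        self.pending.append((memory_id, content))
        if len(self.pending) >= self.batch_size:
            return self.flush()    # full batch: one API call for many texts
        return []

    def flush(self):
        if not self.pending:
            return []
        batch, self.pending = self.pending, []
        try:
            vectors = self.provider.embed([c for _, c in batch])
        except Exception:
            # Graceful failure: rows keep has_embedding = false and stay
            # searchable by metadata; no vector is written.
            return []
        return list(zip([m for m, _ in batch], vectors))

class _StubProvider:
    """Tiny stand-in provider so the sketch is runnable."""
    def embed(self, texts):
        return [[0.0] * 4 for _ in texts]

batcher = EmbeddingBatcher(_StubProvider(), batch_size=2)
batcher.submit("m1", "first memory")
done = batcher.submit("m2", "second memory")   # batch full: flushes both
```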
Search Path
Search uses hybrid retrieval with graceful degradation:
Full hybrid mode (vector search + metadata):
- Embed the search query
- Run an approximate nearest-neighbor search for the top-K candidate `memory_id` values
- Join candidates with metadata to apply filters (`agent_id`, `session_id`, `memory_type`)
- Return results ranked by vector similarity score
Metadata-only fallback (vector search unavailable):
- Fall back to SQL `LIKE` matching and relevance heuristics
- Less precise, but fully functional
Agents never lose access to their memories, even if vector search is temporarily unavailable.
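A sketch of the degradation logic, with simple term overlap standing in for the SQL `LIKE` heuristics. The `search_memory` function here is an illustration, not the MCP tool of the same name:

```python
def search_memory(query, metadata_rows, vector_index=None, top_k=10):
    """Illustrative hybrid search with a metadata-only fallback path."""
    if vector_index is not None:
        try:
            hits = vector_index.search(query, top_k)  # [(memory_id, score), ...]
            by_id = {r["memory_id"]: r for r in metadata_rows}
            return [dict(by_id[m], similarity_score=s)
                    for m, s in hits if m in by_id]
        except Exception:
            pass  # vector search unavailable: fall through to metadata mode
    # Fallback: crude relevance via term overlap, analogous to SQL LIKE.
    terms = query.lower().split()
    scored = [(sum(t in r["content"].lower() for t in terms), r)
              for r in metadata_rows]
    return [r for score, r in sorted(scored, key=lambda x: -x[0])
            if score][:top_k]

rows = [{"memory_id": "m1", "content": "Customer prefers email communication"},
        {"memory_id": "m2", "content": "Q4 revenue analysis complete"}]
results = search_memory("email preferences", rows)  # no index: fallback path
```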
Access Tracking
Real-time access counts are maintained efficiently:
- Lock-free in-memory counters with periodic batch flush
- Tracks both `access_count` and `last_accessed`
- Access patterns inform importance scoring and eviction
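The count-then-flush pattern can be sketched as follows. Note the real counters are described as lock-free; this simplified version uses a mutex, and `AccessTracker` is a hypothetical name:

```python
import threading
import time
from collections import defaultdict

class AccessTracker:
    """Illustrative tracker: accumulate in memory, drain in periodic batches."""
    def __init__(self):
        self._lock = threading.Lock()
        self._counts = defaultdict(int)
        self._last_accessed = {}

    def record(self, memory_id):
        # Called on every retrieval; cheap relative to a per-read SQL UPDATE.
        with self._lock:
            self._counts[memory_id] += 1
            self._last_accessed[memory_id] = time.time()

    def flush(self):
        """Drain counters for one periodic batch UPDATE of access_count."""
        with self._lock:
            snapshot = dict(self._counts)
            self._counts.clear()
        return snapshot

tracker = AccessTracker()
tracker.record("m1")
tracker.record("m1")
pending = tracker.flush()   # {"m1": 2} written back in a single batch
```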
MCP Tools
store_memory
```jsonc
// Input
{
  "content": "The customer prefers email communication over phone calls",
  "memory_type": "preference",
  "metadata": { "customer_id": "cust-42", "source": "support-ticket-789" },
  "importance": 0.8
}

// Output
{
  "memory_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "stored",
  "has_embedding": false
}
```
search_memory
```jsonc
// Input
{
  "query": "customer communication preferences",
  "top_k": 10,
  "memory_type": "preference",
  "min_importance": 0.5
}

// Output
[{
  "memory_id": "a1b2c3d4-...",
  "content": "The customer prefers email communication over phone calls",
  "memory_type": "preference",
  "importance": 0.8,
  "similarity_score": 0.94,
  "created_at": "2025-01-15T10:30:00Z"
}]
```
get_agent_state
```jsonc
// Input
{ "key": "current_task" }

// Output
{
  "key": "current_task",
  "value": "Analyzing Q4 revenue data for the board presentation",
  "updated_at": "2025-01-15T14:20:00Z"
}
```
Agent state is stored in a dedicated `_hatidata_agent_state` table, scoped by `agent_id`.
set_agent_state
```json
{ "key": "current_task", "value": "Generating the final report" }
```
delete_memory
```json
{ "memory_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890" }
```
Deletes the memory and its vector embedding. This operation is irreversible.
Usage Examples
Python SDK
```python
from hatidata_agent import HatiDataAgent

agent = HatiDataAgent(
    host="your-org.proxy.hatidata.com",
    agent_id="research-agent",
    password="hd_live_your_api_key",
)

# Store a memory
agent.store_memory(
    content="Revenue grew 15% in Q4 driven by enterprise segment",
    memory_type="fact",
    importance=0.9,
    metadata={"quarter": "Q4", "metric": "revenue"},
)

# Search memories
results = agent.search_memory(
    query="revenue growth trends",
    top_k=5,
    memory_type="fact",
)
for memory in results:
    print(f"[{memory['importance']}] {memory['content']}")

# Agent state
agent.set_state("analysis_phase", "data_collection")
phase = agent.get_state("analysis_phase")
```
LangChain Integration
```python
from langchain.chains import ConversationChain
from langchain_hatidata import HatiDataMemory

memory = HatiDataMemory(
    host="your-org.proxy.hatidata.com",
    agent_id="langchain-agent",
    password="hd_live_your_api_key",
    session_id="conversation-123",
)

chain = ConversationChain(llm=llm, memory=memory)
```
See the LangChain integration for full details.
Configuration
Memory behavior is configurable per deployment:
| Setting | Default | Description |
|---|---|---|
| Memory enabled | true | Enable/disable the memory subsystem |
| Embedding batch size | 32 | Max texts per embedding API call |
| Embedding flush interval | 1 second | Max wait before flushing a partial batch |
| Max search results | 100 | Maximum results per search query |
| Access tracker flush interval | 60 seconds | How often access counts are persisted |
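For illustration, these defaults could be captured in a single config object. The field names here are assumptions for the sketch, not the real configuration keys:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MemoryConfig:
    """Illustrative mirror of the settings table above."""
    memory_enabled: bool = True
    embedding_batch_size: int = 32          # max texts per embedding API call
    embedding_flush_interval_s: float = 1.0  # max wait before a partial flush
    max_search_results: int = 100
    access_flush_interval_s: float = 60.0   # access-count persistence cadence

config = MemoryConfig()
```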
Related Concepts
- Hybrid SQL — Combine memory search with structured SQL queries
- Chain-of-Thought Ledger — Immutable reasoning traces
- Semantic Triggers — Trigger actions based on memory content
- LangChain Integration — Using memory with LangChain agents