# Agent Memory
HatiData provides a long-term memory system for AI agents that combines vector similarity search with structured SQL metadata. Agents can store, search, and retrieve memories across sessions, enabling persistent context that survives restarts and can be shared between agents.
## Architecture
The memory system uses a hybrid search architecture that combines structured SQL metadata with built-in vector search:
```
Agent → store_memory  → HatiData → stored (metadata + vector embedding)
Agent → search_memory → HatiData → ranked results
```
- Structured storage holds the authoritative memory records: content, metadata, agent identity, timestamps, and access counts.
- Built-in vector search stores embeddings for fast approximate nearest-neighbor (ANN) search.
- The two are joined by `memory_id` (UUID) for hybrid retrieval.
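The hybrid-retrieval join can be sketched in Python. All names here (`hybrid_join`, `ann_candidates`, the in-memory dicts) are illustrative stand-ins, not HatiData internals: the vector index returns ranked candidate `memory_id` values, which are then joined against the structured metadata records to apply filters.

```python
# Illustrative stand-in for the structured metadata store
metadata = {
    "m1": {"agent_id": "research-agent", "memory_type": "fact", "content": "Q4 revenue grew 15%"},
    "m2": {"agent_id": "research-agent", "memory_type": "preference", "content": "Prefers email"},
}

# Illustrative ANN search result: (memory_id, similarity_score), ranked by similarity
ann_candidates = [("m2", 0.94), ("m1", 0.71)]

def hybrid_join(candidates, metadata, memory_type=None):
    """Join ANN candidates with metadata records, applying optional filters."""
    results = []
    for memory_id, score in candidates:
        record = metadata.get(memory_id)
        if record is None:
            continue  # candidate has no metadata row (e.g. deleted memory)
        if memory_type and record["memory_type"] != memory_type:
            continue  # filtered out by structured metadata
        results.append({**record, "memory_id": memory_id, "similarity_score": score})
    return results  # already ranked, since candidates arrive ranked

print(hybrid_join(ann_candidates, metadata, memory_type="preference"))
```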
## Memory Schema

Each memory entry is stored in the `_hatidata_agent_memory` table:
| Column | Type | Description |
|---|---|---|
| `memory_id` | UUID | Primary key, used to join with vector embeddings |
| `org_id` | VARCHAR | Organization identifier |
| `agent_id` | VARCHAR | Agent that created the memory |
| `session_id` | VARCHAR | Conversation or task session grouping |
| `memory_type` | VARCHAR | Category: `fact`, `episode`, `preference`, `context` |
| `content` | TEXT | The memory content (natural language or structured data) |
| `metadata` | JSON | Arbitrary key-value metadata |
| `importance` | FLOAT | Importance score (0.0 to 1.0) |
| `has_embedding` | BOOLEAN | Whether a vector embedding has been generated |
| `access_count` | BIGINT | Number of times this memory has been retrieved |
| `last_accessed` | TIMESTAMP | When this memory was last retrieved |
| `created_at` | TIMESTAMP | When the memory was stored |
| `updated_at` | TIMESTAMP | Last modification timestamp |
## Core Components

### EmbeddingProvider

The `EmbeddingProvider` trait defines the interface for generating vector embeddings:
```rust
pub trait EmbeddingProvider: Send + Sync {
    /// Embed a batch of texts into vectors
    async fn embed_batch(&self, texts: &[String]) -> Result<Vec<Vec<f32>>>;

    /// Embed a single text (default: calls embed_batch with one item)
    async fn embed(&self, text: &str) -> Result<Vec<f32>>;

    /// Vector dimensionality
    fn dimensions(&self) -> usize;

    /// Model name for logging and metadata
    fn model_name(&self) -> &str;
}
```
HatiData includes a `MockEmbeddingProvider` for testing that generates deterministic vectors using a SHA-256-seeded RNG. In production, configure an embedding provider that calls your preferred embedding API (OpenAI, Cohere, etc.).
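The same deterministic-mock idea can be sketched in Python (a hypothetical `mock_embed` in the spirit of `MockEmbeddingProvider`, not its actual implementation): hash the text with SHA-256, seed an RNG from the digest, and draw a fixed-dimension vector, so the same text always maps to the same vector.

```python
import hashlib
import random

def mock_embed(text: str, dimensions: int = 8) -> list[float]:
    """Deterministic mock embedding: same text -> same vector."""
    # Seed an RNG from the SHA-256 digest of the input text
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    # Draw a fixed-dimension vector from the seeded RNG
    return [rng.uniform(-1.0, 1.0) for _ in range(dimensions)]

# Deterministic: repeated calls with the same text agree exactly
assert mock_embed("hello") == mock_embed("hello")
```

Determinism makes test assertions about search results reproducible without calling a real embedding API.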
### MemoryWriter

The `MemoryWriter` handles storing new memories:

- Inserts the memory record into DuckDB with `has_embedding = false`
- Dispatches the content to the `EmbeddingWorker` for async embedding
- Once the embedding is generated, updates `has_embedding = true` and stores the vector for search
This two-phase approach means writes are fast (DuckDB insert completes immediately) while embeddings are generated asynchronously in the background.
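The two-phase write can be sketched as follows. The dict and queue are illustrative stand-ins for the DuckDB table and the worker's channel; `store_memory` and `embedding_worker_step` are hypothetical names, not the SDK API.

```python
import queue
import uuid

memory_table = {}                # stand-in for the DuckDB table
embedding_queue = queue.Queue()  # stand-in for the worker's request channel

def store_memory(content: str) -> str:
    memory_id = str(uuid.uuid4())
    # Phase 1: fast synchronous insert; the write returns immediately
    memory_table[memory_id] = {"content": content, "has_embedding": False}
    # Phase 2: hand the text off to the background embedding worker
    embedding_queue.put((memory_id, content))
    return memory_id

def embedding_worker_step(embed_fn):
    # Later, in the background: embed the text and flip has_embedding
    memory_id, content = embedding_queue.get()
    memory_table[memory_id]["vector"] = embed_fn(content)
    memory_table[memory_id]["has_embedding"] = True

mid = store_memory("Revenue grew 15% in Q4")
assert memory_table[mid]["has_embedding"] is False  # visible before embedding
embedding_worker_step(lambda text: [0.0, 1.0])
assert memory_table[mid]["has_embedding"] is True   # visible after embedding
```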
### EmbeddingWorker

The `EmbeddingWorker` processes embedding requests in configurable batches:

- Uses `async_channel` for the request queue
- Batches multiple texts into a single embedding API call for efficiency
- Handles failures gracefully: memories are still searchable by metadata even if embedding fails
- Configurable batch size and flush interval
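The batch-and-flush behavior described above can be sketched synchronously in Python (a simplification of the async worker; `run_batches` and its parameters are illustrative, not the actual implementation):

```python
import time

def run_batches(requests, embed_batch_fn, batch_size=32, flush_interval=1.0):
    """Collect texts until the batch is full or the flush interval
    elapses, then make one embedding call per batch."""
    batch, deadline = [], time.monotonic() + flush_interval
    results = []
    for text in requests:
        batch.append(text)
        if len(batch) >= batch_size or time.monotonic() >= deadline:
            results.extend(embed_batch_fn(batch))  # one API call per batch
            batch, deadline = [], time.monotonic() + flush_interval
    if batch:
        results.extend(embed_batch_fn(batch))  # flush the final partial batch
    return results

# Verify batching: one API call per group of up to batch_size texts
calls = []
def fake_embed_batch(texts):
    calls.append(list(texts))
    return [[0.0] for _ in texts]

run_batches(["a", "b", "c"], fake_embed_batch, batch_size=2)
assert calls == [["a", "b"], ["c"]]
```

Batching trades a little latency for far fewer embedding API calls, which is usually the right trade-off given per-request API overhead.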
### MemoryReader

The `MemoryReader` implements hybrid search with graceful degradation:

**Full hybrid mode** (vector search + metadata):

- Embed the search query using the `EmbeddingProvider`
- Run ANN search to get the top-K candidate `memory_id` values
- JOIN candidates with metadata to apply filters (`agent_id`, `session_id`, `memory_type`)
- Return results ranked by vector similarity score

**Metadata-only fallback** (vector search unavailable):

- Fall back to metadata-based search using SQL `LIKE` and `ORDER BY` on relevance heuristics
- Results are less precise, but the system remains functional
This graceful fallback means agents never lose access to their memories, even if the vector search engine is temporarily unavailable.
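The degradation pattern can be sketched as follows (a simplified illustration, not the `MemoryReader` implementation; the keyword heuristic stands in for the SQL `LIKE` fallback):

```python
def search_memory(query, memories, vector_search=None):
    """Try vector search first; degrade to keyword matching on failure."""
    if vector_search is not None:
        try:
            return vector_search(query)  # full hybrid mode
        except ConnectionError:
            pass  # vector engine temporarily unavailable; degrade gracefully
    # Metadata-only fallback: crude LIKE-style term matching with a
    # heuristic relevance score (fraction of query terms matched)
    terms = query.lower().split()
    scored = []
    for m in memories:
        hits = sum(t in m["content"].lower() for t in terms)
        if hits:
            scored.append((hits / len(terms), m))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in scored]

memories = [{"content": "Customer prefers email"}, {"content": "Q4 revenue grew"}]
results = search_memory("customer email", memories)  # no vector engine configured
assert results[0]["content"] == "Customer prefers email"
```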
### AccessTracker

The `AccessTracker` maintains real-time access counts without write amplification:

- Uses `DashMap<Uuid, AtomicU64>` for lock-free in-memory counters
- Periodically flushes accumulated counts to DuckDB in batched `UPDATE` statements
- Tracks both `access_count` and `last_accessed` timestamps
- Access patterns can be used for memory importance scoring and eviction
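The accumulate-then-flush pattern behind the tracker can be sketched in Python (a `Counter` stands in for `DashMap<Uuid, AtomicU64>`; the class and SQL text are illustrative, not the actual internals):

```python
from collections import Counter

class AccessTracker:
    def __init__(self):
        self.pending = Counter()  # stand-in for DashMap<Uuid, AtomicU64>

    def record_access(self, memory_id: str):
        self.pending[memory_id] += 1  # cheap in-memory increment on every read

    def flush(self, execute_sql):
        # One batched UPDATE per memory_id instead of one write per access
        for memory_id, count in self.pending.items():
            execute_sql(
                "UPDATE _hatidata_agent_memory "
                "SET access_count = access_count + ?, last_accessed = now() "
                "WHERE memory_id = ?",
                (count, memory_id),
            )
        self.pending.clear()

tracker = AccessTracker()
for _ in range(3):
    tracker.record_access("m1")  # three reads, zero database writes

statements = []
tracker.flush(lambda sql, params: statements.append(params))
assert statements == [(3, "m1")]  # a single UPDATE carries all three accesses
```

Deferring the writes this way keeps read latency flat no matter how hot a memory is, at the cost of counts lagging by up to one flush interval.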
## MCP Tools
The memory system exposes five tools through the MCP protocol:
### `store_memory`
Store a new memory entry.
Input:
```json
{
  "content": "The customer prefers email communication over phone calls",
  "memory_type": "preference",
  "metadata": {
    "customer_id": "cust-42",
    "source": "support-ticket-789"
  },
  "importance": 0.8
}
```
Output:
```json
{
  "memory_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "stored",
  "has_embedding": false
}
```
The `has_embedding` field will become `true` once the async embedding worker processes the entry (typically within seconds).
### `search_memory`
Search memories using natural language or keyword queries.
Input:
```json
{
  "query": "customer communication preferences",
  "top_k": 10,
  "memory_type": "preference",
  "min_importance": 0.5
}
```
Output:
```json
[
  {
    "memory_id": "a1b2c3d4-...",
    "content": "The customer prefers email communication over phone calls",
    "memory_type": "preference",
    "importance": 0.8,
    "similarity_score": 0.94,
    "created_at": "2025-01-15T10:30:00Z"
  }
]
```
When vector search is available, `similarity_score` is a cosine similarity value. In fallback mode, it is a heuristic relevance score.
### `get_agent_state`
Retrieve a named state value for the current agent.
Input:
```json
{
  "key": "current_task"
}
```
Output:
```json
{
  "key": "current_task",
  "value": "Analyzing Q4 revenue data for the board presentation",
  "updated_at": "2025-01-15T14:20:00Z"
}
```
Agent state is stored as key-value pairs in a dedicated `_hatidata_agent_state` table, scoped by `agent_id`.
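The scoping behavior can be sketched as a composite-key store (an illustrative stand-in for the `_hatidata_agent_state` table, not the proxy implementation): the same key holds independent values for different agents.

```python
# Stand-in for _hatidata_agent_state: keyed by (agent_id, key)
state = {}

def set_agent_state(agent_id: str, key: str, value: str):
    state[(agent_id, key)] = value

def get_agent_state(agent_id: str, key: str):
    return state.get((agent_id, key))

# Two agents can use the same key without colliding
set_agent_state("research-agent", "current_task", "Analyzing Q4 revenue")
set_agent_state("report-agent", "current_task", "Generating the final report")
assert get_agent_state("research-agent", "current_task") == "Analyzing Q4 revenue"
```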
### `set_agent_state`
Set or update a named state value.
Input:
```json
{
  "key": "current_task",
  "value": "Generating the final report"
}
```
### `delete_memory`
Delete a specific memory entry by ID.
Input:
```json
{
  "memory_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}
```
Deletes the memory and its vector embedding. This operation is irreversible.
## Usage Examples

### Python SDK
```python
from hatidata_agent import HatiDataAgent

agent = HatiDataAgent(
    host="your-org.proxy.hatidata.com",
    agent_id="research-agent",
    password="hd_live_your_api_key",
)

# Store a memory
agent.store_memory(
    content="Revenue grew 15% in Q4 driven by enterprise segment",
    memory_type="fact",
    importance=0.9,
    metadata={"quarter": "Q4", "metric": "revenue"},
)

# Search memories
results = agent.search_memory(
    query="revenue growth trends",
    top_k=5,
    memory_type="fact",
)
for memory in results:
    print(f"[{memory['importance']}] {memory['content']}")

# Agent state
agent.set_state("analysis_phase", "data_collection")
phase = agent.get_state("analysis_phase")
```
### LangChain Integration
```python
from langchain_hatidata import HatiDataMemory

memory = HatiDataMemory(
    host="your-org.proxy.hatidata.com",
    agent_id="langchain-agent",
    password="hd_live_your_api_key",
    session_id="conversation-123",
)

# Memory is automatically stored/loaded by LangChain chains
from langchain.chains import ConversationChain

chain = ConversationChain(llm=llm, memory=memory)
```
See the LangChain integration page for full details.
## Configuration
Memory system behavior is controlled by environment variables on the proxy:
| Variable | Default | Description |
|---|---|---|
| `HATIDATA_MEMORY_ENABLED` | `true` | Enable/disable the memory subsystem |
| `HATIDATA_EMBEDDING_BATCH_SIZE` | `32` | Max texts per embedding API call |
| `HATIDATA_EMBEDDING_FLUSH_INTERVAL_MS` | `1000` | Max wait before flushing a partial batch |
| `HATIDATA_MEMORY_MAX_RESULTS` | `100` | Maximum results per search query |
| `HATIDATA_ACCESS_TRACKER_FLUSH_INTERVAL_SECS` | `60` | How often access counts are flushed to DuckDB |
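For example, a proxy environment tuned for larger, more frequent embedding batches might look like this (the values shown are illustrative, not recommendations):

```shell
# Example proxy environment for the memory subsystem (illustrative values)
export HATIDATA_MEMORY_ENABLED=true
export HATIDATA_EMBEDDING_BATCH_SIZE=64          # larger batches, fewer API calls
export HATIDATA_EMBEDDING_FLUSH_INTERVAL_MS=500  # flush partial batches sooner
export HATIDATA_MEMORY_MAX_RESULTS=50
export HATIDATA_ACCESS_TRACKER_FLUSH_INTERVAL_SECS=30
```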
## Next Steps
- Chain-of-Thought Ledger -- Immutable reasoning traces
- Semantic Triggers -- Event-driven automation based on memory content
- LangChain Integration -- Using memory with LangChain agents