Agent Memory

HatiData provides a long-term memory system for AI agents that combines vector similarity search with structured SQL metadata. Agents can store, search, and retrieve memories across sessions, enabling persistent context that survives restarts and can be shared between agents.

Architecture

The memory system uses a hybrid search architecture that combines structured SQL metadata with built-in vector search:

Agent → store_memory → HatiData → stored (metadata + vector embedding)

Agent → search_memory → HatiData → ranked results
  • Structured storage holds the authoritative memory records: content, metadata, agent identity, timestamps, and access counts.
  • Built-in vector search stores embeddings for fast approximate nearest-neighbor (ANN) search.
  • The two are joined by memory_id (UUID) for hybrid retrieval.
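The join step above can be sketched in Python. This is an illustrative stand-in (the real system joins in SQL against DuckDB): `ann_candidates` plays the role of the top-K vector-search hits, and `metadata_by_id` the structured table, keyed by `memory_id`.

```python
# Sketch of the hybrid-retrieval join, assuming in-memory stand-ins for the
# vector index and the metadata table (all names here are illustrative).

def hybrid_search(ann_candidates, metadata_by_id, agent_id=None, memory_type=None):
    """Join top-K ANN hits with metadata rows and apply structured filters."""
    results = []
    for memory_id, score in ann_candidates:
        meta = metadata_by_id.get(memory_id)
        if meta is None:
            continue  # embedding exists but the metadata row was deleted
        if agent_id is not None and meta["agent_id"] != agent_id:
            continue
        if memory_type is not None and meta["memory_type"] != memory_type:
            continue
        results.append({"memory_id": memory_id, "similarity_score": score, **meta})
    # ANN results usually arrive ranked, but re-sort defensively by score
    results.sort(key=lambda r: r["similarity_score"], reverse=True)
    return results

candidates = [("m1", 0.94), ("m2", 0.81), ("m3", 0.77)]
metadata = {
    "m1": {"agent_id": "a1", "memory_type": "preference", "content": "..."},
    "m2": {"agent_id": "a2", "memory_type": "preference", "content": "..."},
    "m3": {"agent_id": "a1", "memory_type": "fact", "content": "..."},
}
hits = hybrid_search(candidates, metadata, agent_id="a1", memory_type="preference")
```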

Memory Schema

Each memory entry is stored in the _hatidata_agent_memory table:

| Column | Type | Description |
|---|---|---|
| memory_id | UUID | Primary key, used to join with vector embeddings |
| org_id | VARCHAR | Organization identifier |
| agent_id | VARCHAR | Agent that created the memory |
| session_id | VARCHAR | Conversation or task session grouping |
| memory_type | VARCHAR | Category: fact, episode, preference, context |
| content | TEXT | The memory content (natural language or structured data) |
| metadata | JSON | Arbitrary key-value metadata |
| importance | FLOAT | Importance score (0.0 to 1.0) |
| has_embedding | BOOLEAN | Whether a vector embedding has been generated |
| access_count | BIGINT | Number of times this memory has been retrieved |
| last_accessed | TIMESTAMP | When this memory was last retrieved |
| created_at | TIMESTAMP | When the memory was stored |
| updated_at | TIMESTAMP | Last modification timestamp |

Core Components

EmbeddingProvider

The EmbeddingProvider trait defines the interface for generating vector embeddings:

pub trait EmbeddingProvider: Send + Sync {
    /// Embed a batch of texts into vectors
    async fn embed_batch(&self, texts: &[String]) -> Result<Vec<Vec<f32>>>;

    /// Embed a single text (default: calls embed_batch with one item)
    async fn embed(&self, text: &str) -> Result<Vec<f32>>;

    /// Vector dimensionality
    fn dimensions(&self) -> usize;

    /// Model name for logging and metadata
    fn model_name(&self) -> &str;
}

HatiData includes a MockEmbeddingProvider for testing that generates deterministic vectors using SHA-256 seeded RNG. In production, configure an embedding provider that calls your preferred embedding API (OpenAI, Cohere, etc.).
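The deterministic-mock idea is simple enough to sketch in Python: hash the text, seed an RNG from the digest, and draw a fixed-length vector. The same input always yields the same vector, which is what makes tests reproducible. (The real MockEmbeddingProvider is Rust; the dimensionality and value range here are illustrative.)

```python
import hashlib
import random

def mock_embed(text: str, dimensions: int = 8) -> list[float]:
    """Deterministic pseudo-embedding: seed an RNG from SHA-256 of the text.
    Same text in, same vector out -- useful for tests, meaningless for search
    quality."""
    seed = int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:8], "big")
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(dimensions)]
```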

MemoryWriter

The MemoryWriter handles storing new memories:

  1. Inserts the memory record into DuckDB with has_embedding = false
  2. Dispatches the content to the EmbeddingWorker for async embedding
  3. Once the embedding is generated, updates has_embedding = true and stores the vector for search

This two-phase approach means writes are fast (DuckDB insert completes immediately) while embeddings are generated asynchronously in the background.
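The two-phase flow can be sketched with asyncio, using plain dicts as stand-ins for the DuckDB table and the vector index (the real writer and worker are Rust; names here are illustrative):

```python
import asyncio

memories: dict[str, dict] = {}  # stand-in for the DuckDB metadata table
vectors: dict[str, list] = {}   # stand-in for the vector index

async def store_memory(queue, memory_id, content):
    # Phase 1: synchronous metadata insert; the write returns immediately.
    memories[memory_id] = {"content": content, "has_embedding": False}
    # Phase 2: hand the content to the background embedding worker.
    await queue.put((memory_id, content))
    return {"memory_id": memory_id, "status": "stored", "has_embedding": False}

async def embedding_worker(queue):
    while True:
        memory_id, content = await queue.get()
        vectors[memory_id] = [float(len(content))]  # placeholder "embedding"
        memories[memory_id]["has_embedding"] = True
        queue.task_done()

async def main():
    queue = asyncio.Queue()
    worker = asyncio.create_task(embedding_worker(queue))
    resp = await store_memory(queue, "m1", "Q4 revenue grew 15%")
    assert resp["has_embedding"] is False  # write path returns before embedding
    await queue.join()                     # wait for the worker to catch up
    worker.cancel()

asyncio.run(main())
```

After the worker drains the queue, `has_embedding` flips to true and the vector is searchable.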

EmbeddingWorker

The EmbeddingWorker processes embedding requests in configurable batches:

  • Uses async_channel for the request queue
  • Batches multiple texts into a single embedding API call for efficiency
  • Handles failures gracefully: memories are still searchable by metadata even if embedding fails
  • Configurable batch size and flush interval
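The batching loop above can be sketched in Python with an asyncio queue standing in for async_channel (the real worker is Rust). A batch flushes when it reaches `batch_size`, when `flush_interval` passes with a partial batch, or on shutdown:

```python
import asyncio

async def batch_worker(queue, embed_batch, sink, batch_size=32, flush_interval=1.0):
    """Illustrative sketch of the EmbeddingWorker loop. `None` on the queue
    is a shutdown sentinel; `sink` stands in for the vector store."""
    pending, done = [], False
    while not done:
        timed_out = False
        try:
            item = await asyncio.wait_for(queue.get(), timeout=flush_interval)
            if item is None:
                done = True
            else:
                pending.append(item)
        except asyncio.TimeoutError:
            timed_out = True  # flush whatever has accumulated
        if pending and (len(pending) >= batch_size or timed_out or done):
            # One embedding API call per batch, not per memory
            vecs = await embed_batch([text for _, text in pending])
            sink.update(zip((mid for mid, _ in pending), vecs))
            pending.clear()

async def fake_embed_batch(texts):
    # Stand-in for a real embedding API call; one-dimensional vectors
    return [[float(len(t))] for t in texts]

async def main():
    queue, sink = asyncio.Queue(), {}
    for i in range(5):
        await queue.put((f"m{i}", "text" * (i + 1)))
    await queue.put(None)  # shutdown sentinel
    await batch_worker(queue, fake_embed_batch, sink, batch_size=2, flush_interval=0.1)
    return sink

sink = asyncio.run(main())
```

A production loop would also wrap the `embed_batch` call in error handling so a failed batch leaves the memories searchable by metadata, per the bullet above.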

MemoryReader

The MemoryReader implements hybrid search with graceful degradation:

Full hybrid mode (vector search + metadata):

  1. Embed the search query using the EmbeddingProvider
  2. Run ANN search to get top-K candidate memory_id values
  3. JOIN candidates with metadata to apply filters (agent_id, session_id, memory_type)
  4. Return results ranked by vector similarity score

Metadata-only fallback (vector search unavailable):

  1. Fall back to metadata-based search using SQL LIKE and ORDER BY on relevance heuristics
  2. Results are less precise but the system remains functional

This graceful fallback means agents never lose access to their memories, even if the vector search engine is temporarily unavailable.
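The fallback path alone can be sketched with SQLite standing in for the metadata store (the real store is DuckDB, and its exact relevance heuristic is not shown here; ordering by importance is an illustrative choice):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE agent_memory ("
    "memory_id TEXT PRIMARY KEY, content TEXT, importance REAL)"
)
conn.executemany(
    "INSERT INTO agent_memory VALUES (?, ?, ?)",
    [
        ("m1", "Customer prefers email communication", 0.8),
        ("m2", "Revenue grew 15% in Q4", 0.9),
        ("m3", "Customer asked about email deliverability", 0.6),
    ],
)

def metadata_only_search(conn, keyword, top_k=10):
    """Metadata-only fallback: SQL LIKE match, ordered by a simple
    relevance heuristic (here, the importance score)."""
    return conn.execute(
        "SELECT memory_id, content, importance FROM agent_memory "
        "WHERE content LIKE ? ORDER BY importance DESC LIMIT ?",
        (f"%{keyword}%", top_k),
    ).fetchall()

hits = metadata_only_search(conn, "email")
```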

AccessTracker

The AccessTracker maintains real-time access counts without write amplification:

  • Uses DashMap<Uuid, AtomicU64> for lock-free in-memory counters
  • Periodically flushes accumulated counts to DuckDB in batch UPDATE statements
  • Tracks both access_count and last_accessed timestamps
  • Access patterns can be used for memory importance scoring and eviction
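The counter-accumulate-then-batch-flush pattern translates to Python roughly as follows (the real tracker uses DashMap<Uuid, AtomicU64> and batched DuckDB UPDATEs; a dict stands in for the table here):

```python
import threading
from collections import Counter
from datetime import datetime, timezone

class AccessTracker:
    """Sketch of the access-tracking pattern: accumulate counts in memory,
    apply them later as one batched update instead of a write per read."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = Counter()

    def record_access(self, memory_id: str) -> None:
        with self._lock:
            self._counts[memory_id] += 1  # hot path: no database write

    def flush(self, table: dict) -> None:
        """Swap out the accumulated counts and apply them in one pass."""
        with self._lock:
            pending, self._counts = self._counts, Counter()
        now = datetime.now(timezone.utc)
        for memory_id, n in pending.items():
            row = table.setdefault(memory_id, {"access_count": 0, "last_accessed": None})
            row["access_count"] += n
            row["last_accessed"] = now

table = {}
tracker = AccessTracker()
for _ in range(3):
    tracker.record_access("m1")
tracker.record_access("m2")
tracker.flush(table)
```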

MCP Tools

The memory system exposes five tools through the MCP protocol:

store_memory

Store a new memory entry.

Input:

{
  "content": "The customer prefers email communication over phone calls",
  "memory_type": "preference",
  "metadata": {
    "customer_id": "cust-42",
    "source": "support-ticket-789"
  },
  "importance": 0.8
}

Output:

{
  "memory_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "stored",
  "has_embedding": false
}

The has_embedding field will become true once the async embedding worker processes the entry (typically within seconds).

search_memory

Search memories using natural language or keyword queries.

Input:

{
  "query": "customer communication preferences",
  "top_k": 10,
  "memory_type": "preference",
  "min_importance": 0.5
}

Output:

[
  {
    "memory_id": "a1b2c3d4-...",
    "content": "The customer prefers email communication over phone calls",
    "memory_type": "preference",
    "importance": 0.8,
    "similarity_score": 0.94,
    "created_at": "2025-01-15T10:30:00Z"
  }
]

When vector search is available, similarity_score is a cosine similarity value. In fallback mode, it is a heuristic relevance score.
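For reference, cosine similarity is the dot product of two vectors divided by the product of their magnitudes, giving a value in [-1, 1] (1 means identical direction). A minimal Python sketch:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """dot(a, b) / (|a| * |b|); returns 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```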

get_agent_state

Retrieve a named state value for the current agent.

Input:

{
  "key": "current_task"
}

Output:

{
  "key": "current_task",
  "value": "Analyzing Q4 revenue data for the board presentation",
  "updated_at": "2025-01-15T14:20:00Z"
}

Agent state is stored as key-value pairs in a dedicated _hatidata_agent_state table, scoped by agent_id.

set_agent_state

Set or update a named state value.

Input:

{
  "key": "current_task",
  "value": "Generating the final report"
}

delete_memory

Delete a specific memory entry by ID.

Input:

{
  "memory_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}

Deletes the memory and its vector embedding. This operation is irreversible.

Usage Examples

Python SDK

from hatidata_agent import HatiDataAgent

agent = HatiDataAgent(
    host="your-org.proxy.hatidata.com",
    agent_id="research-agent",
    password="hd_live_your_api_key",
)

# Store a memory
agent.store_memory(
    content="Revenue grew 15% in Q4 driven by enterprise segment",
    memory_type="fact",
    importance=0.9,
    metadata={"quarter": "Q4", "metric": "revenue"},
)

# Search memories
results = agent.search_memory(
    query="revenue growth trends",
    top_k=5,
    memory_type="fact",
)
for memory in results:
    print(f"[{memory['importance']}] {memory['content']}")

# Agent state
agent.set_state("analysis_phase", "data_collection")
phase = agent.get_state("analysis_phase")

LangChain Integration

from langchain_hatidata import HatiDataMemory

memory = HatiDataMemory(
    host="your-org.proxy.hatidata.com",
    agent_id="langchain-agent",
    password="hd_live_your_api_key",
    session_id="conversation-123",
)

# Memory is automatically stored/loaded by LangChain chains
from langchain.chains import ConversationChain
chain = ConversationChain(llm=llm, memory=memory)

See the LangChain integration page for full details.

Configuration

Memory system behavior is controlled by environment variables on the proxy:

| Variable | Default | Description |
|---|---|---|
| HATIDATA_MEMORY_ENABLED | true | Enable/disable the memory subsystem |
| HATIDATA_EMBEDDING_BATCH_SIZE | 32 | Max texts per embedding API call |
| HATIDATA_EMBEDDING_FLUSH_INTERVAL_MS | 1000 | Max wait before flushing a partial batch |
| HATIDATA_MEMORY_MAX_RESULTS | 100 | Maximum results per search query |
| HATIDATA_ACCESS_TRACKER_FLUSH_INTERVAL_SECS | 60 | How often access counts are flushed to DuckDB |
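For a quick sanity check of your settings, the same variables and defaults can be read in Python (a client-side sketch; the proxy parses these in its own configuration layer):

```python
import os

def memory_config() -> dict:
    """Read the memory-system knobs, falling back to the documented defaults."""
    env = os.environ.get
    return {
        "enabled": env("HATIDATA_MEMORY_ENABLED", "true").lower() == "true",
        "batch_size": int(env("HATIDATA_EMBEDDING_BATCH_SIZE", "32")),
        "flush_interval_ms": int(env("HATIDATA_EMBEDDING_FLUSH_INTERVAL_MS", "1000")),
        "max_results": int(env("HATIDATA_MEMORY_MAX_RESULTS", "100")),
        "access_flush_secs": int(env("HATIDATA_ACCESS_TRACKER_FLUSH_INTERVAL_SECS", "60")),
    }

os.environ["HATIDATA_EMBEDDING_BATCH_SIZE"] = "64"  # example override
cfg = memory_config()
```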
