# Agent Memory
HatiData provides a long-term memory system for AI agents that combines vector similarity search with structured SQL metadata. Agents can store, search, and retrieve memories across sessions, enabling persistent context that survives restarts and can be shared between agents.
## Architecture
The memory system uses a hybrid search architecture that combines structured SQL metadata with built-in vector search:
```
Agent → store_memory  → HatiData → stored (metadata + vector embedding)
Agent → search_memory → HatiData → ranked results
```
- Structured storage holds the authoritative memory records: content, metadata, agent identity, timestamps, and access counts.
- Built-in vector search stores embeddings for fast approximate nearest-neighbor (ANN) search.
- The two are joined by `memory_id` (UUID) for hybrid retrieval.
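The hybrid-retrieval join can be sketched in Python. All names here (`hybrid_join`, `ann_candidates`, the in-memory dicts) are illustrative stand-ins, not HatiData internals: the vector index returns ranked candidate `memory_id` values, which are then joined against the structured metadata records to apply filters.

```python
# Illustrative stand-in for the structured metadata store
metadata = {
    "m1": {"agent_id": "research-agent", "memory_type": "fact", "content": "Q4 revenue grew 15%"},
    "m2": {"agent_id": "research-agent", "memory_type": "preference", "content": "Prefers email"},
}

# Illustrative ANN search result: (memory_id, similarity_score), ranked by similarity
ann_candidates = [("m2", 0.94), ("m1", 0.71)]

def hybrid_join(candidates, metadata, memory_type=None):
    """Join ANN candidates with metadata records, applying optional filters."""
    results = []
    for memory_id, score in candidates:
        record = metadata.get(memory_id)
        if record is None:
            continue  # candidate has no metadata row (e.g. deleted memory)
        if memory_type and record["memory_type"] != memory_type:
            continue  # filtered out by structured metadata
        results.append({**record, "memory_id": memory_id, "similarity_score": score})
    return results  # already ranked, since candidates arrive ranked

print(hybrid_join(ann_candidates, metadata, memory_type="preference"))
```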
## Memory Schema

Each memory entry is stored in the `_hatidata_agent_memory` table:
| Column | Type | Description |
|---|---|---|
| `memory_id` | UUID | Primary key, used to join with vector embeddings |
| `org_id` | VARCHAR | Organization identifier |
| `agent_id` | VARCHAR | Agent that created the memory |
| `session_id` | VARCHAR | Conversation or task session grouping |
| `memory_type` | VARCHAR | Category: `fact`, `episode`, `preference`, `context` |
| `content` | TEXT | The memory content (natural language or structured data) |
| `metadata` | JSON | Arbitrary key-value metadata |
| `importance` | FLOAT | Importance score (0.0 to 1.0) |
| `has_embedding` | BOOLEAN | Whether a vector embedding has been generated |
| `access_count` | BIGINT | Number of times this memory has been retrieved |
| `last_accessed` | TIMESTAMP | When this memory was last retrieved |
| `created_at` | TIMESTAMP | When the memory was stored |
| `updated_at` | TIMESTAMP | Last modification timestamp |
## Core Components

### EmbeddingProvider

The `EmbeddingProvider` trait defines the interface for generating vector embeddings:
```rust
pub trait EmbeddingProvider: Send + Sync {
    /// Embed a batch of texts into vectors
    async fn embed_batch(&self, texts: &[String]) -> Result<Vec<Vec<f32>>>;

    /// Embed a single text (default: calls embed_batch with one item)
    async fn embed(&self, text: &str) -> Result<Vec<f32>>;

    /// Vector dimensionality
    fn dimensions(&self) -> usize;

    /// Model name for logging and metadata
    fn model_name(&self) -> &str;
}
```
HatiData includes a `MockEmbeddingProvider` for testing that generates deterministic vectors using a SHA-256-seeded RNG. In production, configure an embedding provider that calls your preferred embedding API (OpenAI, Cohere, etc.).
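The same deterministic-mock idea can be sketched in Python (a hypothetical `mock_embed` in the spirit of `MockEmbeddingProvider`, not its actual implementation): hash the text with SHA-256, seed an RNG from the digest, and draw a fixed-dimension vector, so the same text always maps to the same vector.

```python
import hashlib
import random

def mock_embed(text: str, dimensions: int = 8) -> list[float]:
    """Deterministic mock embedding: same text -> same vector."""
    # Seed an RNG from the SHA-256 digest of the input text
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    # Draw a fixed-dimension vector from the seeded RNG
    return [rng.uniform(-1.0, 1.0) for _ in range(dimensions)]

# Deterministic: repeated calls with the same text agree exactly
assert mock_embed("hello") == mock_embed("hello")
```

Determinism makes test assertions about search results reproducible without calling a real embedding API.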
### MemoryWriter

The `MemoryWriter` handles storing new memories:

- Inserts the memory record into DuckDB with `has_embedding = false`
- Dispatches the content to the `EmbeddingWorker` for async embedding
- Once the embedding is generated, updates `has_embedding = true` and stores the vector for search
This two-phase approach means writes are fast (DuckDB insert completes immediately) while embeddings are generated asynchronously in the background.
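The two-phase write can be sketched as follows. The dict and queue are illustrative stand-ins for the DuckDB table and the worker's channel; `store_memory` and `embedding_worker_step` are hypothetical names, not the SDK API.

```python
import queue
import uuid

memory_table = {}                # stand-in for the DuckDB table
embedding_queue = queue.Queue()  # stand-in for the worker's request channel

def store_memory(content: str) -> str:
    memory_id = str(uuid.uuid4())
    # Phase 1: fast synchronous insert; the write returns immediately
    memory_table[memory_id] = {"content": content, "has_embedding": False}
    # Phase 2: hand the text off to the background embedding worker
    embedding_queue.put((memory_id, content))
    return memory_id

def embedding_worker_step(embed_fn):
    # Later, in the background: embed the text and flip has_embedding
    memory_id, content = embedding_queue.get()
    memory_table[memory_id]["vector"] = embed_fn(content)
    memory_table[memory_id]["has_embedding"] = True

mid = store_memory("Revenue grew 15% in Q4")
assert memory_table[mid]["has_embedding"] is False  # visible before embedding
embedding_worker_step(lambda text: [0.0, 1.0])
assert memory_table[mid]["has_embedding"] is True   # visible after embedding
```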
### EmbeddingWorker

The `EmbeddingWorker` processes embedding requests in configurable batches:

- Uses `async_channel` for the request queue
- Batches multiple texts into a single embedding API call for efficiency
- Handles failures gracefully: memories are still searchable by metadata even if embedding fails
- Configurable batch size and flush interval
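The batch-and-flush behavior described above can be sketched synchronously in Python (a simplification of the async worker; `run_batches` and its parameters are illustrative, not the actual implementation):

```python
import time

def run_batches(requests, embed_batch_fn, batch_size=32, flush_interval=1.0):
    """Collect texts until the batch is full or the flush interval
    elapses, then make one embedding call per batch."""
    batch, deadline = [], time.monotonic() + flush_interval
    results = []
    for text in requests:
        batch.append(text)
        if len(batch) >= batch_size or time.monotonic() >= deadline:
            results.extend(embed_batch_fn(batch))  # one API call per batch
            batch, deadline = [], time.monotonic() + flush_interval
    if batch:
        results.extend(embed_batch_fn(batch))  # flush the final partial batch
    return results

# Verify batching: one API call per group of up to batch_size texts
calls = []
def fake_embed_batch(texts):
    calls.append(list(texts))
    return [[0.0] for _ in texts]

run_batches(["a", "b", "c"], fake_embed_batch, batch_size=2)
assert calls == [["a", "b"], ["c"]]
```

Batching trades a little latency for far fewer embedding API calls, which is usually the right trade-off given per-request API overhead.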
### MemoryReader

The `MemoryReader` implements hybrid search with graceful degradation:

**Full hybrid mode** (vector search + metadata):

- Embed the search query using the `EmbeddingProvider`
- Run ANN search to get the top-K candidate `memory_id` values
- JOIN candidates with metadata to apply filters (`agent_id`, `session_id`, `memory_type`)
- Return results ranked by vector similarity score

**Metadata-only fallback** (vector search unavailable):

- Fall back to metadata-based search using SQL `LIKE` and `ORDER BY` on relevance heuristics
- Results are less precise, but the system remains functional
This graceful fallback means agents never lose access to their memories, even if the vector search engine is temporarily unavailable.
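The degradation pattern can be sketched as follows (a simplified illustration, not the `MemoryReader` implementation; the keyword heuristic stands in for the SQL `LIKE` fallback):

```python
def search_memory(query, memories, vector_search=None):
    """Try vector search first; degrade to keyword matching on failure."""
    if vector_search is not None:
        try:
            return vector_search(query)  # full hybrid mode
        except ConnectionError:
            pass  # vector engine temporarily unavailable; degrade gracefully
    # Metadata-only fallback: crude LIKE-style term matching with a
    # heuristic relevance score (fraction of query terms matched)
    terms = query.lower().split()
    scored = []
    for m in memories:
        hits = sum(t in m["content"].lower() for t in terms)
        if hits:
            scored.append((hits / len(terms), m))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in scored]

memories = [{"content": "Customer prefers email"}, {"content": "Q4 revenue grew"}]
results = search_memory("customer email", memories)  # no vector engine configured
assert results[0]["content"] == "Customer prefers email"
```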
### AccessTracker

The `AccessTracker` maintains real-time access counts without write amplification:

- Uses `DashMap<Uuid, AtomicU64>` for lock-free in-memory counters
- Periodically flushes accumulated counts to DuckDB in batched `UPDATE` statements
- Tracks both `access_count` and `last_accessed` timestamps
- Access patterns can be used for memory importance scoring and eviction
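The accumulate-then-flush pattern behind the tracker can be sketched in Python (a `Counter` stands in for `DashMap<Uuid, AtomicU64>`; the class and SQL text are illustrative, not the actual internals):

```python
from collections import Counter

class AccessTracker:
    def __init__(self):
        self.pending = Counter()  # stand-in for DashMap<Uuid, AtomicU64>

    def record_access(self, memory_id: str):
        self.pending[memory_id] += 1  # cheap in-memory increment on every read

    def flush(self, execute_sql):
        # One batched UPDATE per memory_id instead of one write per access
        for memory_id, count in self.pending.items():
            execute_sql(
                "UPDATE _hatidata_agent_memory "
                "SET access_count = access_count + ?, last_accessed = now() "
                "WHERE memory_id = ?",
                (count, memory_id),
            )
        self.pending.clear()

tracker = AccessTracker()
for _ in range(3):
    tracker.record_access("m1")  # three reads, zero database writes

statements = []
tracker.flush(lambda sql, params: statements.append(params))
assert statements == [(3, "m1")]  # a single UPDATE carries all three accesses
```

Deferring the writes this way keeps read latency flat no matter how hot a memory is, at the cost of counts lagging by up to one flush interval.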
## MCP Tools
The memory system exposes five tools through the MCP protocol:
### `store_memory`
Store a new memory entry.
Input:
```json
{
  "content": "The customer prefers email communication over phone calls",
  "memory_type": "preference",
  "metadata": {
    "customer_id": "cust-42",
    "source": "support-ticket-789"
  },
  "importance": 0.8
}
```
Output:
```json
{
  "memory_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "stored",
  "has_embedding": false
}
```
The `has_embedding` field will become `true` once the async embedding worker processes the entry (typically within seconds).
### `search_memory`
Search memories using natural language or keyword queries.
Input:
```json
{
  "query": "customer communication preferences",
  "top_k": 10,
  "memory_type": "preference",
  "min_importance": 0.5
}
```
Output:
```json
[
  {
    "memory_id": "a1b2c3d4-...",
    "content": "The customer prefers email communication over phone calls",
    "memory_type": "preference",
    "importance": 0.8,
    "similarity_score": 0.94,
    "created_at": "2025-01-15T10:30:00Z"
  }
]
```
When vector search is available, `similarity_score` is a cosine similarity value. In fallback mode, it is a heuristic relevance score.
### `get_agent_state`
Retrieve a named state value for the current agent.
Input:
```json
{
  "key": "current_task"
}
```
Output:
```json
{
  "key": "current_task",
  "value": "Analyzing Q4 revenue data for the board presentation",
  "updated_at": "2025-01-15T14:20:00Z"
}
```
Agent state is stored as key-value pairs in a dedicated `_hatidata_agent_state` table, scoped by `agent_id`.
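The scoping behavior can be sketched as a composite-key store (an illustrative stand-in for the `_hatidata_agent_state` table, not the proxy implementation): the same key holds independent values for different agents.

```python
# Stand-in for _hatidata_agent_state: keyed by (agent_id, key)
state = {}

def set_agent_state(agent_id: str, key: str, value: str):
    state[(agent_id, key)] = value

def get_agent_state(agent_id: str, key: str):
    return state.get((agent_id, key))

# Two agents can use the same key without colliding
set_agent_state("research-agent", "current_task", "Analyzing Q4 revenue")
set_agent_state("report-agent", "current_task", "Generating the final report")
assert get_agent_state("research-agent", "current_task") == "Analyzing Q4 revenue"
```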
### `set_agent_state`
Set or update a named state value.
Input:
```json
{
  "key": "current_task",
  "value": "Generating the final report"
}
```
### `delete_memory`
Delete a specific memory entry by ID.
Input:
```json
{
  "memory_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}
```
Deletes the memory and its vector embedding. This operation is irreversible.
## Usage Examples

### Python SDK
```python
from hatidata_agent import HatiDataAgent

agent = HatiDataAgent(
    host="your-org.proxy.hatidata.com",
    agent_id="research-agent",
    password="hd_live_your_api_key",
)

# Store a memory
agent.store_memory(
    content="Revenue grew 15% in Q4 driven by enterprise segment",
    memory_type="fact",
    importance=0.9,
    metadata={"quarter": "Q4", "metric": "revenue"},
)

# Search memories
results = agent.search_memory(
    query="revenue growth trends",
    top_k=5,
    memory_type="fact",
)
for memory in results:
    print(f"[{memory['importance']}] {memory['content']}")

# Agent state
agent.set_state("analysis_phase", "data_collection")
phase = agent.get_state("analysis_phase")
```
### LangChain Integration
```python
from langchain_hatidata import HatiDataMemory

memory = HatiDataMemory(
    host="your-org.proxy.hatidata.com",
    agent_id="langchain-agent",
    password="hd_live_your_api_key",
    session_id="conversation-123",
)

# Memory is automatically stored/loaded by LangChain chains
from langchain.chains import ConversationChain

chain = ConversationChain(llm=llm, memory=memory)
```
See the LangChain integration page for full details.
## Configuration
Memory system behavior is controlled by environment variables on the proxy:
| Variable | Default | Description |
|---|---|---|
| `HATIDATA_MEMORY_ENABLED` | `true` | Enable/disable the memory subsystem |
| `HATIDATA_EMBEDDING_BATCH_SIZE` | `32` | Max texts per embedding API call |
| `HATIDATA_EMBEDDING_FLUSH_INTERVAL_MS` | `1000` | Max wait before flushing a partial batch |
| `HATIDATA_MEMORY_MAX_RESULTS` | `100` | Maximum results per search query |
| `HATIDATA_ACCESS_TRACKER_FLUSH_INTERVAL_SECS` | `60` | How often access counts are flushed to DuckDB |
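For example, a proxy environment tuned for larger, more frequent embedding batches might look like this (the values shown are illustrative, not recommendations):

```shell
# Example proxy environment for the memory subsystem (illustrative values)
export HATIDATA_MEMORY_ENABLED=true
export HATIDATA_EMBEDDING_BATCH_SIZE=64          # larger batches, fewer API calls
export HATIDATA_EMBEDDING_FLUSH_INTERVAL_MS=500  # flush partial batches sooner
export HATIDATA_MEMORY_MAX_RESULTS=50
export HATIDATA_ACCESS_TRACKER_FLUSH_INTERVAL_SECS=30
```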
## Next Steps
- Chain-of-Thought Ledger -- Immutable reasoning traces
- Semantic Triggers -- Event-driven automation based on memory content
- LangChain Integration -- Using memory with LangChain agents