Agentic RAG

Build a retrieval-augmented generation (RAG) system in which AI agents query both structured data (via SQL) and semantic memory through a single HatiData connection. This playbook shows how to combine SQL queries with vector-based memory search to give agents persistent, context-aware reasoning.

Architecture

+---------------+     MCP / SQL      +----------------+
|   AI Agent    | -----------------> |    HatiData    |
|   (Python)    |                    | (SQL + Memory) |
+---------------+                    +----------------+
                                             |
                                       Query Results

The agent connects to HatiData via the Postgres wire protocol or MCP. Structured queries are executed by the SQL engine. Semantic memory operations use built-in vector search, with results joined back through structured metadata for hybrid SQL + memory queries.

What's Included

  • HatiData Proxy -- Postgres-compatible query engine with built-in agent memory and vector search
  • Demo App -- Python RAG application demonstrating the full workflow

Quick Start

# Clone and start
git clone https://github.com/marviy/hatidata.git
cd hatidata/playbooks/agentic-rag
docker compose up -d

# Wait for services to be healthy
docker compose ps

# Run the demo
docker compose exec demo-app python app.py

What the Demo Does

  1. Loads sample data -- Inserts sales and customer data into HatiData tables
  2. Stores memories -- The agent stores analysis results as tagged memories with importance scores
  3. Queries with context -- The agent retrieves relevant memories to augment SQL query results
  4. Shows reasoning -- Each reasoning step is recorded in the chain-of-thought ledger

Key Concepts

Storing Memories

Agents can store observations, insights, and analysis results as persistent memories. Each memory is tagged for retrieval and assigned an importance score.

# Store a memory with tags and importance
client.call_tool("store_memory", {
    "content": "Q4 APAC revenue declined 12% due to enterprise churn",
    "tags": ["revenue", "apac", "q4", "churn"],
    "importance": 0.9,
    "ttl_hours": 720  # 30 days
})

Memories are stored with structured metadata and vector embeddings. The embedding is generated asynchronously -- typically available for search within 1 second of storage.

Searching Memories

Search memories by meaning rather than exact keyword matching. The query is embedded and compared against stored memory vectors using cosine similarity.

# Search memories by meaning, not keywords
results = client.call_tool("search_memory", {
    "query": "What happened to Asian Pacific revenue?",
    "top_k": 5,
    "min_similarity": 0.7
})

This query would match the memory stored above even though it uses different wording ("Asian Pacific" vs "APAC", no mention of "churn").
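
To make the ranking step concrete, here is a minimal sketch of cosine-similarity search with `top_k` and `min_similarity` cut-offs. It is illustrative only: the hand-written 3-dimensional vectors stand in for real embeddings, and `cosine_similarity` and `search` are hypothetical helpers, not HatiData's implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def search(query_vec, memories, top_k=5, min_similarity=0.7):
    """Return up to top_k (score, content) pairs scoring at least min_similarity."""
    scored = [(cosine_similarity(query_vec, vec), content) for content, vec in memories]
    kept = [pair for pair in scored if pair[0] >= min_similarity]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:top_k]


# Toy "embeddings", made up for illustration (assumption, not real model output).
memories = [
    ("Q4 APAC revenue declined 12% due to enterprise churn", [0.9, 0.1, 0.2]),
    ("Support ticket backlog cleared in EMEA", [0.1, 0.9, 0.1]),
]
query_vec = [0.8, 0.2, 0.1]  # stands in for the embedded question

hits = search(query_vec, memories)
for score, content in hits:
    print(f"{score:.2f}  {content}")
```

Only the first memory clears the 0.7 threshold here; the second points in a different direction and is filtered out before the `top_k` cap is applied.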

Combined SQL + Memory Queries

The real power of agentic RAG is combining SQL query results with memory context. The agent can:

  1. Run a SQL query to get current data
  2. Search memories for historical context and prior analysis
  3. Reason over both to produce informed conclusions

# Agent combines SQL results with memory context
sql_result = client.query(
    "SELECT region, SUM(revenue) FROM sales GROUP BY region"
)
memory_context = client.call_tool("search_memory", {
    "query": "revenue trends by region"
})
# Agent reasons over both to produce informed analysis
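
One possible way to hand both sources to the model is to merge them into a single text block. The sketch below assumes SQL rows arrive as `(region, revenue)` tuples and memory hits as dicts with `content` and `similarity` keys; those shapes, and the `build_context` helper itself, are hypothetical and may differ from what the client actually returns.

```python
def build_context(sql_rows, memory_hits, max_memories=3):
    """Combine current SQL rows and recalled memories into one text block."""
    lines = ["Current data (from SQL):"]
    for region, revenue in sql_rows:
        lines.append(f"- {region}: {revenue:,.0f}")
    lines.append("Prior analysis (from memory):")
    # Highest-similarity memories first, capped to keep the prompt short.
    ranked = sorted(memory_hits, key=lambda m: m["similarity"], reverse=True)
    for memory in ranked[:max_memories]:
        lines.append(f"- {memory['content']} (similarity {memory['similarity']:.2f})")
    return "\n".join(lines)


# Sample inputs, made up for illustration.
sql_rows = [("APAC", 1200000), ("EMEA", 2500000)]
memory_hits = [
    {"content": "Q4 APAC revenue declined 12% due to enterprise churn", "similarity": 0.91},
]

context = build_context(sql_rows, memory_hits)
print(context)
```

Keeping the two sources under separate labels lets the model tell current figures apart from earlier conclusions.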

Memory Lifecycle

Parameter  | Description
content    | Text content of the memory (required)
tags       | List of string tags for filtering (optional)
importance | Float 0.0-1.0 indicating priority (optional, default 0.5)
ttl_hours  | Time-to-live in hours before automatic expiry (optional)
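
A small client-side check can catch bad values before they are sent. The helper below is a hypothetical sketch, not part of HatiData: it applies the documented default of 0.5 for `importance` and the documented 0.0-1.0 range, and it additionally assumes that `ttl_hours` must be positive.

```python
def build_store_memory_args(content, tags=None, importance=0.5, ttl_hours=None):
    """Validate inputs and return an arguments dict for store_memory."""
    if not content or not content.strip():
        raise ValueError("content is required")
    if not 0.0 <= importance <= 1.0:
        raise ValueError("importance must be between 0.0 and 1.0")
    # Assumption: a zero or negative TTL is treated as a caller error.
    if ttl_hours is not None and ttl_hours <= 0:
        raise ValueError("ttl_hours must be positive")

    args = {"content": content, "importance": importance}
    if tags:
        args["tags"] = list(tags)
    if ttl_hours is not None:
        args["ttl_hours"] = ttl_hours
    return args


args = build_store_memory_args(
    "Q4 APAC revenue declined 12% due to enterprise churn",
    tags=["revenue", "apac"],
    ttl_hours=30 * 24,  # 30 days
)
print(args)
```

The resulting dict can then be passed as the second argument of `client.call_tool("store_memory", ...)`.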

Memories can be:

  • Searched -- By semantic similarity or tag filters
  • Updated -- By storing a new memory with the same content (deduplication)
  • Deleted -- Explicitly via the delete_memory MCP tool
  • Expired -- Automatically after the TTL period
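
The lifecycle above can be modelled in a few lines of plain Python. This is a toy, in-process illustration under stated assumptions: `ToyMemoryStore` is hypothetical, deduplication is keyed on exact content, and expiry is checked lazily at read time. HatiData's actual behaviour may differ on all three points.

```python
import time


class ToyMemoryStore:
    """In-process model of store / dedupe / delete / expire (illustration only)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._items = {}  # content -> record

    def store(self, content, tags=(), importance=0.5, ttl_hours=None):
        expires_at = None
        if ttl_hours is not None:
            expires_at = self._clock() + ttl_hours * 3600
        # Same content overwrites the earlier record (deduplication).
        self._items[content] = {
            "tags": set(tags),
            "importance": importance,
            "expires_at": expires_at,
        }

    def delete(self, content):
        """Remove a memory; return True if it existed."""
        return self._items.pop(content, None) is not None

    def search_by_tag(self, tag):
        now = self._clock()
        # Drop expired records lazily, at read time.
        self._items = {
            c: r for c, r in self._items.items()
            if r["expires_at"] is None or r["expires_at"] > now
        }
        return [c for c, r in self._items.items() if tag in r["tags"]]


# A fake clock makes expiry observable without waiting.
fake_now = [0.0]
store = ToyMemoryStore(clock=lambda: fake_now[0])
store.store("APAC churn rising", tags=["apac"], importance=0.4, ttl_hours=1)
store.store("APAC churn rising", tags=["apac"], importance=0.9, ttl_hours=1)  # update
store.store("EMEA stable", tags=["emea"])

print(store.search_by_tag("apac"))  # one record, not two
fake_now[0] = 2 * 3600              # two hours later
print(store.search_by_tag("apac"))  # the one-hour TTL has passed
```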

Use Cases

Use Case           | How It Works
Financial analysis | Agent queries revenue tables, stores trend observations, retrieves context on subsequent runs
Customer support   | Agent queries ticket history via SQL, searches memory for prior resolution patterns
Data exploration   | Agent stores schema insights and query patterns for faster future analysis
Research           | Agent accumulates findings across multiple query sessions

Configuration

Variable          | Default | Description
HATIDATA_DEV_MODE | true    | Skip auth for local development
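
If your own tooling needs to honour the same flag, a sketch like the following reads it with the documented default of `true`. The `dev_mode_enabled` helper and the set of accepted truthy spellings are assumptions for illustration; how HatiData itself parses the variable is not specified here.

```python
import os


def dev_mode_enabled(environ=os.environ):
    """Read HATIDATA_DEV_MODE, falling back to the documented default of "true"."""
    raw = environ.get("HATIDATA_DEV_MODE", "true")
    # Assumption: these spellings count as "on"; anything else counts as "off".
    return raw.strip().lower() in {"1", "true", "yes", "on"}


print(dev_mode_enabled({}))                              # variable unset
print(dev_mode_enabled({"HATIDATA_DEV_MODE": "false"}))  # explicitly disabled
```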

Cleanup

docker compose down -v  # Remove containers and volumes
