LlamaIndex Integration
HatiData serves as a SQL-backed retrieval layer for LlamaIndex, combining the framework's query engine abstractions with HatiData's hybrid SQL + vector search. Documents are stored in HatiData tables with automatic embedding indexing, giving you full SQL queryability alongside semantic retrieval.
Installation
pip install hatidata llama-index
Requirements: Python 3.10+, a running HatiData proxy (local or cloud).
export HATIDATA_API_KEY="hd_live_your_api_key"
export HATIDATA_HOST="localhost"
export OPENAI_API_KEY="sk-..."
SQL-Based Document Store
Store LlamaIndex documents directly in HatiData tables. Unlike filesystem or in-memory stores, documents persist across restarts and are queryable with SQL.
import os
from hatidata import HatiDataClient
client = HatiDataClient(
host=os.environ["HATIDATA_HOST"],
port=5439,
api_key=os.environ["HATIDATA_API_KEY"],
)
# Create a document table
client.execute("""
CREATE TABLE IF NOT EXISTS documents (
doc_id TEXT PRIMARY KEY,
content TEXT NOT NULL,
source TEXT,
category TEXT,
created_at TIMESTAMPTZ DEFAULT now()
)
""")
# Insert documents
docs = [
("doc-001", "HatiData uses a high-performance query engine...", "docs", "architecture"),
("doc-002", "Semantic triggers fire when content similarity exceeds a threshold...", "docs", "features"),
("doc-003", "Branch isolation creates copy-on-write schema branches...", "docs", "features"),
]
client.executemany(
"INSERT INTO documents (doc_id, content, source, category) VALUES (?, ?, ?, ?)",
docs,
)
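Because documents live in an ordinary table, plain SQL works alongside retrieval. A quick sketch; the commented result assumes only the three rows inserted above:

```python
# Aggregate the document table with ordinary SQL -- no retrieval API needed.
count_by_category = """
SELECT category, count(*) AS n
FROM documents
GROUP BY category
ORDER BY n DESC, category
"""
# rows = client.query(count_by_category)
# e.g. [{"category": "features", "n": 2}, {"category": "architecture", "n": 1}]
```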
QueryEngine with Hybrid Retrieval
Build a LlamaIndex QueryEngine that retrieves context from HatiData using hybrid SQL + vector search, then synthesizes an answer with an LLM.
from llama_index.core import VectorStoreIndex, Document, Settings
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import BaseRetriever
from llama_index.core.schema import NodeWithScore, TextNode, QueryBundle
from llama_index.llms.openai import OpenAI
class HatiDataRetriever(BaseRetriever):
"""Custom retriever that uses HatiData's semantic_match for hybrid search."""
def __init__(self, client: HatiDataClient, table: str, top_k: int = 5):
super().__init__()
self._client = client
self._table = table
self._top_k = top_k
def _retrieve(self, query_bundle: QueryBundle) -> list[NodeWithScore]:
query_text = query_bundle.query_str.replace("'", "''")
rows = self._client.query(f"""
SELECT doc_id, content, source, category,
semantic_rank(content, '{query_text}') AS relevance
FROM {self._table}
WHERE semantic_match(content, '{query_text}', 0.65)
ORDER BY relevance DESC
LIMIT {self._top_k}
""")
nodes = []
for row in rows:
node = TextNode(
text=row["content"],
id_=row["doc_id"],
metadata={"source": row["source"], "category": row["category"]},
)
nodes.append(NodeWithScore(node=node, score=row["relevance"]))
return nodes
# Build the query engine
Settings.llm = OpenAI(model="gpt-4o", temperature=0)
retriever = HatiDataRetriever(client, table="documents", top_k=5)
query_engine = RetrieverQueryEngine.from_args(retriever)
response = query_engine.query("How does branch isolation work?")
print(response)
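The SQL that `_retrieve` interpolates can be factored into a standalone helper, which makes the escaping and the shape of the statement easy to unit-test. A sketch: `build_semantic_query` is not part of the SDK, and single-quote doubling is the only escaping applied, mirroring the method above:

```python
def build_semantic_query(
    table: str, query_text: str, top_k: int = 5, threshold: float = 0.65
) -> str:
    """Build the hybrid-search SQL issued by HatiDataRetriever._retrieve."""
    escaped = query_text.replace("'", "''")  # double single quotes for SQL literals
    return (
        f"SELECT doc_id, content, source, category, "
        f"semantic_rank(content, '{escaped}') AS relevance "
        f"FROM {table} "
        f"WHERE semantic_match(content, '{escaped}', {threshold}) "
        f"ORDER BY relevance DESC LIMIT {top_k}"
    )

sql = build_semantic_query("documents", "What's branch isolation?")
print(sql)
```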
Index Configuration
Embedding Model Selection
HatiData's proxy handles embedding internally when you use semantic_match() and semantic_rank(). The configured embedding provider (OpenAI or custom) generates vectors at query time. You do not need to configure a separate embedding model in LlamaIndex for the retrieval step.
For ingestion-time embedding (pre-computing vectors at insert time), use HatiData's memory API:
from hatidata.memory import MemoryClient
memory = MemoryClient(client)
# Store documents with automatic embedding
for doc_id, content, source, category in docs:
memory.store(
agent_id="llamaindex-agent",
content=content,
metadata={
"doc_id": doc_id,
"source": source,
"category": category,
},
)
Metadata Filtering
Combine semantic search with SQL metadata filters for precise retrieval:
class FilteredHatiDataRetriever(BaseRetriever):
"""Retriever with SQL metadata filtering."""
def __init__(self, client: HatiDataClient, table: str, category: str | None = None, top_k: int = 5):
super().__init__()
self._client = client
self._table = table
self._category = category
self._top_k = top_k
def _retrieve(self, query_bundle: QueryBundle) -> list[NodeWithScore]:
query_text = query_bundle.query_str.replace("'", "''")
category_clause = (
f"AND category = '{self._category}'" if self._category else ""
)
rows = self._client.query(f"""
SELECT doc_id, content, source, category,
semantic_rank(content, '{query_text}') AS relevance
FROM {self._table}
WHERE semantic_match(content, '{query_text}', 0.65)
{category_clause}
ORDER BY relevance DESC
LIMIT {self._top_k}
""")
return [
NodeWithScore(
node=TextNode(text=r["content"], id_=r["doc_id"]),
score=r["relevance"],
)
for r in rows
]
# Only search architecture documents
retriever = FilteredHatiDataRetriever(
client, table="documents", category="architecture", top_k=3
)
Multi-Index Pattern
Use multiple HatiData tables as separate indices for different document collections, then merge results in a single query engine.
from llama_index.core.retrievers import QueryFusionRetriever
docs_retriever = HatiDataRetriever(client, table="documents", top_k=3)
kb_retriever = HatiDataRetriever(client, table="knowledge_base", top_k=3)
fusion_retriever = QueryFusionRetriever(
[docs_retriever, kb_retriever],
num_queries=1,
use_async=False,
)
query_engine = RetrieverQueryEngine.from_args(fusion_retriever)
response = query_engine.query("What security features are available?")
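QueryFusionRetriever merges the two ranked result lists before synthesis. One common merge strategy is reciprocal rank fusion; this is a minimal pure-Python sketch of the idea, not LlamaIndex's exact implementation:

```python
def reciprocal_rank_fusion(result_lists: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked doc-id lists; docs ranked highly in several lists win."""
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results):
            # Each appearance contributes 1/(k + rank + 1); high ranks count more.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

merged = reciprocal_rank_fusion([
    ["doc-003", "doc-001"],   # from the documents table
    ["kb-007", "doc-003"],    # from the knowledge_base table
])
print(merged)  # doc-003 ranks first: it appears near the top of both lists
```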
Persisting Chat History
Store LlamaIndex chat engine history in HatiData for cross-session persistence:
from llama_index.core.chat_engine import SimpleChatEngine
def save_chat_history(session_id: str, messages: list):
for msg in messages:
memory.store(
agent_id="llamaindex-chat",
content=f"{msg.role}: {msg.content}",
metadata={"session_id": session_id, "role": msg.role},
)
def load_chat_history(session_id: str) -> list:
return memory.search(
agent_id="llamaindex-chat",
query="",
filters={"session_id": session_id},
top_k=50,
min_score=0.0,
)
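Rows come back as "role: content" strings, so restoring them as message pairs needs only a small parser. A sketch matching the f-string format in save_chat_history; a literal ": " inside a role name would break it:

```python
def parse_history_row(content: str) -> tuple[str, str]:
    """Split the 'role: content' strings produced by save_chat_history."""
    # partition splits on the first ': ' only, so colons in the message survive.
    role, _, text = content.partition(": ")
    return role, text

role, text = parse_history_row("user: How does branch isolation work?")
print(role)  # user
print(text)  # How does branch isolation work?
```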
Related Concepts
- Hybrid SQL -- semantic_match and semantic_rank reference
- Persistent Memory -- Memory storage and retrieval
- SQL Functions & Types -- Full SQL function reference
- LangChain Integration -- Alternative framework integration
- Python SDK -- Direct SDK usage without LlamaIndex