Python SDK
The hatidata-agent package provides an agent-aware Python client that connects to HatiData via the Postgres wire protocol. Agents identify themselves through startup parameters, enabling per-agent billing, scheduling, and audit trails.
Installation
pip install hatidata-agent
Requirements: Python 3.9+. The package includes psycopg2-binary for Postgres connectivity.
Optional extras:
# LangChain integration
pip install "hatidata-agent[langchain]"
# MCP server
pip install "hatidata-agent[mcp]"
# Async support (asyncpg)
pip install "hatidata-agent[async]"
# Everything
pip install "hatidata-agent[all]"
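To check which optional integrations are importable in the current environment, a small helper along these lines may be useful. It is only a sketch: the module names `langchain` and `mcp` are assumptions about what those extras install (only asyncpg is named in the install notes), so adjust the mapping if the extras provide different modules.

```python
from importlib.util import find_spec

# Assumed mapping of extra name -> top-level module it provides.
# Only asyncpg is confirmed by the install notes; the others are guesses.
EXTRA_MODULES = {
    "langchain": "langchain",
    "mcp": "mcp",
    "async": "asyncpg",
}


def available_extras(modules=EXTRA_MODULES):
    """Return {extra_name: bool} saying whether each module can be found."""
    return {
        extra: find_spec(module) is not None
        for extra, module in modules.items()
    }


if __name__ == "__main__":
    for extra, installed in available_extras().items():
        print(f"{extra}: {'installed' if installed else 'missing'}")
```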
Basic Usage
Connect and Query
from hatidata_agent import HatiDataAgent

agent = HatiDataAgent(
    host="localhost",  # Proxy hostname
    port=5439,  # Proxy port (default 5439)
    agent_id="my-agent",  # Unique agent identifier
    framework="custom",  # AI framework (langchain, crewai, autogen, etc.)
    database="hatidata",  # Database name
    user="agent",  # Username
    password="",  # Password or API key
)

# SELECT query -- returns list of dicts
rows = agent.query("SELECT * FROM customers WHERE status = 'active' LIMIT 10")
for row in rows:
    print(row["name"], row["email"])

# INSERT/UPDATE/DELETE -- returns affected row count
count = agent.execute("INSERT INTO logs VALUES (1, 'agent started', NOW())")
print(f"{count} rows inserted")
Context Manager
with HatiDataAgent(host="localhost", agent_id="my-agent") as agent:
    rows = agent.query("SELECT COUNT(*) FROM orders")
# Connection is automatically closed when the block exits
Constructor Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| host | str | "localhost" | Proxy hostname |
| port | int | 5439 | Proxy port |
| agent_id | str | Auto-generated | Unique agent identifier |
| framework | str | "custom" | AI framework name |
| database | str | "hatidata" | Database name |
| user | str | "agent" | Username |
| password | str | "" | Password or API key |
| priority | str | "normal" | Query priority (low, normal, high, critical) |
| connect_timeout | int | 10 | Connection timeout in seconds |
The agent_id and framework parameters are sent to the proxy as startup parameters when the connection opens. The proxy uses them for per-agent billing, scheduling, and audit trails.
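In deployments where connection settings live outside the code, the constructor arguments can be assembled from environment variables. The sketch below assumes hypothetical `HATIDATA_*` variable names (they are not defined by the SDK) and falls back to the defaults listed in the table; `agent_id` is left out when unset so the client can auto-generate one.

```python
import os

VALID_PRIORITIES = ("low", "normal", "high", "critical")


def agent_kwargs_from_env(env=None):
    """Build HatiDataAgent keyword arguments from environment variables.

    The HATIDATA_* names are hypothetical; defaults mirror the parameter table.
    """
    env = os.environ if env is None else env
    priority = env.get("HATIDATA_PRIORITY", "normal")
    if priority not in VALID_PRIORITIES:
        raise ValueError(
            f"priority must be one of {VALID_PRIORITIES}, got {priority!r}"
        )
    kwargs = {
        "host": env.get("HATIDATA_HOST", "localhost"),
        "port": int(env.get("HATIDATA_PORT", "5439")),
        "framework": env.get("HATIDATA_FRAMEWORK", "custom"),
        "database": env.get("HATIDATA_DATABASE", "hatidata"),
        "user": env.get("HATIDATA_USER", "agent"),
        "password": env.get("HATIDATA_PASSWORD", ""),
        "priority": priority,
        "connect_timeout": int(env.get("HATIDATA_CONNECT_TIMEOUT", "10")),
    }
    # Only pass agent_id when it is set explicitly.
    if "HATIDATA_AGENT_ID" in env:
        kwargs["agent_id"] = env["HATIDATA_AGENT_ID"]
    return kwargs
```

The result can then be unpacked into the constructor, for example `HatiDataAgent(**agent_kwargs_from_env())`.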
Reasoning Chains
Track multi-step reasoning with the reasoning_chain context manager. Each query in the chain is tagged with a request ID and step number for end-to-end tracing:
agent = HatiDataAgent(host="localhost", agent_id="analyst-agent")

with agent.reasoning_chain("req-001") as chain:
    # Step 0: Discover tables
    tables = chain.query("SELECT table_name FROM information_schema.tables")
    # Step 1: Explore data
    data = chain.query("SELECT * FROM customers LIMIT 10", step=1)
    # Step 2: Aggregate
    summary = chain.query(
        "SELECT status, COUNT(*) as cnt FROM customers GROUP BY status",
        step=2,
    )
Nested reasoning chains are supported via parent_request_id:
with agent.reasoning_chain("req-002", parent_request_id="req-001") as sub_chain:
    detail = sub_chain.query("SELECT * FROM orders WHERE customer_id = 42")
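When the steps of a chain are known up front, the step numbering can be derived from the position of each statement instead of being written by hand. The helper below is a sketch, not part of the SDK: `run_chain` is a hypothetical name, and it assumes that `chain.query` accepts `step=0` explicitly for the first statement.

```python
import uuid


def run_chain(agent, statements, request_id=None):
    """Run SQL statements in order inside a single reasoning chain.

    `agent` must expose reasoning_chain(request_id) as shown above. Each
    statement is tagged with its list position as the step number.
    Returns (request_id, list of result sets).
    """
    # Generate an ID when the caller does not supply one.
    request_id = request_id or f"req-{uuid.uuid4().hex[:12]}"
    results = []
    with agent.reasoning_chain(request_id) as chain:
        for step, sql in enumerate(statements):
            results.append(chain.query(sql, step=step))
    return request_id, results
```

Returning the request ID alongside the results makes it easy to pass it on as `parent_request_id` for a nested chain.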
RAG Context Retrieval
HatiData provides built-in context retrieval functions for RAG workflows:
Full-Text Search
# Search for relevant rows using natural language
context = agent.get_context(
    table="knowledge_base",
    search_query="enterprise pricing tiers",
    top_k=5,
)
# Returns: [{"id": 1, "content": "...", "title": "..."}, ...]
Vector Similarity Search
# Search by embedding vector
context = agent.get_rag_context(
    table="documents",
    embedding_col="embedding",
    vector=[0.1, 0.2, 0.3, ...],  # Your query embedding
    top_k=10,
)
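Retrieved rows usually need to be flattened into text before they go into a prompt. The following sketch shows one way to do that; `format_context` is a hypothetical helper, and it assumes rows carry `content` and `title` keys like the full-text search example above (the columns of your own tables may differ).

```python
def format_context(rows, text_key="content", title_key="title", max_chars=2000):
    """Join retrieved rows into one prompt-ready string.

    Rows without text are skipped. Rows are added in order until the next
    one would push the total (excluding separators) past max_chars.
    """
    parts = []
    used = 0
    for row in rows:
        text = str(row.get(text_key, "")).strip()
        if not text:
            continue
        title = row.get(title_key)
        block = f"[{title}]\n{text}" if title else text
        if used + len(block) > max_chars:
            break
        parts.append(block)
        used += len(block)
    return "\n\n".join(parts)
```

Because results are expected to arrive ranked, truncating from the end keeps the most relevant rows within the budget.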
Parameterized Queries
Use parameterized queries to prevent SQL injection:
rows = agent.query(
    "SELECT * FROM orders WHERE customer_id = %s AND status = %s",
    params=(42, "active"),
)
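A common follow-up is an `IN (...)` filter with a variable number of values. One way to keep this parameterized is to generate one `%s` per value and pass the values separately, as sketched below. `expand_in` and the `{placeholders}` marker are illustrative names, not SDK features.

```python
def expand_in(sql_template, values):
    """Replace the {placeholders} marker with one %s per value.

    Only placeholder markers are formatted into the SQL text; the values
    travel separately as params, so they are never interpolated.
    """
    values = tuple(values)
    if not values:
        raise ValueError("IN () needs at least one value")
    placeholders = ", ".join(["%s"] * len(values))
    return sql_template.format(placeholders=placeholders), values
```

Usage, assuming the `agent.query(sql, params=...)` call shown above:

```python
sql, ids = expand_in(
    "SELECT * FROM orders WHERE customer_id IN ({placeholders}) AND status = %s",
    [1, 2, 3],
)
# rows = agent.query(sql, params=ids + ("active",))
```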
Request Tracing
Tag individual queries with a request ID for tracing:
rows = agent.query(
    "SELECT * FROM orders",
    request_id="trace-abc-123",
)
The request ID appears in the audit log, allowing you to correlate agent actions with specific queries.
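To make that correlation easier on the application side, the same request ID can be written to the application log. The wrapper below is a sketch under a few assumptions: `traced_query` and the logger name are hypothetical, and it assumes `agent.query` accepts `params=None`.

```python
import logging
import time
import uuid

logger = logging.getLogger("hatidata.tracing")  # logger name is arbitrary


def traced_query(agent, sql, params=None, request_id=None):
    """Run agent.query with a request ID and log the ID, row count, and duration."""
    # Generate a unique ID when the caller does not supply one.
    request_id = request_id or f"trace-{uuid.uuid4()}"
    started = time.perf_counter()
    rows = agent.query(sql, params=params, request_id=request_id)
    elapsed_ms = (time.perf_counter() - started) * 1000
    logger.info(
        "request_id=%s rows=%d elapsed_ms=%.1f", request_id, len(rows), elapsed_ms
    )
    return request_id, rows
```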
Methods Reference
| Method | Returns | Description |
|---|---|---|
| query(sql, params, request_id) | list[dict] | Execute SELECT, return rows as dicts |
| execute(sql, params) | int | Execute INSERT/UPDATE/DELETE, return row count |
| get_context(table, search_query, top_k) | list[dict] | Full-text search context retrieval |
| get_rag_context(table, embedding_col, vector, top_k) | list[dict] | Vector similarity search |
| reasoning_chain(request_id, parent_request_id) | Context manager | Multi-step reasoning tracking |
| close() | None | Close the database connection |
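For type hints and test doubles, the table above can be mirrored as a structural type. This is a sketch and not part of the SDK: `AgentClient` and `row_count` are hypothetical names, and which parameters are optional is an assumption based on the earlier examples.

```python
from typing import Any, ContextManager, Optional, Protocol, Sequence, runtime_checkable


@runtime_checkable
class AgentClient(Protocol):
    """Structural type mirroring the methods table (defaults are assumed)."""

    def query(self, sql: str, params: Optional[Sequence[Any]] = None,
              request_id: Optional[str] = None) -> list: ...

    def execute(self, sql: str, params: Optional[Sequence[Any]] = None) -> int: ...

    def get_context(self, table: str, search_query: str, top_k: int) -> list: ...

    def get_rag_context(self, table: str, embedding_col: str,
                        vector: Sequence[float], top_k: int) -> list: ...

    def reasoning_chain(self, request_id: str,
                        parent_request_id: Optional[str] = None) -> ContextManager[Any]: ...

    def close(self) -> None: ...


def row_count(agent: AgentClient, sql: str) -> int:
    """Example of a function typed against the protocol rather than the SDK class."""
    return len(agent.query(sql))
```

Code written against the protocol can be exercised in unit tests with a simple stub, without a running proxy.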
Using with Standard psycopg2
Since HatiData speaks the Postgres wire protocol, you can also use psycopg2 directly:
import psycopg2
conn = psycopg2.connect(
    host="localhost",
    port=5439,
    user="agent",
    password="",
    dbname="hatidata",
    application_name="my-agent/custom",  # agent_id/framework
)
with conn.cursor() as cur:
    cur.execute("SELECT * FROM customers LIMIT 10")
    rows = cur.fetchall()
conn.close()
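Unlike `agent.query`, a plain psycopg2 cursor returns rows as tuples. If dict rows are preferred, the column names can be taken from `cursor.description`, which the Python DB-API (PEP 249) defines as a sequence whose entries start with the column name. The helper name `rows_as_dicts` is illustrative.

```python
def rows_as_dicts(cursor):
    """Convert the current result set of a DB-API cursor into a list of dicts."""
    if cursor.description is None:  # statement produced no result set
        return []
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]
```

Call it right after `cur.execute(...)` in place of `cur.fetchall()`. psycopg2 also ships `psycopg2.extras.RealDictCursor`, which can be passed as `cursor_factory` to get dict-like rows directly.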
Next Steps
- TypeScript SDK -- Node.js client
- dbt Adapter -- Run dbt models against HatiData
- Agent Integrations -- LangChain, CrewAI, Autogen
- MCP Server -- Model Context Protocol integration