Docker Quickstart Recipes
Three copy-paste Docker Compose setups for common agent scenarios. Each recipe includes a docker-compose.yml, .env file, startup command, and a verification query to confirm everything is working.
Prerequisites
- Docker and Docker Compose v2 installed
- A HatiData API key
Getting Your API Key
HatiData always requires authentication -- there is no anonymous or unauthenticated mode, even for local development. Every connection to the SQL proxy (port 5439) and MCP server (port 5440) must include a valid API key.
Option A: Sign Up for a Free Account (Recommended)
- Create an account at hatidata.com/signup
- Open the dashboard at app.hatidata.com
- Navigate to API Keys in the sidebar
- Click Create Key and choose the developer scope
- Copy the key (it starts with hd_live_ or hd_test_) -- you will only see it once
Option B: Use the Full Dev Stack with Pre-Seeded Keys
If you clone the HatiData repository and run the full dev stack (cd dev && make up), the control plane auto-seeds demo API keys on first startup:
| Key | Org | Tier |
|---|---|---|
| hd_agent_DemoFreeAgentstartLabs000000001 | AgentStart Labs | Free |
| hd_agent_DemoCloudNexaflowAI000000000001 | NexaFlow AI | Cloud |
| hd_agent_DemoGrowthVelocityData000000001 | Velocity Data Co | Growth |
| hd_agent_DemoEnterpriseSentinelFin000001 | Global Sentinel Financial | Enterprise |
These keys are validated by the control plane using Argon2id hashes -- they are real keys, not dev bypass tokens.
The Docker Compose recipes below use a simplified single-container setup. Replace ${HATIDATA_API_KEY} in the .env file with your actual key from Option A or Option B. Placeholder values like hd_dev_* will be rejected -- all keys must be valid 40-character strings with a recognized prefix (hd_live_, hd_test_, or hd_agent_).
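If you want to catch a malformed key before the proxy rejects it, the format rules above (40 characters, recognized prefix) can be checked client-side. This is an illustrative sketch, not part of any SDK, and it only checks shape -- the server still validates the key against its Argon2id hashes:

```python
# Recognized key prefixes, per the format rules above.
VALID_PREFIXES = ("hd_live_", "hd_test_", "hd_agent_")

def looks_like_valid_key(key: str) -> bool:
    """Client-side sanity check only: 40 characters total with a
    recognized prefix. Passing this does not mean the key is active."""
    return len(key) == 40 and key.startswith(VALID_PREFIXES)
```

Running this against the seeded demo keys returns True; placeholder values like hd_dev_* fail the prefix check.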
Recipe 1: Support Agent with Memory + Triggers
A support agent that remembers past interactions and fires a webhook when it detects frustrated customers.
docker-compose.yml
version: "3.8"
services:
hatidata-proxy:
image: ghcr.io/hatidata/hatidata-proxy:latest
ports:
- "5439:5439"
environment:
- HATIDATA_LISTEN_ADDR=0.0.0.0:5439
- HATIDATA_DATA_DIR=/data
- HATIDATA_AUTH_API_KEY=${HATIDATA_API_KEY}
- HATIDATA_EMBEDDING_ENABLED=true
- HATIDATA_EMBEDDING_MODEL=bge-small-en-v1.5
- HATIDATA_TRIGGERS_ENABLED=true
- HATIDATA_LOG_LEVEL=info
volumes:
- hatidata-support-data:/data
healthcheck:
test: ["CMD", "pg_isready", "-h", "localhost", "-p", "5439"]
interval: 5s
timeout: 3s
retries: 10
embedding-service:
image: ghcr.io/hatidata/embedding-service:latest
ports:
- "8090:8090"
environment:
- MODEL_NAME=bge-small-en-v1.5
- LISTEN_ADDR=0.0.0.0:8090
volumes:
hatidata-support-data:
.env
# Paste your real API key here (from dashboard or dev seed)
HATIDATA_API_KEY=hd_live_your_actual_key_from_dashboard
Start
docker compose up -d
Wait for the health check to pass (about 10 seconds), then verify:
Verify
# Connect and store a memory
PGPASSWORD="${HATIDATA_API_KEY}" psql -h localhost -p 5439 -U admin -c "
INSERT INTO _hatidata_agent_memory (agent_id, content, metadata)
VALUES (
'support-agent',
'Customer reported billing issue with invoice #4521. Resolved by issuing credit.',
'{\"ticket_id\": \"T-4521\", \"category\": \"billing\", \"resolution\": \"credit_issued\"}'
);
"
# Search by semantic similarity
PGPASSWORD="${HATIDATA_API_KEY}" psql -h localhost -p 5439 -U admin -c "
SELECT content, semantic_match(content, 'billing problem') AS similarity
FROM _hatidata_agent_memory
WHERE agent_id = 'support-agent'
ORDER BY semantic_rank(content, 'billing problem')
LIMIT 5;
"
# Register a trigger for frustrated customers
PGPASSWORD="${HATIDATA_API_KEY}" psql -h localhost -p 5439 -U admin -c "
SELECT register_trigger(
'frustration-detector',
'support-agent',
'customer is angry, frustrated, or threatening to cancel',
0.75,
'webhook',
'{\"url\": \"http://host.docker.internal:9000/alerts\"}'
);
"
If the SELECT returns your stored memory with a similarity score, the setup is working. See the Support Agent Tutorial for the full agent implementation.
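The trigger above posts to http://host.docker.internal:9000/alerts, so you need something listening there to see it fire. A minimal receiver you can run on the host, using only the Python standard library (the payload shape is an assumption -- inspect whatever your trigger actually sends):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class AlertHandler(BaseHTTPRequestHandler):
    """Accept POSTs from the frustration-detector trigger and log them."""

    def do_POST(self):
        if self.path != "/alerts":
            self.send_response(404)
            self.end_headers()
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        # Payload fields are an assumption -- print whatever arrives.
        print("trigger fired:", payload)
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep stdout limited to the alert payloads

def run(port: int = 9000) -> None:
    """Serve forever on the given port; Ctrl-C to stop."""
    HTTPServer(("0.0.0.0", port), AlertHandler).serve_forever()
```

Call run() in a terminal on the host. host.docker.internal resolves to the host from inside the container on Docker Desktop; on Linux, add extra_hosts: ["host.docker.internal:host-gateway"] to the hatidata-proxy service.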
Recipe 2: Research Agent with Branching
A research agent that explores hypotheses in isolated branches, merging successful findings back to main.
docker-compose.yml
version: "3.8"
services:
hatidata-proxy:
image: ghcr.io/hatidata/hatidata-proxy:latest
ports:
- "5439:5439"
environment:
- HATIDATA_LISTEN_ADDR=0.0.0.0:5439
- HATIDATA_DATA_DIR=/data
- HATIDATA_AUTH_API_KEY=${HATIDATA_API_KEY}
- HATIDATA_EMBEDDING_ENABLED=true
- HATIDATA_EMBEDDING_MODEL=bge-small-en-v1.5
- HATIDATA_BRANCHES_ENABLED=true
- HATIDATA_BRANCHES_MAX_PER_AGENT=10
- HATIDATA_BRANCHES_DEFAULT_TTL_HOURS=48
- HATIDATA_COT_ENABLED=true
- HATIDATA_LOG_LEVEL=info
volumes:
- hatidata-research-data:/data
healthcheck:
test: ["CMD", "pg_isready", "-h", "localhost", "-p", "5439"]
interval: 5s
timeout: 3s
retries: 10
embedding-service:
image: ghcr.io/hatidata/embedding-service:latest
ports:
- "8090:8090"
environment:
- MODEL_NAME=bge-small-en-v1.5
- LISTEN_ADDR=0.0.0.0:8090
volumes:
hatidata-research-data:
.env
# Paste your real API key here (from dashboard or dev seed)
HATIDATA_API_KEY=hd_live_your_actual_key_from_dashboard
Start
docker compose up -d
Verify
# Create a research dataset
PGPASSWORD="${HATIDATA_API_KEY}" psql -h localhost -p 5439 -U admin -c "
CREATE TABLE research_papers (
paper_id TEXT PRIMARY KEY,
title TEXT NOT NULL,
abstract TEXT,
category TEXT,
citation_count INTEGER DEFAULT 0
);
INSERT INTO research_papers VALUES
('p1', 'Attention Is All You Need', 'We propose a new architecture...', 'transformers', 95000),
('p2', 'BERT: Pre-training of Deep Bidirectional Transformers', 'We introduce BERT...', 'transformers', 78000),
('p3', 'GPT-4 Technical Report', 'We report the development...', 'language_models', 12000);
"
# Verify branching works: create a branch, modify data, query it
PGPASSWORD="${HATIDATA_API_KEY}" psql -h localhost -p 5439 -U admin -c "
SELECT branch_create('hypothesis-1', 'research-agent', 'Test citation threshold hypothesis', 48);
"
# The branch_create function returns the branch_id.
# Use it with branch_query and branch_write in your agent code.
# See the Research Agent Tutorial for the full workflow.
See the Research Agent Branching Tutorial for the complete implementation with Python code.
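In agent code, the branch_create call above can be wrapped with any DB-API driver (psycopg2, for example). A sketch, assuming branch_create returns the branch_id as a single-row, single-column result -- the helper name here is ours, not part of any SDK:

```python
def create_hypothesis_branch(conn, name: str, agent_id: str,
                             description: str, ttl_hours: int = 48) -> str:
    """Call the branch_create() SQL function shown above and return
    the new branch_id. Assumes a DB-API connection (e.g. psycopg2)."""
    with conn.cursor() as cur:
        cur.execute(
            "SELECT branch_create(%s, %s, %s, %s)",
            (name, agent_id, description, ttl_hours),
        )
        return cur.fetchone()[0]
```

Pass the returned branch_id to branch_query and branch_write as described in the tutorial.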
Recipe 3: Multi-Agent Pipeline with Shared Memory
Three agents (ingestion, analysis, reporting) sharing a common memory pool, with CoT logging for full traceability.
docker-compose.yml
version: "3.8"
services:
hatidata-proxy:
image: ghcr.io/hatidata/hatidata-proxy:latest
ports:
- "5439:5439"
environment:
- HATIDATA_LISTEN_ADDR=0.0.0.0:5439
- HATIDATA_DATA_DIR=/data
- HATIDATA_AUTH_API_KEY=${HATIDATA_API_KEY}
- HATIDATA_EMBEDDING_ENABLED=true
- HATIDATA_EMBEDDING_MODEL=bge-small-en-v1.5
- HATIDATA_COT_ENABLED=true
- HATIDATA_TRIGGERS_ENABLED=true
- HATIDATA_BRANCHES_ENABLED=true
- HATIDATA_LOG_LEVEL=info
volumes:
- hatidata-pipeline-data:/data
healthcheck:
test: ["CMD", "pg_isready", "-h", "localhost", "-p", "5439"]
interval: 5s
timeout: 3s
retries: 10
hatidata-mcp:
image: ghcr.io/hatidata/hatidata-proxy:latest
ports:
- "5440:5440"
environment:
- HATIDATA_MCP_LISTEN_ADDR=0.0.0.0:5440
- HATIDATA_MCP_ENABLED=true
- HATIDATA_DATA_DIR=/data
- HATIDATA_AUTH_API_KEY=${HATIDATA_API_KEY}
- HATIDATA_LOG_LEVEL=info
volumes:
- hatidata-pipeline-data:/data
depends_on:
hatidata-proxy:
condition: service_healthy
embedding-service:
image: ghcr.io/hatidata/embedding-service:latest
ports:
- "8090:8090"
environment:
- MODEL_NAME=bge-small-en-v1.5
- LISTEN_ADDR=0.0.0.0:8090
volumes:
hatidata-pipeline-data:
.env
# Paste your real API key here (from dashboard or dev seed)
HATIDATA_API_KEY=hd_live_your_actual_key_from_dashboard
Start
docker compose up -d
Verify
# Set up the shared pipeline tables
PGPASSWORD="${HATIDATA_API_KEY}" psql -h localhost -p 5439 -U admin -c "
CREATE TABLE pipeline_items (
item_id TEXT PRIMARY KEY,
source TEXT NOT NULL,
raw_content TEXT,
status TEXT DEFAULT 'pending',
ingested_by TEXT,
analyzed_by TEXT,
ingested_at TIMESTAMP DEFAULT NOW(),
analyzed_at TIMESTAMP
);
CREATE TABLE pipeline_reports (
report_id TEXT PRIMARY KEY,
item_id TEXT REFERENCES pipeline_items(item_id),
summary TEXT,
risk_score FLOAT,
generated_by TEXT,
created_at TIMESTAMP DEFAULT NOW()
);
"
# Simulate the ingestion agent storing a memory
PGPASSWORD="${HATIDATA_API_KEY}" psql -h localhost -p 5439 -U admin -c "
INSERT INTO _hatidata_agent_memory (agent_id, content, metadata)
VALUES (
'ingestion-agent',
'Ingested SEC filing 10-K from ACME Corp. Revenue up 12% YoY. Flagged for analysis.',
'{\"source\": \"sec_edgar\", \"filing_type\": \"10-K\", \"company\": \"ACME Corp\"}'
);
"
# Simulate the analysis agent searching shared memory
PGPASSWORD="${HATIDATA_API_KEY}" psql -h localhost -p 5439 -U admin -c "
SELECT agent_id, content, semantic_match(content, 'revenue growth filing') AS similarity
FROM _hatidata_agent_memory
ORDER BY semantic_rank(content, 'revenue growth filing')
LIMIT 5;
"
# Verify CoT logging works
PGPASSWORD="${HATIDATA_API_KEY}" psql -h localhost -p 5439 -U admin -c "
INSERT INTO _hatidata_cot (agent_id, session_id, step_type, content)
VALUES ('analysis-agent', 'pipeline-run-001', 'observation', 'Received 10-K filing for ACME Corp from ingestion agent.');
"
# Verify MCP server is reachable
curl -s -X POST http://localhost:5440/mcp \
-H "Content-Type: application/json" \
-H "Authorization: Bearer ${HATIDATA_API_KEY}" \
-d '{"jsonrpc": "2.0", "method": "tools/list", "id": 1}' | python3 -m json.tool
If all queries succeed and the MCP server returns a list of tools, the multi-agent pipeline is ready. See the Multi-Agent Memory Tutorial for the full Python implementation.
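The curl check can be mirrored in Python with the standard library. A sketch, assuming only the endpoint, Bearer-token auth, and JSON-RPC body shown in the curl command above:

```python
import json
import urllib.request

def mcp_tools_list(base_url: str, api_key: str) -> dict:
    """POST a JSON-RPC tools/list request to the MCP endpoint,
    passing the API key as a Bearer token, and return the parsed reply."""
    body = json.dumps({"jsonrpc": "2.0", "method": "tools/list", "id": 1}).encode()
    req = urllib.request.Request(
        f"{base_url}/mcp",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

For the recipes above, base_url is http://localhost:5440.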
Connecting from Python
All three recipes expose the same Postgres wire protocol on port 5439. Connect from Python using the HatiData SDK:
import os
from hatidata import HatiDataClient
client = HatiDataClient(
host="localhost",
port=5439,
api_key=os.environ["HATIDATA_API_KEY"], # Your real API key
)
# Test the connection
rows = client.query("SELECT 1 AS connected")
print(f"Connected: {rows[0]['connected']}")
Or with any Postgres driver (psycopg2, asyncpg, SQLAlchemy):
import os
import psycopg2
conn = psycopg2.connect(
host="localhost",
port=5439,
user="admin",
password=os.environ["HATIDATA_API_KEY"], # API key as password
dbname="hatidata",
)
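Whichever driver you use, the verification queries translate directly into application code. For example, the semantic-search pattern from the recipes as a small helper -- the function name is ours, but semantic_match() and semantic_rank() are the SQL functions used in the verification queries above:

```python
def search_memory(conn, agent_id: str, query: str, limit: int = 5):
    """Rank an agent's stored memories against a natural-language
    query. Assumes a DB-API connection (e.g. the psycopg2 conn above)."""
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT content, semantic_match(content, %(q)s) AS similarity
            FROM _hatidata_agent_memory
            WHERE agent_id = %(agent)s
            ORDER BY semantic_rank(content, %(q)s)
            LIMIT %(limit)s
            """,
            {"q": query, "agent": agent_id, "limit": limit},
        )
        return cur.fetchall()
```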
Cleanup
Stop and remove containers and volumes for any recipe:
# Stop containers
docker compose down
# Stop and remove volumes (deletes all data)
docker compose down -v
Related Concepts
- Quickstart -- Minimal setup without Docker
- Architecture in 60 Seconds -- How the components fit together
- Build a Support Agent -- Full support agent tutorial
- Research Agent Branching -- Full research agent tutorial
- Multi-Agent Memory -- Full multi-agent tutorial
- MCP Setup -- Configuring MCP clients