
Docker Quickstart Recipes

Three copy-paste Docker Compose setups for common agent scenarios. Each recipe includes a docker-compose.yml, .env file, startup command, and a verification query to confirm everything is working.

Prerequisites

  • Docker and Docker Compose v2 installed
  • A HatiData API key

Getting Your API Key

HatiData always requires authentication -- there is no anonymous or unauthenticated mode, even for local development. Every connection to the SQL proxy (port 5439) and MCP server (port 5440) must include a valid API key.

Option A: Create a Key in the Dashboard

  1. Create an account at hatidata.com/signup
  2. Open the dashboard at app.hatidata.com
  3. Navigate to API Keys in the sidebar
  4. Click Create Key and choose the developer scope
  5. Copy the key (it starts with hd_live_ or hd_test_) -- you will only see it once

Option B: Use the Full Dev Stack with Pre-Seeded Keys

If you clone the HatiData repository and run the full dev stack (cd dev && make up), the control plane auto-seeds demo API keys on first startup:

Key                                      | Org                       | Tier
-----------------------------------------|---------------------------|-----------
hd_agent_DemoFreeAgentstartLabs000000001 | AgentStart Labs           | Free
hd_agent_DemoCloudNexaflowAI000000000001 | NexaFlow AI               | Cloud
hd_agent_DemoGrowthVelocityData000000001 | Velocity Data Co          | Growth
hd_agent_DemoEnterpriseSentinelFin000001 | Global Sentinel Financial | Enterprise

These keys are validated by the control plane using Argon2id hashes -- they are real keys, not dev bypass tokens.

Warning: The Docker Compose recipes below use a simplified local setup. Replace ${HATIDATA_API_KEY} in the .env file with your actual key from Option A or Option B. Placeholder values like hd_dev_* will be rejected -- all keys must be valid 40-character strings with a recognized prefix (hd_live_, hd_test_, or hd_agent_).
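Before wiring a key into .env, a script can sanity-check the format rules above (recognized prefix, 40 characters total). A minimal Python sketch -- this assumes key bodies are alphanumeric, as in the seeded demo keys, and is no substitute for the control plane's Argon2id validation:

```python
import re

# Format rules from the warning above: recognized prefix, 40 characters total.
# Alphanumeric key body is an assumption based on the seeded demo keys.
KEY_PATTERN = re.compile(r"^(hd_live_|hd_test_|hd_agent_)[A-Za-z0-9]+$")

def looks_like_hatidata_key(key: str) -> bool:
    """Cheap client-side format check; real validation happens server-side."""
    return len(key) == 40 and KEY_PATTERN.fullmatch(key) is not None

print(looks_like_hatidata_key("hd_agent_DemoFreeAgentstartLabs000000001"))  # True
print(looks_like_hatidata_key("hd_dev_placeholder"))                        # False
```

A check like this catches truncated copy-pastes and leftover hd_dev_* placeholders before the proxy rejects the connection.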


Recipe 1: Support Agent with Memory + Triggers

A support agent that remembers past interactions and fires a webhook when it detects frustrated customers.

docker-compose.yml

version: "3.8"

services:
  hatidata-proxy:
    image: ghcr.io/hatidata/hatidata-proxy:latest
    ports:
      - "5439:5439"
    environment:
      - HATIDATA_LISTEN_ADDR=0.0.0.0:5439
      - HATIDATA_DATA_DIR=/data
      - HATIDATA_AUTH_API_KEY=${HATIDATA_API_KEY}
      - HATIDATA_EMBEDDING_ENABLED=true
      - HATIDATA_EMBEDDING_MODEL=bge-small-en-v1.5
      - HATIDATA_TRIGGERS_ENABLED=true
      - HATIDATA_LOG_LEVEL=info
    volumes:
      - hatidata-support-data:/data
    healthcheck:
      test: ["CMD", "pg_isready", "-h", "localhost", "-p", "5439"]
      interval: 5s
      timeout: 3s
      retries: 10

  embedding-service:
    image: ghcr.io/hatidata/embedding-service:latest
    ports:
      - "8090:8090"
    environment:
      - MODEL_NAME=bge-small-en-v1.5
      - LISTEN_ADDR=0.0.0.0:8090

volumes:
  hatidata-support-data:

.env

# Paste your real API key here (from dashboard or dev seed)
HATIDATA_API_KEY=hd_live_your_actual_key_from_dashboard

Start

docker compose up -d

Wait for the health check to pass (about 10 seconds), then verify:

Verify

# Connect and store a memory
PGPASSWORD="${HATIDATA_API_KEY}" psql -h localhost -p 5439 -U admin -c "
  INSERT INTO _hatidata_agent_memory (agent_id, content, metadata)
  VALUES (
    'support-agent',
    'Customer reported billing issue with invoice #4521. Resolved by issuing credit.',
    '{\"ticket_id\": \"T-4521\", \"category\": \"billing\", \"resolution\": \"credit_issued\"}'
  );
"

# Search by semantic similarity
PGPASSWORD="${HATIDATA_API_KEY}" psql -h localhost -p 5439 -U admin -c "
SELECT content, semantic_match(content, 'billing problem') AS similarity
FROM _hatidata_agent_memory
WHERE agent_id = 'support-agent'
ORDER BY semantic_rank(content, 'billing problem')
LIMIT 5;
"

# Register a trigger for frustrated customers
PGPASSWORD="${HATIDATA_API_KEY}" psql -h localhost -p 5439 -U admin -c "
  SELECT register_trigger(
    'frustration-detector',
    'support-agent',
    'customer is angry, frustrated, or threatening to cancel',
    0.75,
    'webhook',
    '{\"url\": \"http://host.docker.internal:9000/alerts\"}'
  );
"

If the SELECT returns your stored memory with a similarity score, the setup is working. See the Support Agent Tutorial for the full agent implementation.
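The verify queries above inline JSON metadata inside a double-quoted shell string, which forces backslash-escaping every quote. From application code it is simpler to build the payload with json.dumps and hand it to your Postgres driver as a bind parameter -- a sketch (the commented cur.execute call assumes any standard driver such as psycopg2):

```python
import json

# Same metadata as the shell example above, without manual escaping.
metadata = json.dumps({
    "ticket_id": "T-4521",
    "category": "billing",
    "resolution": "credit_issued",
})
print(metadata)

# With a Postgres driver, pass it as a parameter instead of string-building:
# cur.execute(
#     "INSERT INTO _hatidata_agent_memory (agent_id, content, metadata) "
#     "VALUES (%s, %s, %s)",
#     ("support-agent", "Customer reported billing issue ...", metadata),
# )
```

Parameter binding also sidesteps SQL-injection risk when the memory content comes from user conversations.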


Recipe 2: Research Agent with Branching

A research agent that explores hypotheses in isolated branches, merging successful findings back to main.

docker-compose.yml

version: "3.8"

services:
  hatidata-proxy:
    image: ghcr.io/hatidata/hatidata-proxy:latest
    ports:
      - "5439:5439"
    environment:
      - HATIDATA_LISTEN_ADDR=0.0.0.0:5439
      - HATIDATA_DATA_DIR=/data
      - HATIDATA_AUTH_API_KEY=${HATIDATA_API_KEY}
      - HATIDATA_EMBEDDING_ENABLED=true
      - HATIDATA_EMBEDDING_MODEL=bge-small-en-v1.5
      - HATIDATA_BRANCHES_ENABLED=true
      - HATIDATA_BRANCHES_MAX_PER_AGENT=10
      - HATIDATA_BRANCHES_DEFAULT_TTL_HOURS=48
      - HATIDATA_COT_ENABLED=true
      - HATIDATA_LOG_LEVEL=info
    volumes:
      - hatidata-research-data:/data
    healthcheck:
      test: ["CMD", "pg_isready", "-h", "localhost", "-p", "5439"]
      interval: 5s
      timeout: 3s
      retries: 10

  embedding-service:
    image: ghcr.io/hatidata/embedding-service:latest
    ports:
      - "8090:8090"
    environment:
      - MODEL_NAME=bge-small-en-v1.5
      - LISTEN_ADDR=0.0.0.0:8090

volumes:
  hatidata-research-data:

.env

# Paste your real API key here (from dashboard or dev seed)
HATIDATA_API_KEY=hd_live_your_actual_key_from_dashboard

Start

docker compose up -d

Verify

# Create a research dataset
PGPASSWORD="${HATIDATA_API_KEY}" psql -h localhost -p 5439 -U admin -c "
  CREATE TABLE research_papers (
    paper_id TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    abstract TEXT,
    category TEXT,
    citation_count INTEGER DEFAULT 0
  );

  INSERT INTO research_papers VALUES
    ('p1', 'Attention Is All You Need', 'We propose a new architecture...', 'transformers', 95000),
    ('p2', 'BERT: Pre-training of Deep Bidirectional Transformers', 'We introduce BERT...', 'transformers', 78000),
    ('p3', 'GPT-4 Technical Report', 'We report the development...', 'language_models', 12000);
"

# Verify branching works: create a branch, modify data, query it
PGPASSWORD="${HATIDATA_API_KEY}" psql -h localhost -p 5439 -U admin -c "
SELECT branch_create('hypothesis-1', 'research-agent', 'Test citation threshold hypothesis', 48);
"

# The branch_create function returns the branch_id.
# Use it with branch_query and branch_write in your agent code.
# See the Research Agent Tutorial for the full workflow.

See the Research Agent Branching Tutorial for the complete implementation with Python code.
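As a sanity check before spending a branch on it, the "citation threshold hypothesis" named in the branch_create call can be previewed locally against the three seeded rows. A toy Python mirror of the research_papers data above (the 50,000-citation threshold is an illustrative assumption, not part of the recipe):

```python
# The three rows seeded into research_papers above, mirrored as dicts.
papers = [
    {"paper_id": "p1", "category": "transformers",    "citation_count": 95000},
    {"paper_id": "p2", "category": "transformers",    "citation_count": 78000},
    {"paper_id": "p3", "category": "language_models", "citation_count": 12000},
]

# Hypothesis: every paper above the threshold falls in the transformers category.
threshold = 50_000
high_impact = [p for p in papers if p["citation_count"] > threshold]
print(all(p["category"] == "transformers" for p in high_impact))  # True for this sample
```

In the real workflow the agent would run the equivalent SELECT inside the branch via branch_query, then merge or discard based on the result.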


Recipe 3: Multi-Agent Pipeline with Shared Memory

Three agents (ingestion, analysis, reporting) sharing a common memory pool, with CoT logging for full traceability.

docker-compose.yml

version: "3.8"

services:
  hatidata-proxy:
    image: ghcr.io/hatidata/hatidata-proxy:latest
    ports:
      - "5439:5439"
    environment:
      - HATIDATA_LISTEN_ADDR=0.0.0.0:5439
      - HATIDATA_DATA_DIR=/data
      - HATIDATA_AUTH_API_KEY=${HATIDATA_API_KEY}
      - HATIDATA_EMBEDDING_ENABLED=true
      - HATIDATA_EMBEDDING_MODEL=bge-small-en-v1.5
      - HATIDATA_COT_ENABLED=true
      - HATIDATA_TRIGGERS_ENABLED=true
      - HATIDATA_BRANCHES_ENABLED=true
      - HATIDATA_LOG_LEVEL=info
    volumes:
      - hatidata-pipeline-data:/data
    healthcheck:
      test: ["CMD", "pg_isready", "-h", "localhost", "-p", "5439"]
      interval: 5s
      timeout: 3s
      retries: 10

  hatidata-mcp:
    image: ghcr.io/hatidata/hatidata-proxy:latest
    ports:
      - "5440:5440"
    environment:
      - HATIDATA_MCP_LISTEN_ADDR=0.0.0.0:5440
      - HATIDATA_MCP_ENABLED=true
      - HATIDATA_DATA_DIR=/data
      - HATIDATA_AUTH_API_KEY=${HATIDATA_API_KEY}
      - HATIDATA_LOG_LEVEL=info
    volumes:
      - hatidata-pipeline-data:/data
    depends_on:
      hatidata-proxy:
        condition: service_healthy

  embedding-service:
    image: ghcr.io/hatidata/embedding-service:latest
    ports:
      - "8090:8090"
    environment:
      - MODEL_NAME=bge-small-en-v1.5
      - LISTEN_ADDR=0.0.0.0:8090

volumes:
  hatidata-pipeline-data:

.env

# Paste your real API key here (from dashboard or dev seed)
HATIDATA_API_KEY=hd_live_your_actual_key_from_dashboard

Start

docker compose up -d

Verify

# Set up the shared pipeline tables
PGPASSWORD="${HATIDATA_API_KEY}" psql -h localhost -p 5439 -U admin -c "
  CREATE TABLE pipeline_items (
    item_id TEXT PRIMARY KEY,
    source TEXT NOT NULL,
    raw_content TEXT,
    status TEXT DEFAULT 'pending',
    ingested_by TEXT,
    analyzed_by TEXT,
    ingested_at TIMESTAMP DEFAULT NOW(),
    analyzed_at TIMESTAMP
  );

  CREATE TABLE pipeline_reports (
    report_id TEXT PRIMARY KEY,
    item_id TEXT REFERENCES pipeline_items(item_id),
    summary TEXT,
    risk_score FLOAT,
    generated_by TEXT,
    created_at TIMESTAMP DEFAULT NOW()
  );
"

# Simulate the ingestion agent storing a memory
PGPASSWORD="${HATIDATA_API_KEY}" psql -h localhost -p 5439 -U admin -c "
  INSERT INTO _hatidata_agent_memory (agent_id, content, metadata)
  VALUES (
    'ingestion-agent',
    'Ingested SEC filing 10-K from ACME Corp. Revenue up 12% YoY. Flagged for analysis.',
    '{\"source\": \"sec_edgar\", \"filing_type\": \"10-K\", \"company\": \"ACME Corp\"}'
  );
"

# Simulate the analysis agent searching shared memory
PGPASSWORD="${HATIDATA_API_KEY}" psql -h localhost -p 5439 -U admin -c "
SELECT agent_id, content, semantic_match(content, 'revenue growth filing') AS similarity
FROM _hatidata_agent_memory
ORDER BY semantic_rank(content, 'revenue growth filing')
LIMIT 5;
"

# Verify CoT logging works
PGPASSWORD="${HATIDATA_API_KEY}" psql -h localhost -p 5439 -U admin -c "
INSERT INTO _hatidata_cot (agent_id, session_id, step_type, content)
VALUES ('analysis-agent', 'pipeline-run-001', 'observation', 'Received 10-K filing for ACME Corp from ingestion agent.');
"

# Verify MCP server is reachable
curl -s -X POST http://localhost:5440/mcp \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${HATIDATA_API_KEY}" \
  -d '{"jsonrpc": "2.0", "method": "tools/list", "id": 1}' | python3 -m json.tool

If all queries succeed and the MCP server returns a list of tools, the multi-agent pipeline is ready. See the Multi-Agent Memory Tutorial for the full Python implementation.
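The same MCP probe can be issued from Python with the standard library only. The sketch below mirrors the JSON-RPC body and headers of the curl command; the actual urlopen call is commented out because it requires the stack to be running:

```python
import json
import os
import urllib.request

# Same JSON-RPC tools/list probe as the curl command above.
payload = {"jsonrpc": "2.0", "method": "tools/list", "id": 1}
request = urllib.request.Request(
    "http://localhost:5440/mcp",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {os.environ.get('HATIDATA_API_KEY', '')}",
    },
    method="POST",
)

# Uncomment once the recipe is up:
# with urllib.request.urlopen(request) as resp:
#     print(json.dumps(json.load(resp), indent=2))
```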


Connecting from Python

All three recipes expose the same Postgres wire protocol on port 5439. Connect from Python using the HatiData SDK:

import os
from hatidata import HatiDataClient

client = HatiDataClient(
    host="localhost",
    port=5439,
    api_key=os.environ["HATIDATA_API_KEY"],  # Your real API key
)

# Test the connection
rows = client.query("SELECT 1 AS connected")
print(f"Connected: {rows[0]['connected']}")

Or with any Postgres driver (psycopg2, asyncpg, SQLAlchemy):

import os
import psycopg2

conn = psycopg2.connect(
    host="localhost",
    port=5439,
    user="admin",
    password=os.environ["HATIDATA_API_KEY"],  # API key as password
    dbname="hatidata",
)
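For URL-based drivers such as SQLAlchemy or asyncpg, the same connection details collapse into a single libpq-style DSN. A sketch -- URL-encoding the key guards against any special characters in the password slot, and the driver prefix may need adjusting for your stack:

```python
import os
from urllib.parse import quote

# The API key rides in the password slot, URL-encoded to be safe.
api_key = os.environ.get("HATIDATA_API_KEY", "hd_test_example")
dsn = f"postgresql://admin:{quote(api_key, safe='')}@localhost:5439/hatidata"
print(dsn)

# e.g. SQLAlchemy: create_engine(dsn)
#      asyncpg:    await asyncpg.connect(dsn)
```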

Cleanup

Stop any recipe's containers, and optionally remove its volumes:

# Stop containers
docker compose down

# Stop and remove volumes (deletes all data)
docker compose down -v
