# Deployment Modes
HatiData supports six deployment modes, from a single-command local setup for development to Kubernetes and Terraform-managed enterprise deployments in your own VPC. The data plane (proxy) always runs in your infrastructure; the control plane is managed by HatiData Cloud.
| Mode | Use Case | Time to Start |
|---|---|---|
| Local | Development, testing, CI | 30 seconds |
| Docker | Containerized dev / staging | 2 minutes |
| Docker Compose | Local multi-service stack | 2 minutes |
| Cloud ($29/mo) | Managed proxy, zero ops | 5 minutes |
| Kubernetes | Production, multi-tenant | 30 minutes |
| Enterprise / Terraform | Custom VPC, air-gapped, multi-cloud | Custom |
## Local Mode
Local mode runs entirely on your machine with no cloud dependencies. The HatiData engine handles storage locally; no S3, no cloud credentials, no network required.
### Quick Start

```bash
# Install the CLI
pip install hatidata-cli

# Initialize a local project
hati init my-project
cd my-project

# Start the local proxy (port 5439)
hati dev
```
`hati init` creates a `hatidata.toml` configuration file and a local data directory.
### Connect

```bash
psql -h localhost -p 5439 -U admin -d hatidata
```
### Load Data

```bash
# Push a local Parquet or CSV file into HatiData
hati push data/orders.parquet --table orders

# Push an entire directory
hati push data/ --format parquet

# Pull a table back to a local file
hati pull orders --output orders_export.parquet
```
### hatidata.toml

```toml
[proxy]
host = "0.0.0.0"
port = 5439

[storage]
provider = "local"
data_dir = ".hatidata/data"

[features]
ai_heal = true
cost_estimation = false
audit_log = true
```
In local mode, authentication uses a local JWT secret configured in `hatidata.toml` or via the `HATIDATA_JWT_SECRET` environment variable.
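For local testing you can mint a token against that secret yourself. The following is a stdlib-only sketch of standard HS256 JWT signing and verification; the claim names (`sub`, `iat`) are assumptions, since the docs do not specify which claims the proxy expects:

```python
import base64, hashlib, hmac, json, time

def _b64url(data: bytes) -> str:
    # JWTs use unpadded base64url encoding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def mint_jwt(secret: str, claims: dict) -> str:
    """Sign an HS256 JWT with the shared local secret (illustrative)."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(sig)}"

def verify_jwt(secret: str, token: str) -> dict:
    """Recompute the signature and return the claims if it matches."""
    header, payload, sig = token.split(".")
    signing_input = f"{header}.{payload}".encode()
    expected = _b64url(hmac.new(secret.encode(), signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

token = mint_jwt("my-dev-secret", {"sub": "admin", "iat": int(time.time())})
claims = verify_jwt("my-dev-secret", token)
```

In practice a library such as PyJWT does the same work; the point is that any client holding the shared secret can produce a valid local-mode token.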
## Docker
Run the HatiData proxy as a standalone Docker container:
```bash
docker run -d \
  --name hatidata-proxy \
  -p 5439:5439 \
  -e HATIDATA_JWT_SECRET=my-dev-secret \
  -e HATIDATA_STORAGE_PROVIDER=local \
  -v "$(pwd)/data:/data" \
  ghcr.io/hatios-ai/hatidata-proxy:latest
```
Connect:

```bash
psql -h localhost -p 5439 -U admin -d hatidata
```
### Docker Environment Variables

| Variable | Default | Description |
|---|---|---|
| `HATIDATA_JWT_SECRET` | — | JWT signing secret (required) |
| `HATIDATA_STORAGE_PROVIDER` | `local` | Storage backend: `local`, `s3`, `gcs`, `azure` |
| `HATIDATA_STORAGE_BUCKET` | — | Bucket name (cloud storage) |
| `HATIDATA_CLOUD_REGION` | `us-east-1` | Cloud provider region |
| `HATIDATA_CONTROL_PLANE_URL` | — | Control plane endpoint (managed cloud) |
| `HATIDATA_API_KEY` | — | API key for control plane authentication |
| `HATIDATA_KMS_PROVIDER` | `local` | Key management: `local`, `aws`, `gcp`, `azure` |
| `HATIDATA_AI_HEAL` | `true` | Auto-correct failed queries |
| `HATIDATA_COST_ESTIMATION_ENABLED` | `false` | Send cost estimates before execution |
| `HATIDATA_QUERY_TIMEOUT_SECS` | `300` | Maximum query execution time (seconds) |
| `HATIDATA_LOG_LEVEL` | `info` | Log level: `trace`, `debug`, `info`, `warn`, `error` |
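The table's defaults can be mirrored in a small resolver. A hedged sketch (not the proxy's actual startup code) showing the required/optional split and type coercion:

```python
import os

def env_config(environ=os.environ) -> dict:
    """Resolve the documented variables with their documented defaults."""
    def get(name, default=None):
        return environ.get(name, default)

    secret = get("HATIDATA_JWT_SECRET")
    if secret is None:
        # The only variable with no default: fail fast if it is missing.
        raise RuntimeError("HATIDATA_JWT_SECRET is required")
    return {
        "jwt_secret": secret,
        "storage_provider": get("HATIDATA_STORAGE_PROVIDER", "local"),
        "region": get("HATIDATA_CLOUD_REGION", "us-east-1"),
        "ai_heal": get("HATIDATA_AI_HEAL", "true").lower() == "true",
        "query_timeout_secs": int(get("HATIDATA_QUERY_TIMEOUT_SECS", "300")),
        "log_level": get("HATIDATA_LOG_LEVEL", "info"),
    }

cfg = env_config({"HATIDATA_JWT_SECRET": "my-dev-secret",
                  "HATIDATA_AI_HEAL": "false"})
```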
## Docker Compose

Docker Compose is the recommended setup for local multi-service development. It runs the proxy, control plane, dashboard, and supporting services together.
```bash
# Clone the dev stack
git clone https://github.com/HatiOS-AI/HatiData
cd HatiData/dev

# Start all services
make up

# Stop all services
make down

# Run a smoke test
make test-query
```
### docker-compose.yml (excerpt)

```yaml
services:
  proxy:
    image: ghcr.io/hatios-ai/hatidata-proxy:latest
    ports:
      - "5439:5439"
    environment:
      HATIDATA_JWT_SECRET: local-dev-secret
      HATIDATA_STORAGE_PROVIDER: local
      HATIDATA_CONTROL_PLANE_URL: http://control-plane:8080
    depends_on:
      - control-plane

  control-plane:
    image: ghcr.io/hatios-ai/hatidata-control-plane:latest
    ports:
      - "8080:8080"
    environment:
      DATABASE_URL: postgres://hatidata:hatidata@postgres:5432/hatidata
    depends_on:
      - postgres

  dashboard:
    image: ghcr.io/hatios-ai/hatidata-dashboard:latest
    ports:
      - "3000:3000"

  postgres:
    image: postgres:15
    environment:
      POSTGRES_DB: hatidata
      POSTGRES_USER: hatidata
      POSTGRES_PASSWORD: hatidata

  object-storage:
    image: minio/minio
    ports:
      - "9000:9000"
      - "9001:9001"
    command: server /data --console-address ":9001"

  vector-db:
    image: qdrant/qdrant
    ports:
      - "6333:6333"
      - "6334:6334"

  embedding:
    image: ghcr.io/hatios-ai/embedding-sidecar:latest
    ports:
      - "8090:8090"
```
### Service Ports
| Service | Port | Purpose |
|---|---|---|
| Proxy | 5439 | PostgreSQL wire protocol (connect your tools here) |
| MCP Server | 5440 | Model Context Protocol (HTTP/SSE, 24 agent tools) |
| Control Plane | 8080 | REST API (agent management, billing, policies) |
| Dashboard | 3000 | Web UI |
| Prometheus Metrics | 9090 | Proxy metrics exporter |
| PostgreSQL | 5432 | Control plane metadata |
| Object Storage | 9000 | Local S3-compatible storage (dev) |
| Vector Database | 6333 / 6334 | Semantic search engine (HTTP / gRPC) |
| Embedding Service | 8090 | Local embedding model server |
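To check which of these services are actually listening, a quick TCP probe is enough. Below is a small sketch using Python's `socket` module; the service-to-port mapping simply restates the table above, and the script itself is not part of the HatiData tooling:

```python
import socket

def port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Ports from the Service Ports table above.
SERVICES = {
    "proxy": 5439,
    "mcp-server": 5440,
    "control-plane": 8080,
    "dashboard": 3000,
    "metrics": 9090,
}

def check_stack(host: str = "localhost") -> dict:
    """Probe each documented port and report which services respond."""
    return {name: port_open(host, port) for name, port in SERVICES.items()}
```

Running `check_stack()` after `make up` gives a quick yes/no per service without needing `psql` or a browser.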
## Cloud Mode
HatiData Cloud provisions a managed proxy in your chosen region. You push data from the CLI; HatiData manages the proxy infrastructure, upgrades, and backups.
### Sign Up
Visit app.hatidata.com to create an account. The Free tier includes 1 GB storage and 1,000 queries/month. Paid plans start at $29/month.
### Push Data

```bash
# Authenticate
hati login

# Push a local table to your cloud data layer
hati push data/orders.parquet --table orders

# Pull a table to a local file
hati pull orders --output orders_local.parquet
```
### Cloud Connection

After provisioning, your proxy endpoint is available in the dashboard:

```bash
psql "host=your-org.proxy.hatidata.com port=5439 dbname=hatidata user=admin sslmode=require"
```

Use your API key (`hd_live_*`) as the password.
### Cloud Environment Variables

For programmatic access:

```bash
export HATIDATA_HOST=your-org.proxy.hatidata.com
export HATIDATA_PORT=5439
export HATIDATA_API_KEY=hd_live_your_api_key
export HATIDATA_DATABASE=hatidata
```
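These variables map directly onto a libpq-style connection string. The sketch below assembles one, assuming (per the connection example above) that the user is `admin` and the API key is passed as the password; `dsn_from_env` is a hypothetical helper, not part of the HatiData SDK:

```python
import os

def dsn_from_env(environ=os.environ) -> str:
    """Build a libpq connection string from the HATIDATA_* variables.

    Assumptions: user is 'admin' and the API key is the password,
    matching the psql example in the Cloud Connection section.
    """
    host = environ.get("HATIDATA_HOST", "localhost")
    port = environ.get("HATIDATA_PORT", "5439")
    db = environ.get("HATIDATA_DATABASE", "hatidata")
    key = environ.get("HATIDATA_API_KEY", "")
    return (
        f"host={host} port={port} dbname={db} "
        f"user=admin password={key} sslmode=require"
    )

dsn = dsn_from_env({
    "HATIDATA_HOST": "your-org.proxy.hatidata.com",
    "HATIDATA_API_KEY": "hd_live_example",
})
```

The resulting string can be passed to `psql` or any PostgreSQL client library that accepts a DSN.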
## Kubernetes (Helm)
For production workloads, deploy HatiData into your Kubernetes cluster using the official Helm chart.
### Prerequisites

- Kubernetes >= 1.27
- Helm >= 3.12
- `kubectl` configured for your cluster
### Install

```bash
# Add the HatiData Helm repository
helm repo add hatidata https://charts.hatidata.com
helm repo update

# Install with default values
helm install hatidata hatidata/hatidata \
  --namespace hatidata \
  --create-namespace

# Install with custom values
helm install hatidata hatidata/hatidata \
  --namespace hatidata \
  --create-namespace \
  --values values.yaml
```
### values.yaml

```yaml
proxy:
  replicaCount: 3
  image:
    repository: ghcr.io/hatios-ai/hatidata-proxy
    tag: latest
  resources:
    requests:
      cpu: "500m"
      memory: "1Gi"
    limits:
      cpu: "2"
      memory: "4Gi"
  service:
    type: LoadBalancer
    port: 5439

storage:
  provider: s3  # s3 | gcs | azure
  bucket: my-hatidata-bucket
  region: us-east-1

auth:
  controlPlaneUrl: https://api.hatidata.com
  apiKey:
    secretName: hatidata-api-key
    secretKey: api-key

tls:
  enabled: true
  certManager: true  # Use cert-manager for TLS certificates

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
```
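It is worth sanity-checking the cluster headroom these values imply before installing. A back-of-envelope sketch follows; the quantity parsers are deliberately simplified (only `m`, `Gi`, and `Mi` suffixes) and are not a full Kubernetes quantity parser:

```python
def cpu_to_cores(q: str) -> float:
    """Parse a Kubernetes CPU quantity ('500m' or '2') into cores."""
    return float(q[:-1]) / 1000 if q.endswith("m") else float(q)

def mem_to_gib(q: str) -> float:
    """Parse a memory quantity; this sketch handles only Gi and Mi."""
    if q.endswith("Gi"):
        return float(q[:-2])
    if q.endswith("Mi"):
        return float(q[:-2]) / 1024
    raise ValueError(f"unsupported quantity: {q}")

# Worst case: autoscaled to maxReplicas, each pod at its limits.
max_replicas = 10
peak_cpu = max_replicas * cpu_to_cores("2")   # 20.0 cores
peak_mem = max_replicas * mem_to_gib("4Gi")   # 40.0 GiB

# Guaranteed (requested) footprint at the default replicaCount of 3.
base_cpu = 3 * cpu_to_cores("500m")           # 1.5 cores
base_mem = 3 * mem_to_gib("1Gi")              # 3.0 GiB
```

With these defaults the scheduler only needs 1.5 cores and 3 GiB reserved up front, but the cluster must be able to absorb up to 20 cores and 40 GiB if the HPA scales the proxy to its maximum.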
### Verify Deployment

```bash
kubectl get pods -n hatidata
kubectl get svc -n hatidata

# Test connection through the LoadBalancer
psql "host=$(kubectl get svc -n hatidata hatidata-proxy -o jsonpath='{.status.loadBalancer.ingress[0].hostname}') port=5439 dbname=hatidata user=admin sslmode=require"
```
## Enterprise Terraform
For enterprise deployments with custom VPC requirements, air-gapped environments, or multi-cloud configurations, HatiData provides Terraform modules for AWS, GCP, and Azure.
### AWS

```bash
cd terraform/aws
terraform init
terraform plan -var-file=environments/production.tfvars
terraform apply
```

`environments/production.tfvars`:

```hcl
environment        = "production"
region             = "us-east-1"
vpc_id             = "vpc-0abc123def456"
private_subnet_ids = ["subnet-0abc", "subnet-0def"]
instance_type      = "r6i.xlarge"
min_capacity       = 2
max_capacity       = 10
storage_bucket     = "my-company-hatidata-prod"
```
### GCP

```bash
cd terraform/gcp
terraform init
terraform plan -var-file=environments/production.tfvars
terraform apply
```

`environments/production.tfvars`:

```hcl
project_id     = "my-gcp-project"
region         = "us-central1"
zone           = "us-central1-a"
machine_type   = "n2-highmem-4"
min_replicas   = 2
max_replicas   = 10
storage_bucket = "my-company-hatidata-prod"
```
### Azure

```bash
cd terraform/azure
terraform init
terraform plan -var-file=environments/production.tfvars
terraform apply
```
### Terraform Module Structure

| Module | Purpose |
|---|---|
| `terraform/aws/` | AWS deployment (EKS, RDS, S3, PrivateLink, KMS) |
| `terraform/gcp/` | GCP deployment (GKE, Cloud Run, Cloud SQL, GCS, Secret Manager) |
| `terraform/azure/` | Azure deployment (AKS, Azure Blob, Key Vault, Managed Identity) |
| `terraform/shared/` | Shared modules (DNS, TLS, monitoring) |
Contact enterprise@hatidata.com for air-gapped deployment guides, PrivateLink setup, and custom SLA options.
## Related Concepts
- Postgres Drivers & BI Tools — Connect clients and BI tools to any deployment
- dbt Adapter — Run dbt models against your data layer
- MCP Setup — Connect AI agents via the MCP server
- Python SDK — Agent-aware queries with automatic attribution