# What is ANDI?
Agent-Native Data Infrastructure (ANDI) is the data layer purpose-built for AI agents. It provides the five capabilities every production agent needs — and that no combination of existing tools delivers in a single, integrated system:
- Identity-aware SQL — Agents authenticate, execute queries, and receive results through a single Postgres-compatible connection. Every query carries the agent's identity for billing, audit, and policy enforcement.
- Persistent memory — Agents store and retrieve long-term context using a hybrid SQL + vector search system. Memories survive restarts, span sessions, and can be shared between agents.
- Verifiable reasoning — Every reasoning step is recorded in an immutable, cryptographically hash-chained ledger. Decisions can be replayed, verified, and audited months later.
- Reactive governance — Semantic triggers fire when agent activity matches concept-based rules. Actions range from webhook notifications to human-in-the-loop review gates.
- Safe exploration — Agents create isolated branches of their data environment, run speculative queries, and merge results back — or discard them. Copy-on-write isolation means production data is never at risk.
## The ANDI Stack Position
ANDI sits between AI agents and their data sources. It is not a database, not an LLM framework, and not a vector store — it is the infrastructure layer that connects all three:
```
┌─────────────────────────────────────────────────────┐
│ AI Agents (LangChain, CrewAI, AutoGen, custom)      │
├─────────────────────────────────────────────────────┤
│ ANDI Layer (HatiData)                               │
│ Identity · Memory · Reasoning · Triggers · Branches │
│ SQL Transpilation · Policy Engine · Audit Chain     │
├─────────────────────────────────────────────────────┤
│ Execution Engine                                    │
├─────────────────────────────────────────────────────┤
│ Storage (Local files / S3 / GCS / Azure Blob)       │
└─────────────────────────────────────────────────────┘
```
Agents connect via the Postgres wire protocol. Existing SQL, BI tools, and dbt models work without changes. The ANDI layer adds agent-native capabilities on top — identity, memory, reasoning traces, semantic triggers, and branch isolation — without requiring agents to adopt a proprietary API.
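Because the connection is plain Postgres wire protocol, an agent's identity can ride along in the startup packet rather than a proprietary header. The sketch below assembles a libpq-style connection string; the startup parameter names (`agent_id`, `framework`) follow the identity model described below, while the host, port, and database values are placeholders, not documented defaults.

```python
# Sketch: a Postgres connection string whose startup parameters carry
# agent identity. Arbitrary startup parameters can be passed through the
# libpq `options` field as `-c key=value` pairs.

def identity_dsn(agent_id: str, framework: str,
                 host: str = "localhost", port: int = 5432,
                 dbname: str = "hatidata") -> str:
    options = f"-c agent_id={agent_id} -c framework={framework}"
    return (f"host={host} port={port} dbname={dbname} "
            f"options='{options}'")

dsn = identity_dsn("billing-agent-7", "langchain")
print(dsn)
```

Any Postgres client library that accepts a DSN (psycopg, JDBC, etc.) could then open this connection unchanged, which is what lets existing SQL tooling work without modification.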
## Why a New Category?
AI agents have different requirements than human analysts:
| Requirement | Human Analyst | AI Agent |
|---|---|---|
| Query frequency | Tens per day | Hundreds per reasoning chain |
| Latency tolerance | Seconds acceptable | Sub-10ms required |
| Identity | Username/password | Per-agent keys with scopes |
| Memory | Notes, dashboards | Persistent SQL + vector hybrid |
| Audit | Compliance reports | Hash-chained reasoning traces |
| Governance | Manual review | Semantic triggers + auto-escalation |
| Experimentation | Staging environments | Branch isolation per hypothesis |
Legacy cloud data warehouses were built for the left column. ANDI is built for the right column.
## What ANDI Is Not
**Not a database.** HatiData includes an embedded columnar execution engine, but ANDI is the layer above: identity, memory, reasoning, triggers, branching, transpilation, and governance. You do not interact with the engine directly.

**Not a vector store.** HatiData includes built-in vector search for memory and triggers, but it is not a standalone vector database. Vector search is one capability within the broader ANDI stack.

**Not an LLM framework.** HatiData integrates with LangChain, CrewAI, AutoGen, and any MCP-compatible agent, but it does not orchestrate agents. Agents call HatiData — HatiData does not call agents.

**Not an API gateway.** HatiData does not proxy LLM API calls. It handles data — queries, memory, reasoning traces, triggers, and branches.
## The Five Pillars
### 1. Agent Identity
Every agent connects with a unique identity (`agent_id` + `framework`) passed through Postgres startup parameters. This identity flows through the entire system:
- Policy engine — ABAC rules match on agent identity (e.g., "LangChain agents cannot query PII tables after hours")
- Row-level security — WHERE clauses use `{agent_id}` placeholders resolved at query time
- Billing — Per-agent credit tracking and quota enforcement
- Audit — Every query is attributed to a specific agent
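To make the row-level-security point concrete, here is a minimal sketch of resolving the `{agent_id}` placeholder in a policy predicate at query time. The predicate text and column name are illustrative, not HatiData's actual policy syntax; the placeholder name comes from the docs above.

```python
# Sketch: substituting the agent's identity into an RLS predicate.
# The policy text below is illustrative.

policy_predicate = "owner_agent = '{agent_id}'"

def resolve_predicate(predicate: str, agent_id: str) -> str:
    # Quote-escape the identity before substitution so a hostile
    # agent_id cannot break out of the SQL string literal.
    safe_id = agent_id.replace("'", "''")
    return predicate.replace("{agent_id}", safe_id)

where_clause = resolve_predicate(policy_predicate, "research-agent-3")
print(where_clause)  # owner_agent = 'research-agent-3'
```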
Learn more about the Agent Identity Model →
### 2. Persistent Memory
Agents store facts, preferences, episodes, and context in a hybrid SQL + vector system. Memories are searchable by natural language (vector similarity) and by structured metadata (SQL filters). The system includes:
- 5 MCP tools — `store_memory`, `search_memory`, `get_agent_state`, `set_agent_state`, `delete_memory`
- Async embedding — Memories are stored immediately; embeddings are generated in the background
- Access tracking — Lock-free counters track retrieval frequency for importance scoring
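The hybrid retrieval idea (structured filters plus vector ranking) can be sketched in a few lines. Everything below is a toy model: `embed()` is a stand-in bag-of-letters embedding, not a real model, and the function names merely echo the MCP tool names above.

```python
import math

def embed(text):
    # Toy bag-of-letters embedding; a real system uses a learned model.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - 97] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

memories = []

def store_memory(text, kind):
    memories.append({"text": text, "kind": kind, "vec": embed(text)})

def search_memory(query, kind=None, top_k=1):
    qv = embed(query)
    # SQL-style metadata filter, then vector-similarity ranking.
    pool = [m for m in memories if kind is None or m["kind"] == kind]
    pool.sort(key=lambda m: cosine(qv, m["vec"]), reverse=True)
    return [m["text"] for m in pool[:top_k]]

store_memory("user prefers CSV exports", "preference")
store_memory("quarterly revenue pipeline failed on 2024-03-01", "episode")
print(search_memory("export format preference", kind="preference"))
```

The design point this illustrates: the metadata filter prunes cheaply before the (relatively expensive) similarity ranking runs, which is why memories remain queryable by both natural language and structured attributes.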
Learn more about Persistent Memory →
### 3. Chain-of-Thought Ledger
Every reasoning step — observations, hypotheses, tool calls, decisions, final answers — is recorded in an append-only ledger with SHA-256 hash chaining. Features include:
- 12 step types covering the full agent reasoning lifecycle
- Hash chain verification — any tampering breaks the chain and is immediately detectable
- Embedding sampling — configurable fraction of steps are embedded for semantic search
- 3 MCP tools — `log_reasoning_step`, `replay_decision`, `get_session_history`
Learn more about the Chain-of-Thought Ledger →
### 4. Semantic Triggers
Define concept-based rules that fire when agent activity matches a semantic pattern. The two-stage evaluation pipeline (ANN pre-filter + exact cosine verification) handles thousands of triggers with negligible latency:
- 4 action types — Webhook (HMAC-signed), AgentNotify (offline inbox), WriteEvent, FlagForReview
- Cooldown debounce — prevents alert storms from repeated similar queries
- 4 MCP tools — `register_trigger`, `list_triggers`, `delete_trigger`, `test_trigger`
Learn more about Semantic Triggers →
### 5. Branch Isolation
Agents create isolated copies of their data environment using schema-level isolation. Branches start as zero-copy views (instant creation regardless of data size) and materialize tables on first write (copy-on-write):
- 3 merge strategies — `ours` (branch wins), `theirs` (main wins), `fail_on_conflict` (abort on conflict)
- Automatic garbage collection — TTL expiry and reference counting
- 5 MCP tools — `branch_create`, `branch_query`, `branch_merge`, `branch_discard`, `branch_list`
Learn more about Branch Isolation →
## Getting Started
The fastest path from zero to a working ANDI deployment:
- Quickstart — Install and run your first query in under 5 minutes
- Architecture in 60 Seconds — Understand the two-plane design
- When to Use HatiData — Decision guide for your use case
## Next Steps
- Core Concepts — Deep dive into each ANDI pillar
- API Reference — Control plane REST API
- MCP Tools — All 24 agent tools
- Integrations — Connect your framework