# What is ANDI?
Agent-Native Data Infrastructure (ANDI) is the data layer purpose-built for AI agents. It provides the five capabilities every production agent needs — and that no combination of existing tools delivers in a single, integrated system:
- Identity-aware SQL — Agents authenticate, execute queries, and receive results through a single Postgres-compatible connection. Every query carries the agent's identity for billing, audit, and policy enforcement.
- Persistent memory — Agents store and retrieve long-term context using a hybrid SQL + vector search system. Memories survive restarts, span sessions, and can be shared between agents.
- Verifiable reasoning — Every reasoning step is recorded in an immutable, cryptographically hash-chained ledger. Decisions can be replayed, verified, and audited months later.
- Reactive governance — Semantic triggers fire when agent activity matches concept-based rules. Actions range from webhook notifications to human-in-the-loop review gates.
- Safe exploration — Agents create isolated branches of their data environment, run speculative queries, and merge results back — or discard them. Copy-on-write isolation means production data is never at risk.
## The ANDI Stack Position
ANDI sits between AI agents and their data sources. It is not a database, not an LLM framework, and not a vector store — it is the infrastructure layer that connects all three:
```
┌─────────────────────────────────────────────────────┐
│ AI Agents (LangChain, CrewAI, AutoGen, custom)      │
├─────────────────────────────────────────────────────┤
│ ANDI Layer (HatiData)                               │
│ Identity · Memory · Reasoning · Triggers · Branches │
│ SQL Transpilation · Policy Engine · Audit Chain     │
├─────────────────────────────────────────────────────┤
│ Execution Engine                                    │
├─────────────────────────────────────────────────────┤
│ Storage (Local files / S3 / GCS / Azure Blob)       │
└─────────────────────────────────────────────────────┘
```
Agents connect via the Postgres wire protocol. Existing SQL, BI tools, and dbt models work without changes. The ANDI layer adds agent-native capabilities on top — identity, memory, reasoning traces, semantic triggers, and branch isolation — without requiring agents to adopt a proprietary API.
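Because the connection is plain Postgres wire protocol, an agent's identity can ride along in the startup packet rather than a proprietary header. The sketch below assembles a libpq-style connection string; the startup parameter names (`agent_id`, `framework`) follow the identity model described below, while the host, port, and database values are placeholders, not documented defaults.

```python
# Sketch: a Postgres connection string whose startup parameters carry
# agent identity. Arbitrary startup parameters can be passed through the
# libpq `options` field as `-c key=value` pairs.

def identity_dsn(agent_id: str, framework: str,
                 host: str = "localhost", port: int = 5432,
                 dbname: str = "hatidata") -> str:
    options = f"-c agent_id={agent_id} -c framework={framework}"
    return (f"host={host} port={port} dbname={dbname} "
            f"options='{options}'")

dsn = identity_dsn("billing-agent-7", "langchain")
print(dsn)
```

Any Postgres client library that accepts a DSN (psycopg, JDBC, etc.) could then open this connection unchanged, which is what lets existing SQL tooling work without modification.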
## Why a New Category?
AI agents have different requirements than human analysts:
| Requirement | Human Analyst | AI Agent |
|---|---|---|
| Query frequency | Tens per day | Hundreds per reasoning chain |
| Latency tolerance | Seconds acceptable | Sub-10ms required |
| Identity | Username/password | Per-agent keys with scopes |
| Memory | Notes, dashboards | Persistent SQL + vector hybrid |
| Audit | Compliance reports | Hash-chained reasoning traces |
| Governance | Manual review | Semantic triggers + auto-escalation |
| Experimentation | Staging environments | Branch isolation per hypothesis |
Legacy cloud data warehouses were built for the left column. ANDI is built for the right column.
## What ANDI Is Not
**Not a database.** HatiData includes an embedded columnar execution engine, but ANDI is the layer above: identity, memory, reasoning, triggers, branching, transpilation, and governance. You do not interact with the engine directly.

**Not a vector store.** HatiData includes built-in vector search for memory and triggers, but it is not a standalone vector database. Vector search is one capability within the broader ANDI stack.

**Not an LLM framework.** HatiData integrates with LangChain, CrewAI, AutoGen, and any MCP-compatible agent, but it does not orchestrate agents. Agents call HatiData — HatiData does not call agents.

**Not an API gateway.** HatiData does not proxy LLM API calls. It handles data — queries, memory, reasoning traces, triggers, and branches.
## The Five Pillars
### 1. Agent Identity
Every agent connects with a unique identity (`agent_id` + `framework`) passed through Postgres startup parameters. This identity flows through the entire system:
- Policy engine — ABAC rules match on agent identity (e.g., "LangChain agents cannot query PII tables after hours")
- Row-level security — WHERE clauses use `{agent_id}` placeholders resolved at query time
- Billing — Per-agent credit tracking and quota enforcement
- Audit — Every query is attributed to a specific agent
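To make the row-level-security point concrete, here is a minimal sketch of resolving the `{agent_id}` placeholder in a policy predicate at query time. The predicate text and column name are illustrative, not HatiData's actual policy syntax; the placeholder name comes from the docs above.

```python
# Sketch: substituting the agent's identity into an RLS predicate.
# The policy text below is illustrative.

policy_predicate = "owner_agent = '{agent_id}'"

def resolve_predicate(predicate: str, agent_id: str) -> str:
    # Quote-escape the identity before substitution so a hostile
    # agent_id cannot break out of the SQL string literal.
    safe_id = agent_id.replace("'", "''")
    return predicate.replace("{agent_id}", safe_id)

where_clause = resolve_predicate(policy_predicate, "research-agent-3")
print(where_clause)  # owner_agent = 'research-agent-3'
```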
Learn more about the Agent Identity Model →
### 2. Persistent Memory
Agents store facts, preferences, episodes, and context in a hybrid SQL + vector system. Memories are searchable by natural language (vector similarity) and by structured metadata (SQL filters). The system includes:
- 5 MCP tools — `store_memory`, `search_memory`, `get_agent_state`, `set_agent_state`, `delete_memory`
- Async embedding — Memories are stored immediately; embeddings are generated in the background
- Access tracking — Lock-free counters track retrieval frequency for importance scoring
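The hybrid retrieval idea (structured filters plus vector ranking) can be sketched in a few lines. Everything below is a toy model: `embed()` is a stand-in bag-of-letters embedding, not a real model, and the function names merely echo the MCP tool names above.

```python
import math

def embed(text):
    # Toy bag-of-letters embedding; a real system uses a learned model.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - 97] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

memories = []

def store_memory(text, kind):
    memories.append({"text": text, "kind": kind, "vec": embed(text)})

def search_memory(query, kind=None, top_k=1):
    qv = embed(query)
    # SQL-style metadata filter, then vector-similarity ranking.
    pool = [m for m in memories if kind is None or m["kind"] == kind]
    pool.sort(key=lambda m: cosine(qv, m["vec"]), reverse=True)
    return [m["text"] for m in pool[:top_k]]

store_memory("user prefers CSV exports", "preference")
store_memory("quarterly revenue pipeline failed on 2024-03-01", "episode")
print(search_memory("export format preference", kind="preference"))
```

The design point this illustrates: the metadata filter prunes cheaply before the (relatively expensive) similarity ranking runs, which is why memories remain queryable by both natural language and structured attributes.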
Learn more about Persistent Memory →
### 3. Chain-of-Thought Ledger
Every reasoning step — observations, hypotheses, tool calls, decisions, final answers — is recorded in an append-only ledger with SHA-256 hash chaining. Features include:
- 12 step types covering the full agent reasoning lifecycle
- Hash chain verification — any tampering breaks the chain and is immediately detectable
- Embedding sampling — configurable fraction of steps are embedded for semantic search
- 3 MCP tools — `log_reasoning_step`, `replay_decision`, `get_session_history`
Learn more about the Chain-of-Thought Ledger →
### 4. Semantic Triggers
Define concept-based rules that fire when agent activity matches a semantic pattern. The two-stage evaluation pipeline (ANN pre-filter + exact cosine verification) handles thousands of triggers with negligible latency:
- 4 action types — Webhook (HMAC-signed), AgentNotify (offline inbox), WriteEvent, FlagForReview
- Cooldown debounce — prevents alert storms from repeated similar queries
- 4 MCP tools — `register_trigger`, `list_triggers`, `delete_trigger`, `test_trigger`
Learn more about Semantic Triggers →
### 5. Branch Isolation
Agents create isolated copies of their data environment using schema-level isolation. Branches start as zero-copy views (instant creation regardless of data size) and materialize tables on first write (copy-on-write):
- 3 merge strategies — `ours` (branch wins), `theirs` (main wins), `fail_on_conflict` (abort on conflict)
- Automatic garbage collection — TTL expiry and reference counting
- 5 MCP tools — `branch_create`, `branch_query`, `branch_merge`, `branch_discard`, `branch_list`
Learn more about Branch Isolation →
## Getting Started
The fastest path from zero to a working ANDI deployment:
- Quickstart — Install and run your first query in under 5 minutes
- Architecture in 60 Seconds — Understand the two-plane design
- When to Use HatiData — Decision guide for your use case
## Next Steps
- Core Concepts — Deep dive into each ANDI pillar
- API Reference — Control plane REST API
- MCP Tools — All 24 agent tools
- Integrations — Connect your framework