What is ANDI?

Agent-Native Data Infrastructure (ANDI) is the data layer purpose-built for AI agents. It provides the five capabilities every production agent needs — and that no combination of existing tools delivers in a single, integrated system:

  1. Identity-aware SQL — Agents authenticate, execute queries, and receive results through a single Postgres-compatible connection. Every query carries the agent's identity for billing, audit, and policy enforcement.

  2. Persistent memory — Agents store and retrieve long-term context using a hybrid SQL + vector search system. Memories survive restarts, span sessions, and can be shared between agents.

  3. Verifiable reasoning — Every reasoning step is recorded in an immutable, cryptographically hash-chained ledger. Decisions can be replayed, verified, and audited months later.

  4. Reactive governance — Semantic triggers fire when agent activity matches concept-based rules. Actions range from webhook notifications to human-in-the-loop review gates.

  5. Safe exploration — Agents create isolated branches of their data environment, run speculative queries, and merge results back — or discard them. Copy-on-write isolation means production data is never at risk.

The ANDI Stack Position

ANDI sits between AI agents and their data sources. It is not a database, not an LLM framework, and not a vector store — it is the infrastructure layer that connects all three:

┌─────────────────────────────────────────────────────┐
│ AI Agents (LangChain, CrewAI, AutoGen, custom) │
├─────────────────────────────────────────────────────┤
│ ANDI Layer (HatiData) │
│ Identity · Memory · Reasoning · Triggers · Branches │
│ SQL Transpilation · Policy Engine · Audit Chain │
├─────────────────────────────────────────────────────┤
│ Execution Engine │
├─────────────────────────────────────────────────────┤
│ Storage (Local files / S3 / GCS / Azure Blob) │
└─────────────────────────────────────────────────────┘

Agents connect via the Postgres wire protocol. Existing SQL, BI tools, and dbt models work without changes. The ANDI layer adds agent-native capabilities on top — identity, memory, reasoning traces, semantic triggers, and branch isolation — without requiring agents to adopt a proprietary API.
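As a sketch, identity can ride along as ordinary Postgres startup parameters, so an off-the-shelf driver needs no changes. The parameter names (`agent_id`, `framework`) come from the Agent Identity section; the option encoding and the `psycopg2` call in the comment are illustrative assumptions, not HatiData's documented interface.

```python
# Illustrative sketch: passing agent identity as Postgres startup
# parameters. The "-c key=value" encoding is the standard libpq
# mechanism for per-session settings; the specific parameter names
# are assumptions here.
def build_startup_options(agent_id: str, framework: str) -> str:
    """Encode agent identity as libpq startup options."""
    return f"-c agent_id={agent_id} -c framework={framework}"

options = build_startup_options("billing-bot-7", "langchain")

# A stock driver would then connect unchanged, e.g.:
#   psycopg2.connect(host="localhost", dbname="hati", options=options)
```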

Why a New Category?

AI agents have different requirements than human analysts:

| Requirement       | Human Analyst        | AI Agent                           |
|-------------------|----------------------|------------------------------------|
| Query frequency   | Tens per day         | Hundreds per reasoning chain       |
| Latency tolerance | Seconds acceptable   | Sub-10ms required                  |
| Identity          | Username/password    | Per-agent keys with scopes         |
| Memory            | Notes, dashboards    | Persistent SQL + vector hybrid     |
| Audit             | Compliance reports   | Hash-chained reasoning traces      |
| Governance        | Manual review        | Semantic triggers + auto-escalation |
| Experimentation   | Staging environments | Branch isolation per hypothesis    |

Legacy cloud data warehouses were built for the left column. ANDI is built for the right column.

What ANDI Is Not

Not a database. HatiData includes an embedded columnar execution engine, but ANDI is the layer above: identity, memory, reasoning, triggers, branching, transpilation, and governance. You do not interact with the engine directly.

Not a vector store. HatiData includes built-in vector search for memory and triggers, but it is not a standalone vector database. Vector search is one capability within the broader ANDI stack.

Not an LLM framework. HatiData integrates with LangChain, CrewAI, AutoGen, and any MCP-compatible agent, but it does not orchestrate agents. Agents call HatiData — HatiData does not call agents.

Not an API gateway. HatiData does not proxy LLM API calls. It handles data — queries, memory, reasoning traces, triggers, and branches.

The Five Pillars

1. Agent Identity

Every agent connects with a unique identity (agent_id + framework) passed through Postgres startup parameters. This identity flows through the entire system:

  • Policy engine — ABAC rules match on agent identity (e.g., "LangChain agents cannot query PII tables after hours")
  • Row-level security — WHERE clauses use {agent_id} placeholders resolved at query time
  • Billing — Per-agent credit tracking and quota enforcement
  • Audit — Every query is attributed to a specific agent
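A minimal sketch of the `{agent_id}` placeholder resolution mentioned above: the identity carried on the connection is substituted into a row-level-security predicate at query time. The predicate shape and the resolver are illustrative, not the policy engine's actual implementation.

```python
# Illustrative sketch of query-time {agent_id} resolution in a
# row-level-security WHERE clause. Real systems would bind the value
# as a parameter rather than splice strings; this shows the data flow.
def resolve_rls(predicate: str, agent_id: str) -> str:
    """Substitute the calling agent's identity into an RLS predicate."""
    return predicate.replace("{agent_id}", f"'{agent_id}'")

clause = resolve_rls("owner_agent = {agent_id}", "support-agent-3")
```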

Learn more about the Agent Identity Model →

2. Persistent Memory

Agents store facts, preferences, episodes, and context in a hybrid SQL + vector system. Memories are searchable by natural language (vector similarity) and by structured metadata (SQL filters). The system includes:

  • 5 MCP tools — store_memory, search_memory, get_agent_state, set_agent_state, delete_memory
  • Async embedding — Memories are stored immediately; embeddings are generated in the background
  • Access tracking — Lock-free counters track retrieval frequency for importance scoring
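The hybrid retrieval described above can be sketched in a few lines: a structured filter narrows the candidates (the SQL side), then vector similarity ranks what remains. The memory records, field names, and two-dimensional vectors below are toy assumptions, not HatiData's schema.

```python
import math

# Toy sketch of hybrid SQL + vector retrieval: filter by structured
# metadata first, then rank the survivors by cosine similarity.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

memories = [
    {"text": "user prefers CSV exports", "kind": "preference", "vec": [1.0, 0.0]},
    {"text": "Q3 report was late",       "kind": "episode",    "vec": [0.0, 1.0]},
    {"text": "user timezone is UTC+2",   "kind": "preference", "vec": [0.9, 0.1]},
]

def search(query_vec, kind):
    filtered = [m for m in memories if m["kind"] == kind]   # SQL-style filter
    return sorted(filtered,                                  # vector ranking
                  key=lambda m: cosine(query_vec, m["vec"]), reverse=True)

top = search([1.0, 0.0], "preference")[0]["text"]
```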

Learn more about Persistent Memory →

3. Chain-of-Thought Ledger

Every reasoning step — observations, hypotheses, tool calls, decisions, final answers — is recorded in an append-only ledger with SHA-256 hash chaining. Features include:

  • 12 step types covering the full agent reasoning lifecycle
  • Hash chain verification — any tampering breaks the chain and is immediately detectable
  • Embedding sampling — configurable fraction of steps are embedded for semantic search
  • 3 MCP tools — log_reasoning_step, replay_decision, get_session_history
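The hash-chain property is easy to illustrate: each entry's hash commits to its predecessor, so editing any earlier step invalidates every later hash. The step fields below are illustrative, not the ledger's actual schema; only the SHA-256 chaining technique is taken from the description above.

```python
import hashlib
import json

# Sketch of an append-only, SHA-256 hash-chained ledger.
def append_step(ledger, step):
    prev = ledger[-1]["hash"] if ledger else "0" * 64
    payload = json.dumps({"prev": prev, "step": step}, sort_keys=True)
    ledger.append({"step": step, "prev": prev,
                   "hash": hashlib.sha256(payload.encode()).hexdigest()})

def verify(ledger):
    """Recompute every hash; any tampering breaks the chain."""
    prev = "0" * 64
    for entry in ledger:
        payload = json.dumps({"prev": prev, "step": entry["step"]}, sort_keys=True)
        if entry["prev"] != prev or \
           entry["hash"] != hashlib.sha256(payload.encode()).hexdigest():
            return False
        prev = entry["hash"]
    return True

ledger = []
append_step(ledger, {"type": "observation", "text": "rows=42"})
append_step(ledger, {"type": "decision", "text": "escalate"})
ok_before = verify(ledger)

ledger[0]["step"]["text"] = "rows=0"   # tamper with an early step
ok_after = verify(ledger)
```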

Learn more about the Chain-of-Thought Ledger →

4. Semantic Triggers

Define concept-based rules that fire when agent activity matches a semantic pattern. The two-stage evaluation pipeline (ANN pre-filter + exact cosine verification) handles thousands of triggers with negligible latency:

  • 4 action types — Webhook (HMAC-signed), AgentNotify (offline inbox), WriteEvent, FlagForReview
  • Cooldown debounce — prevents alert storms from repeated similar queries
  • 4 MCP tools — register_trigger, list_triggers, delete_trigger, test_trigger
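The two-stage pipeline can be sketched as follows, with sign-quantized vectors standing in for the ANN pre-filter. The thresholds, the cooldown value, and the `Trigger` shape are illustrative assumptions; only the pre-filter → exact-cosine → cooldown structure comes from the description above.

```python
import math

# Sketch of two-stage trigger evaluation: cheap approximate pre-filter,
# then exact cosine verification, then a cooldown debounce.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def quantize(v):
    """Sign-quantize a vector — a toy stand-in for an ANN index."""
    return [1 if x > 0 else -1 for x in v]

class Trigger:
    def __init__(self, concept_vec, threshold=0.9, cooldown_s=60.0):
        self.vec = concept_vec
        self.threshold = threshold
        self.cooldown_s = cooldown_s
        self.last_fired = -math.inf

    def evaluate(self, activity_vec, now):
        # Stage 1: coarse match on quantized vectors (cheap).
        if cosine(quantize(self.vec), quantize(activity_vec)) < 0.5:
            return False
        # Stage 2: exact cosine against the full-precision concept.
        if cosine(self.vec, activity_vec) < self.threshold:
            return False
        # Cooldown debounce: suppress alert storms.
        if now - self.last_fired < self.cooldown_s:
            return False
        self.last_fired = now
        return True

t = Trigger([0.8, 0.6])
fired_first = t.evaluate([0.81, 0.59], now=0.0)
fired_again = t.evaluate([0.81, 0.59], now=1.0)   # inside the cooldown window
```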

Learn more about Semantic Triggers →

5. Branch Isolation

Agents create isolated copies of their data environment using schema-level isolation. Branches start as zero-copy views (instant creation regardless of data size) and materialize tables on first write (copy-on-write):

  • 3 merge strategies — ours (branch wins), theirs (main wins), fail_on_conflict (abort on conflict)
  • Automatic garbage collection — TTL expiry and reference counting
  • 5 MCP tools — branch_create, branch_query, branch_merge, branch_discard, branch_list
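Copy-on-write branch semantics, in miniature: reads fall through to main until a table is written, writes materialize only on the branch, and the three merge strategies decide whose version survives. The dict-of-tables model and `MergeConflict` error are illustrative, not HatiData's implementation.

```python
import copy

class MergeConflict(Exception):
    pass

# Toy copy-on-write branch over a dict of tables.
class Branch:
    def __init__(self, main):
        self.main = main
        self.base = copy.deepcopy(main)   # snapshot at branch creation
        self.overlay = {}                 # only tables written on the branch

    def read(self, table):
        # Zero-copy view: unwritten tables fall through to main.
        return self.overlay.get(table, self.main[table])

    def write(self, table, rows):
        self.overlay[table] = rows        # materialize on first write

    def merge(self, strategy="fail_on_conflict"):
        for table, rows in self.overlay.items():
            if strategy == "theirs":
                continue                  # main wins; branch change dropped
            if strategy == "fail_on_conflict" and \
               self.main[table] != self.base[table]:
                raise MergeConflict(table)  # main moved since the branch was cut
            self.main[table] = rows       # "ours", or no conflict detected

main = {"orders": [1, 2, 3]}
branch = Branch(main)
branch.write("orders", [1, 2, 3, 4])      # speculative change; main untouched
untouched = main["orders"] == [1, 2, 3]
branch.merge(strategy="ours")
```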

Learn more about Branch Isolation →

Getting Started

The fastest path from zero to a working ANDI deployment:

  1. Quickstart — Install and run your first query in under 5 minutes
  2. Architecture in 60 Seconds — Understand the two-plane design
  3. When to Use HatiData — Decision guide for your use case
