
Query Pipeline

Every SQL statement that enters HatiData's data plane passes through a 22-stage pipeline before results are returned. Each stage has a single responsibility and a defined failure mode. Stages that modify the query (transpilation, row filtering, column masking) produce a new query representation — the original statement from the agent is never executed directly.

Agent SQL
             |
             v
+-- Pre-Admission --------+
|  1. Startup gate        |
|  2. Auto-resume         |
|  3. Org status check    |
|  4. Resource governor   |
+-------------------------+
             |
             v
+-- Admission ------------+
|  5. Concurrency gate    |
+-------------------------+
             |
             v
+-- Security -------------+
|  6. Table extract       |
|  7. Append-only enforce |
|  8. Tier feature gate   |
|  9. Schema isolation    |
| 10. Policy check        |
| 11. Cost estimate       |
| 12. Quota check         |
+-------------------------+
             |
             v
+-- Processing -----------+
| 13. Row filter          |
| 14. Vector embedding    |
| 15. Transpilation       |
| 16. Snapshot pin        |
+-------------------------+
             |
             v
+-- Execution ------------+
| 17. Query engine        |
| 18. Dirty track + WAL   |
| 19. AI heal             |
+-------------------------+
             |
             v
+-- Post-Process ---------+
| 20. Column masking      |
| 21. Metering            |
+-------------------------+
             |
             v
+-- Audit ----------------+
| 22. Hash-chained record |
+-------------------------+
             |
             v
Results returned to agent
Note: The diagram shows all 22 stages, grouped into seven phases for clarity. The numbering reflects the execution order across all stages.

Pre-Admission

Four stages run before the query enters the admission queue. These are fast checks that reject queries early — before consuming a concurrency slot — when the proxy is not ready to serve, the engine is suspended, the organization is inactive, or rate limits are exceeded.

Startup Gate

The very first stage rejects all incoming queries until the proxy has completed its initial policy sync from the control plane. Without this gate, queries arriving during startup would execute with no policy rules loaded, effectively bypassing all access controls.

The startup gate blocks with a configurable timeout (HATIDATA_POLICY_SYNC_STARTUP_TIMEOUT_SECS, default: 30 seconds). If the control plane is unreachable and the timeout expires, the proxy's behavior depends on the degraded-mode flag:

  • Default (fail-closed): The proxy refuses all queries with a proxy_not_ready error until a successful sync completes. This is the safe default for production.
  • Degraded mode (HATIDATA_POLICY_SYNC_ALLOW_DEGRADED=true): The proxy begins accepting queries without policy rules. This is insecure and should only be used in dev/staging environments where the control plane may not be running. A warning is emitted to the audit log on every query served in degraded mode.

Once the first policy sync succeeds, the startup gate opens permanently for the lifetime of the process. Subsequent policy sync failures (after the first success) do not re-close the gate — the proxy continues with the last known-good policy bundle.
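
The fail-closed behavior described above can be sketched as follows. This is a minimal illustration, not HatiData's implementation: `StartupGate` and its method names are hypothetical, and the real gate is driven by the control plane sync loop.

```python
import threading

class StartupGate:
    """Fail-closed startup gate: reject all queries until the first policy
    sync succeeds (sketch; names are illustrative, not HatiData's API)."""

    def __init__(self, timeout_secs=30.0, allow_degraded=False):
        self._synced = threading.Event()
        self._timeout = timeout_secs
        self._allow_degraded = allow_degraded

    def mark_synced(self):
        # Called when the first policy sync completes. The gate then stays
        # open for the lifetime of the process, even if later syncs fail.
        self._synced.set()

    def admit(self):
        if self._synced.wait(self._timeout):
            return "ok"
        if self._allow_degraded:
            return "degraded"  # insecure: no policy rules loaded
        raise RuntimeError("proxy_not_ready")

gate = StartupGate(timeout_secs=0.01)
try:
    gate.admit()
except RuntimeError as err:
    print(err)            # proxy_not_ready (fail-closed default)
gate.mark_synced()
print(gate.admit())       # ok
```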

Auto-Resume

If the embedded query engine has been suspended due to inactivity, this stage resumes it before the query can proceed. Suspension happens automatically when no queries have been received for a configurable idle period (default: 5 seconds), and the engine's state is persisted to local cache for fast restoration.

Resumption is typically fast (under 500 milliseconds for a warm cache), but the stage enforces a hard timeout (HATIDATA_AUTO_SUSPEND_RESUME_TIMEOUT_SECS, default: 30 seconds). If the engine cannot resume within the timeout, the query is rejected with an engine_unavailable error.

Only the first query after suspension pays the resume cost. Subsequent queries find the engine already running and pass through this stage with no delay.

Org Status Check

This stage verifies that the organization associated with the incoming query is in an active state. The organization status is determined by the JWT or API key presented during authentication and is cached from the most recent control plane sync.

Queries are rejected with an org_access_denied error if the organization is in any of the following states:

  • Suspended: The organization has been administratively suspended (e.g., terms of service violation).
  • Delinquent: Payment is past due beyond the grace period.
  • Cancelled: The organization has cancelled its subscription.
  • PendingPayment: The organization signed up but has not yet completed the Stripe checkout flow.

Only organizations in Active or Provisioning status are permitted to execute queries.

Resource Governor

The resource governor enforces per-organization rate limiting and concurrency control, preventing any single tenant from monopolizing shared proxy resources in multi-tenant deployments.

Two mechanisms are applied:

  • Rate limiting (token bucket): Each organization has a token bucket that refills at a configured rate. Queries that arrive when the bucket is empty are rejected with a rate_limit_exceeded error. The bucket size and refill rate are derived from the organization's subscription tier.
  • Concurrency slots: Each organization is allocated a maximum number of concurrent in-flight queries. This is distinct from the global concurrency gate — even if global slots are available, an organization that has exhausted its per-org slots will be rejected.

The resource governor is disabled by default (HATIDATA_RESOURCE_GOVERNOR_ENABLED=false) and is intended for multi-tenant production deployments. Single-tenant data planes (dedicated proxy per customer) typically rely on the global concurrency gate alone.
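
A token bucket of the kind the governor uses can be sketched in a few lines. The capacity and refill rate here are illustrative; in practice they would be derived from the organization's tier.

```python
import time

class TokenBucket:
    """Per-org rate limiter: refills at `rate` tokens/sec up to `capacity`
    (sketch; sizes would come from the org's subscription tier)."""

    def __init__(self, capacity, rate, now=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.tokens = float(capacity)
        self._now = now
        self._last = now()

    def try_admit(self, cost=1.0):
        # Lazily refill based on elapsed time, then spend if possible.
        t = self._now()
        self.tokens = min(self.capacity, self.tokens + (t - self._last) * self.rate)
        self._last = t
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # caller maps this to a rate_limit_exceeded error

bucket = TokenBucket(capacity=2, rate=1.0)
print(bucket.try_admit(), bucket.try_admit(), bucket.try_admit())
# The third call finds the bucket empty until a refill accrues.
```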

Admission

Concurrency Gate

The first stage after pre-admission gates how many queries execute concurrently. HatiData uses a bounded semaphore rather than a reject-on-capacity-exceeded strategy: queries that arrive when all slots are taken are queued, not refused.

Why queue instead of reject? Agents often issue queries in bursts after being unblocked by a human-in-the-loop pause or a semantic trigger. Rejecting during bursts would require the agent to implement retry logic. Queuing absorbs the burst transparently.

The semaphore capacity is configurable per data plane instance (HATIDATA_MAX_CONCURRENT_QUERIES, default: 100). A separate, smaller semaphore (default: 8) governs full-table-scan operations, which have higher memory requirements.

Queued queries respect priority ordering: queries from agents with admin or analyst RBAC roles are dequeued before queries from developer roles when the semaphore is contested.
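
The queue-not-reject behavior with role-priority dequeue can be modeled as below. This is a single-threaded sketch to show the ordering only; the real gate would be an async semaphore, and `ConcurrencyGate` and `ROLE_RANK` are illustrative names.

```python
import heapq

ROLE_RANK = {"admin": 0, "analyst": 1, "developer": 2}  # lower = dequeued first

class ConcurrencyGate:
    """Bounded slots with a priority wait queue (illustrative model)."""

    def __init__(self, slots):
        self.free = slots
        self._waiting = []  # heap of (role_rank, arrival_seq, query_id)
        self._seq = 0

    def submit(self, query_id, role):
        if self.free > 0:
            self.free -= 1
            return query_id            # runs immediately
        heapq.heappush(self._waiting, (ROLE_RANK[role], self._seq, query_id))
        self._seq += 1
        return None                    # queued, never refused

    def release(self):
        # A finished query hands its slot to the highest-priority waiter.
        if self._waiting:
            _, _, qid = heapq.heappop(self._waiting)
            return qid
        self.free += 1
        return None

gate = ConcurrencyGate(slots=1)
gate.submit("q1", "developer")   # takes the only slot
gate.submit("q2", "developer")   # queued
gate.submit("q3", "admin")       # queued, but higher priority
print(gate.release())            # q3 — admin dequeued before developer
```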

Security

Table Extraction

Before policy checks can run, the pipeline needs to know which tables the query touches. The SQL is parsed into an AST and all table references are extracted — including tables referenced inside subqueries, CTEs, and JOIN clauses.

The extracted table list is used in the policy check to verify the agent has access to every table it references. If parsing fails at this stage, a structured error is returned immediately without consuming quota.

Implementation note: Table extraction runs on the raw agent SQL, before any rewrites. This ensures policy is evaluated against the agent's stated intent, not against a transformed version of the query.

Append-Only Enforcement

This stage provides defense-in-depth protection for HatiData's chain-of-thought (CoT) ledger tables. Any UPDATE, DELETE, TRUNCATE, or DROP statement that targets a table matching the _hatidata_cot prefix is blocked with a cot_violation error, regardless of the agent's RBAC role or ABAC attributes.

The CoT ledger stores immutable, hash-chained reasoning traces. Allowing mutations would break the hash chain and compromise the audit integrity guarantees that downstream compliance systems depend on. This enforcement is intentionally placed before the policy check — even an agent with full admin privileges cannot mutate CoT tables through the query pipeline.

SELECT and INSERT statements against CoT tables are permitted (subject to normal policy checks in subsequent stages). The append-only constraint applies only to mutation and destruction operations.
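
The check itself is a small predicate over the parsed statement kind and the extracted table list. A minimal sketch (function and parameter names are illustrative):

```python
COT_PREFIX = "_hatidata_cot"
MUTATING = {"UPDATE", "DELETE", "TRUNCATE", "DROP"}

def check_append_only(statement_kind, tables):
    """Block mutation/destruction of CoT ledger tables regardless of the
    agent's RBAC role or ABAC attributes (sketch)."""
    if statement_kind in MUTATING and any(t.startswith(COT_PREFIX) for t in tables):
        raise PermissionError("cot_violation")

check_append_only("SELECT", ["_hatidata_cot_traces"])   # reads are permitted
check_append_only("INSERT", ["_hatidata_cot_traces"])   # appends are permitted
try:
    check_append_only("DELETE", ["_hatidata_cot_traces"])
except PermissionError as err:
    print(err)   # cot_violation — even for admin agents
```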

Tier Feature Gate

Certain tables and schema namespaces are restricted to specific subscription tiers. The tier gate checks whether the agent's organization has a sufficient tier to access the tables extracted in the previous stage.

Current tier restrictions:

Resource pattern         Minimum tier   Rationale
_hatidata_cot* tables    Growth         Chain-of-thought ledger is a Growth+ feature
branch_* schemas         Growth         Agent state branching is a Growth+ feature

If the organization's tier is below the required minimum, the query is rejected with a tier_required error that includes the required tier name. This allows agents to surface actionable upgrade prompts to their users.

The tier gate is disabled by default (HATIDATA_TIER_GATE_ENABLED=false) and is intended for multi-tenant control plane deployments that enforce commercial tier boundaries. Self-hosted and single-tenant deployments typically leave this disabled.

Schema Isolation

In multi-tenant deployments, each organization's data resides in a dedicated schema namespace. The schema isolation stage prevents cross-tenant data access by validating that all tables referenced in a query belong to the same organization's schema.

If a query contains a JOIN (or subquery) that references tables from two different schema namespaces, it is rejected with a schema_violation error. This prevents both accidental and intentional cross-tenant data leakage, even if an agent somehow obtains valid credentials for multiple organizations.

Schema isolation is disabled by default (HATIDATA_TENANT_ISOLATION_ENABLED=false). It should be enabled in any deployment where multiple organizations share a single data plane instance. Single-tenant deployments (one proxy per customer) do not need this check because the entire engine instance belongs to a single organization.

Implementation note: Schema isolation runs after table extraction and before row-level security injection. This ordering ensures that cross-schema JOINs are caught before any RLS predicates are injected — RLS injection assumes all tables in the query belong to the same tenant.
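
Conceptually the check reduces to "all referenced tables resolve to one schema". A sketch, assuming schema-qualified table names of the form `schema.table`:

```python
def check_schema_isolation(tables):
    """Reject queries whose table references span more than one schema
    namespace (sketch; assumes `schema.table` qualified names)."""
    schemas = {t.split(".", 1)[0] for t in tables}
    if len(schemas) > 1:
        raise PermissionError("schema_violation")
    return schemas.pop()   # the single tenant schema for this query

print(check_schema_isolation(["org_acme.orders", "org_acme.users"]))  # org_acme
try:
    check_schema_isolation(["org_acme.orders", "org_globex.users"])
except PermissionError as err:
    print(err)   # schema_violation — cross-tenant JOIN blocked
```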

Policy Check

The policy check evaluates whether this agent is permitted to run this query against these tables. Three layers are evaluated in order:

RBAC (Role-Based Access Control): The agent's JWT or API key maps to a role. The role determines which tables are readable, which are writable, and whether DML statements are permitted. If any referenced table is not accessible under the agent's role, the query is rejected here with a permission_denied error.

ABAC (Attribute-Based Access Control): Fine-grained rules evaluate attributes of the request — agent ID, time of day, IP network, data classification, query type, and others. ABAC can restrict access beyond what RBAC allows (for example, permitting reads from a table only during business hours, or only from agents whose purpose attribute matches analytics).

Column-level policy: The policy bundle includes a per-table list of columns that require masking. These are recorded for use in the post-processing stage (column masking) rather than causing rejection — the query is permitted but sensitive column values will be modified before they reach the agent.

If the policy check rejects a query, no quota is consumed and no audit record is written for the failed query content (only the rejection event itself).

Cost Estimation

Before executing, the pipeline computes a credit weight for the query. This weight is used in the quota check to enforce limits and in metering to record actual consumption.

Factor                                                 Credit weight
Base query                                             1.0
Each additional table referenced                       +0.5
Full table scan (no WHERE clause on indexed column)    +2.0
SELECT * (all columns)                                 +1.0
Result set exceeds 10,000 rows (per 10K rows)          +1.0

The cost estimator uses catalog statistics (row counts, column cardinality) to project whether a query is likely to produce a large result set. For queries with no statistics available (e.g., fresh tables), the estimator defaults to the base weight plus a moderate full-scan penalty.

Cost estimation is intentionally conservative: it is better to overestimate and let agents manage their quota budget than to underestimate and allow runaway queries to exhaust credits unexpectedly.
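
The weighting from the factor table above can be sketched as a formula (illustrative code; the real estimator also consults catalog statistics):

```python
def estimate_credits(tables, full_scan, select_star, projected_rows):
    """Credit weight per the factor table (sketch)."""
    cost = 1.0                                # base query
    cost += 0.5 * (len(tables) - 1)           # each additional table
    if full_scan:
        cost += 2.0                           # full table scan penalty
    if select_star:
        cost += 1.0                           # SELECT * penalty
    cost += 1.0 * (projected_rows // 10_000)  # per 10K projected result rows
    return cost

# Two-table SELECT * full scan projected at 25,000 rows:
print(estimate_credits(["orders", "users"], True, True, 25_000))  # 6.5
```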

Quota Check

The estimated credit weight is checked against the agent's remaining quota for the current billing period. HatiData enforces two quota tiers:

  • Per-agent quota: Each agent identity (identified by the agent_id claim in its JWT) has an individual credit allowance. This prevents a single agent from monopolizing shared resources.
  • Per-org quota: The organization's total monthly credit limit, which bounds aggregate consumption across all agents.

If either quota would be exceeded by the estimated cost, the query is rejected with a quota_exceeded error before execution. The agent receives the estimated cost and remaining balance in the error response so it can decide whether to simplify the query.

Quota checks are non-atomic by design: under high concurrency, two agents could each be permitted to run a query that together exceeds the remaining quota. HatiData resolves this by enforcing hard limits during metering after execution, and flagging the organization for review rather than interrupting an in-flight query.
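
The dual-tier admission check amounts to testing the estimate against both balances. A sketch (parameter names and error text are illustrative; the atomicity caveat above still applies):

```python
def check_quota(est_cost, agent_remaining, org_remaining):
    """Reject before execution if either the per-agent or per-org balance
    would be exceeded; the error carries the numbers so the agent can
    decide whether to simplify the query (sketch)."""
    for scope, remaining in (("agent", agent_remaining), ("org", org_remaining)):
        if est_cost > remaining:
            raise RuntimeError(
                f"quota_exceeded: scope={scope}, "
                f"estimated={est_cost}, remaining={remaining}")
    return True

print(check_quota(5.0, 12.0, 400.0))   # True — both balances cover the estimate
try:
    check_quota(5.0, 3.0, 400.0)       # agent balance too low
except RuntimeError as err:
    print(err)
```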

Processing

Row Filter Injection

Row-level security (RLS) predicates are injected into the query AST at this stage. The pipeline rewrites the SQL to add WHERE conditions that restrict which rows the agent can see, based on the RLS rules in its policy bundle.

-- Agent submits:
SELECT order_id, amount FROM orders WHERE status = 'pending'

-- After RLS injection (agent is scoped to tenant_id = 'acme'):
SELECT order_id, amount FROM orders
WHERE status = 'pending'
AND tenant_id = 'acme'

RLS injection happens before transpilation so that the injected predicates are treated as part of the logical query — the transpiler, snapshot pin, and cost optimizer all see the restricted form.

Multi-table queries have RLS predicates injected for each referenced table independently, including tables inside subqueries and CTEs.
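
A string-level illustration of the rewrite shown above. The real pipeline operates on the AST and handles subqueries and CTEs; this naive sketch handles only a flat single-table statement.

```python
def inject_rls(sql, predicate):
    """Naive RLS injection sketch: AND the predicate onto an existing
    WHERE clause, or add one. (The real pipeline rewrites the AST.)"""
    if " where " in sql.lower():
        return sql + " AND " + predicate
    return sql + " WHERE " + predicate

print(inject_rls(
    "SELECT order_id, amount FROM orders WHERE status = 'pending'",
    "tenant_id = 'acme'",
))
# SELECT order_id, amount FROM orders WHERE status = 'pending' AND tenant_id = 'acme'
```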

Vector Embedding

If the SQL contains a __HATI_EMBED__('...') marker, this stage replaces it with the corresponding embedding vector before transpilation. The marker is generated by agent-native features (memory search, semantic triggers) to inject vector similarity operations into standard SQL.

The embedding call is dispatched to the configured embedding provider (HATIDATA_EMBEDDING_PROVIDER). If the provider is unavailable, the marker is left in place and the query fails at execution with a descriptive error rather than silently returning incorrect results.

This stage is a no-op for queries that do not contain the __HATI_EMBED__ marker.
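
The marker substitution can be sketched as below. The vector-literal syntax and the distance operator in the example query are illustrative, as is the provider-failure behavior of leaving the marker in place.

```python
import re

MARKER = re.compile(r"__HATI_EMBED__\('([^']*)'\)")

def resolve_embeddings(sql, embed):
    """Replace each __HATI_EMBED__('...') marker with a vector literal
    (sketch). `embed` stands in for the configured provider call; if the
    provider is down, the marker stays so execution fails loudly."""
    def substitute(match):
        vector = embed(match.group(1))
        return "[" + ", ".join(f"{x:.1f}" for x in vector) + "]"
    try:
        return MARKER.sub(substitute, sql)
    except ConnectionError:
        return sql   # unresolved marker produces a descriptive error later

out = resolve_embeddings(
    "SELECT id FROM memories ORDER BY vec <-> __HATI_EMBED__('hello') LIMIT 5",
    lambda text: [0.1, 0.2],   # stand-in for the embedding provider
)
print(out)   # ... ORDER BY vec <-> [0.1, 0.2] LIMIT 5
```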

Transpilation

Agents submit SQL using standard Snowflake-compatible syntax. The transpiler rewrites the agent's SQL (post-RLS injection) into HatiData's native execution dialect.

Transpilation covers four categories of rewrites:

  • Function rewrites — Snowflake functions like NVL, IFF, DATEDIFF, DATEADD, LISTAGG, and others are automatically mapped to their native equivalents. See SQL Functions & Types for the full compatibility reference.
  • Type rewrites — Snowflake types such as VARIANT, TIMESTAMP_NTZ, and NUMBER are mapped to native types.
  • Construct rewrites — Higher-level SQL constructs like QUALIFY, FLATTEN, PIVOT/UNPIVOT, and SAMPLE are rewritten to their native equivalents. Unsupported constructs return structured errors with suggestions.
  • Pre-parse normalization — Snowflake colon notation (col:field) is converted before parsing; whitespace and quoting are normalized to maximize transpilation cache hit rates.

After all rewrites, the transpiler attaches a confidence score (0.0–1.0) to the output. Scores below 0.85 indicate the transpiler encountered constructs it rewrote heuristically. These are logged to the audit trail. The AI heal stage uses this score to decide whether to attempt repair if execution fails.

Transpilation results are cached by SQL hash — repeated queries skip the entire parse-and-rewrite pipeline.
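
The hash-keyed cache can be sketched as below, with a stand-in `fake_transpile` playing the role of the expensive parse-and-rewrite step (the `NVL` → `COALESCE` rewrite mirrors the function-rewrite category above; all names are illustrative):

```python
import hashlib

_transpile_cache = {}

def transpile_cached(sql, transpile):
    """Cache transpilation results by SQL hash (sketch). Repeated queries
    skip the entire parse-and-rewrite pipeline."""
    key = hashlib.sha256(sql.encode()).hexdigest()
    if key not in _transpile_cache:
        _transpile_cache[key] = transpile(sql)
    return _transpile_cache[key]

calls = []
def fake_transpile(sql):
    calls.append(sql)                            # count expensive invocations
    return sql.replace("NVL(", "COALESCE(")      # stand-in function rewrite

q = "SELECT NVL(amount, 0) FROM orders"
print(transpile_cached(q, fake_transpile))   # SELECT COALESCE(amount, 0) FROM orders
transpile_cached(q, fake_transpile)          # cache hit
print(len(calls))                            # 1 — transpiled only once
```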

Snapshot Pin

Before execution, the pipeline pins a point-in-time snapshot for the tables referenced in the query. This ensures that concurrent writes to those tables during query execution do not produce inconsistent results.

Each query gets a consistent view of the data as it existed at the moment the snapshot was pinned. Snapshot pinning uses a RwLock to coordinate with concurrent write operations: readers acquire a shared lock (non-blocking when no write is in progress), while Parquet flush and WAL replay operations acquire an exclusive lock. This guarantees that no query observes a partially-flushed state.

Long-running queries (over 60 seconds by default) receive a warning in the audit log, as their snapshot may be significantly behind the current state.

For agent branching workflows, the snapshot pin is applied to the branch environment rather than the main environment. See Branch Isolation for details.

Execution

Query Engine

The transpiled, RLS-filtered SQL executes against HatiData's embedded columnar engine. The engine runs in-process within the proxy — no network hop, no separate process to coordinate with.

The columnar execution engine is well-suited to the analytical query patterns agents typically issue: aggregations over large row sets, window functions, joins across multiple tables. Typical first-row latency for queries against cached data is under 5 milliseconds.

Dirty Tracking + WAL Append

For write operations (INSERT, UPDATE, DELETE), this stage records the mutation in the write-ahead log (WAL) and marks the affected tables as dirty for the next Parquet flush cycle.

The WAL append is synchronous when HATIDATA_WAL_FSYNC=true (the default): the stage does not return until the WAL entry has been durably written to disk. This guarantees that acknowledged writes survive a crash. When fsync is disabled, the WAL append is buffered by the OS, trading durability for higher write throughput.

Dirty tracking uses a per-table bitset that the periodic Parquet flush reads and clears. Only tables that have been modified since the last flush are exported, minimizing flush I/O for workloads that write to a small number of tables.

This stage is a no-op for read-only queries (SELECT).
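
The synchronous-append behavior can be sketched with a plain file and `os.fsync` (a minimal illustration of the durability trade-off, not HatiData's WAL format):

```python
import os, tempfile

def wal_append(wal_file, entry, fsync=True):
    """Append one mutation record to the WAL (sketch). With fsync=True the
    call does not return until the entry is durably on disk, mirroring the
    HATIDATA_WAL_FSYNC=true default; fsync=False lets the OS buffer the
    write, trading durability for throughput."""
    wal_file.write(entry + b"\n")
    wal_file.flush()
    if fsync:
        os.fsync(wal_file.fileno())   # block until the bytes hit the disk

with tempfile.NamedTemporaryFile(delete=False) as wal:
    wal_append(wal, b"INSERT orders ...", fsync=True)
    path = wal.name
print(open(path, "rb").read())   # b'INSERT orders ...\n'
```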

AI Heal

If the query engine returns an execution error, the AI heal stage attempts automatic repair before surfacing the error to the agent.

The healer examines the error message, the original agent SQL, and the transpiled SQL. It uses a language model call (configurable provider) to generate a corrected version of the transpiled SQL. If the correction is produced and passes a syntax check, the pipeline loops back to the execution stage with the corrected SQL.

Heal attempts are capped at 2 per query. Both attempts and their outcomes are recorded in the audit trail. If healing fails or is disabled, the original error is returned to the agent with the transpilation confidence score included in the error metadata — this helps agents understand whether a low-confidence transpilation may have introduced the problem.
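
The capped retry loop can be sketched as follows. `execute` and `heal` stand in for the query engine and the model call; the syntax-check step and audit record shape are simplified.

```python
MAX_HEAL_ATTEMPTS = 2

def execute_with_heal(sql, execute, heal, audit):
    """On an execution error, ask the healer for corrected SQL and loop
    back to execution, at most MAX_HEAL_ATTEMPTS times (sketch)."""
    for attempt in range(MAX_HEAL_ATTEMPTS + 1):
        try:
            return execute(sql)
        except ValueError as err:
            if attempt == MAX_HEAL_ATTEMPTS:
                raise                      # healing exhausted: surface the error
            sql = heal(sql, err)           # stand-in for the LLM repair call
            audit.append(("heal_attempt", attempt + 1, sql))

def engine(sql):
    if " FROM " not in sql:
        raise ValueError("syntax error near 'FRM'")
    return "ok"

log = []
print(execute_with_heal("SELECT 1 FRM t", engine,
                        lambda s, e: s.replace(" FRM ", " FROM "), log))  # ok
print(len(log))   # 1 — one heal attempt recorded for the audit trail
```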

Post-Processing

Column Masking

After execution produces a result set, column masking is applied before results are returned. The masking rules determined in the security stage (policy check) are applied here.

Two masking strategies are available per column:

  • Redact: Replace the column value with a fixed placeholder (***REDACTED***). Used for PII fields where even a transformed value would be sensitive.
  • Hash: Replace the value with a deterministic cryptographic hash (keyed per organization). Agents can check equality between hashed values — two rows with the same original value will have the same hash — but cannot recover the original.

Masking is applied at the result row level before serialization. The query engine never executes queries with masking predicates injected — masking is always a post-execution transform. This prevents agents from inferring masked values through timing attacks on conditional expressions.
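
The two strategies can be sketched with an HMAC for the keyed hash. The key, rule format, and function names are illustrative; the real masking runs inside the result serializer.

```python
import hashlib
import hmac

ORG_KEY = b"per-org-secret"   # illustrative; the real key is provisioned per org

def mask_row(row, rules):
    """Apply redact/hash masking to one result row before serialization
    (sketch). `rules` maps column name -> 'redact' | 'hash'."""
    out = dict(row)
    for col, strategy in rules.items():
        if strategy == "redact":
            out[col] = "***REDACTED***"
        elif strategy == "hash":
            # Keyed and deterministic: equal inputs hash equal, so agents
            # can still test equality, but originals are unrecoverable.
            out[col] = hmac.new(ORG_KEY, str(row[col]).encode(),
                                hashlib.sha256).hexdigest()
    return out

a = mask_row({"email": "x@y.com", "ssn": "123"}, {"email": "hash", "ssn": "redact"})
b = mask_row({"email": "x@y.com", "ssn": "456"}, {"email": "hash", "ssn": "redact"})
print(a["ssn"])                    # ***REDACTED***
print(a["email"] == b["email"])    # True — equality preserved under hashing
```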

Metering

After results are produced, the actual credit cost of the query is recorded. The meter computes the final cost using the same formula as the estimator, but using actual row counts from the result set rather than projections.

The actual cost is recorded to the control plane's metering store and deducted from both the per-agent and per-org quota balances. If the actual cost exceeds the estimated cost (for example, because a query produced more rows than the estimator projected), the excess is still charged.

Per-agent metering is the foundation of HatiData's fair-use model: agents that issue expensive queries consume more of their individual budget, regardless of what other agents in the same organization are doing.

Audit

The final stage writes an immutable audit record. Every query — successful or rejected, healed or errored — produces an audit record.

Each audit record is linked to the previous record for that agent session via a cryptographic hash chain. The chain is initialized with a session seed and each subsequent record includes the hash of the previous record in its own hash input. This means any tampering with a record in the middle of the chain is detectable by re-verifying the chain from the seed forward.

Audit records include:

  • Agent ID and organization ID
  • Original SQL (from the agent, before any rewrites)
  • Transpiled SQL (after rewrites)
  • Transpilation confidence score
  • Policy decision (permitted / rejected, role, ABAC attributes evaluated)
  • Execution outcome (success / error, row count, execution time)
  • Column masking actions taken
  • Credit cost (estimated and actual)
  • AI heal attempts and outcomes
  • Timestamp (nanosecond precision)
  • Previous record hash (for chain verification)
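
The chaining and verification scheme described above can be sketched as follows (record layout and hash encoding are illustrative; see the linked specification for the real format):

```python
import hashlib
import json

def record_hash(prev_hash, record):
    """Hash one audit record with the previous record's hash folded into
    its input, forming the chain (sketch)."""
    payload = prev_hash + json.dumps(record, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest().encode()

def verify_chain(seed, records, hashes):
    """Re-derive the chain from the session seed forward; any tampered
    record in the middle breaks every hash after it."""
    h = seed
    for record, expected in zip(records, hashes):
        h = record_hash(h, record)
        if h != expected:
            return False
    return True

seed = b"session-seed"
records = [{"sql": "SELECT 1"}, {"sql": "SELECT 2"}]
hashes, h = [], seed
for record in records:
    h = record_hash(h, record)
    hashes.append(h)

print(verify_chain(seed, records, hashes))   # True — chain intact
records[0]["sql"] = "DROP TABLE t"           # tamper with a mid-chain record
print(verify_chain(seed, records, hashes))   # False — tampering detected
```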

See Audit Guarantees for the full specification of tamper detection and compliance export formats.
