Security Overview

HatiData is built with a security-first architecture. All data processing occurs within the customer's VPC, ensuring complete data sovereignty and zero egress. This page provides a comprehensive overview of every security layer in HatiData.

Architecture Principle: Data Never Leaves Your VPC

Unlike traditional cloud warehouses that process data on shared infrastructure, HatiData deploys the query engine inside your VPC. The proxy in your VPC and the HatiData control plane communicate over AWS PrivateLink, and no public IP addresses are assigned to any resource.

Your VPC                                HatiData VPC
┌──────────────────────┐                ┌──────────────────┐
│  HatiData Proxy      │    Private     │  Control Plane   │
│  (:5439)             │◄──── Link ────►│  (:8080)         │
│                      │                │                  │
│  NVMe Cache (LUKS)   │                │  Auth / Billing  │
│  DuckDB Engine       │                │  Policy Engine   │
│                      │                │  Audit Storage   │
│  S3 (SSE-KMS)        │                └──────────────────┘
└──────────────────────┘

Encryption

At Rest

  • Customer-Managed Encryption Keys (CMEK) via AWS KMS / GCP Cloud KMS / Azure Key Vault
  • NVMe SSD cache encrypted with LUKS (AES-256-XTS), key derived from customer KMS at boot
  • S3 / GCS / Azure Blob server-side encryption with KMS-managed keys (SSE-KMS)
  • Automatic key rotation enabled by default

In Transit

  • TLS 1.3 mandatory -- no fallback to TLS 1.2
  • Supported cipher suites:
    • TLS_AES_256_GCM_SHA384
    • TLS_AES_128_GCM_SHA256
    • TLS_CHACHA20_POLY1305_SHA256
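
A client can enforce the same TLS 1.3 floor on its side of the connection. The following is a minimal sketch using Python's standard `ssl` module; the function name is illustrative and not part of any HatiData SDK. Python's `ssl` module does not expose TLS 1.3 cipher suite selection through `set_ciphers()`, so the suites listed above are left to the underlying OpenSSL build.

```python
import ssl

def make_tls13_client_context() -> ssl.SSLContext:
    """Build a client context that refuses anything older than TLS 1.3."""
    # create_default_context() keeps certificate and hostname verification on.
    ctx = ssl.create_default_context()
    # Setting the minimum version removes any fallback to TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx

ctx = make_tls13_client_context()
```

The resulting context can be passed to any standard-library client that accepts an `SSLContext`.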

RBAC (Role-Based Access Control)

Six predefined roles with least-privilege permissions:

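| Role | Query | Manage Policies | Manage Users | Billing | Audit Logs |
|---|---|---|---|---|---|
| Owner | Yes | Yes | Yes | Yes | Yes |
| Admin | Yes | Yes | Yes | No | Yes |
| Analyst | Yes | No | No | No | No |
| Auditor | No | No | No | No | Yes |
| Developer | Yes | No | No | No | No |
| ServiceAccount | Yes | No | No | No | No |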

For detailed role definitions and permission matrices, see Authorization.
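
The role matrix above can be read as a deny-by-default lookup. The sketch below transcribes the table into a Python dictionary; the action names (`query`, `manage_policies`, and so on) are labels chosen for this illustration, not identifiers from HatiData.

```python
# Permission matrix transcribed from the role table above.
# Action names are illustrative labels, not HatiData identifiers.
PERMISSIONS = {
    "Owner": {"query", "manage_policies", "manage_users", "billing", "audit_logs"},
    "Admin": {"query", "manage_policies", "manage_users", "audit_logs"},
    "Analyst": {"query"},
    "Auditor": {"audit_logs"},
    "Developer": {"query"},
    "ServiceAccount": {"query"},
}

def is_allowed(role: str, action: str) -> bool:
    """Deny by default: unknown roles and unlisted actions are rejected."""
    return action in PERMISSIONS.get(role, set())
```

For example, `is_allowed("Admin", "billing")` is false because only Owner has billing access in the table.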

ABAC (Attribute-Based Access Control)

HatiData evaluates policies against a rich evaluation context built from the session, request, and environment:

Evaluation Context Attributes

| Attribute | Type | Description |
|---|---|---|
| user_role | Role | RBAC role of the requester |
| user_id | UUID | Authenticated user identity |
| org_id | UUID | Organization scope |
| environment | String | Target environment (production, staging, dev) |
| source_ip | IpAddr | Client IP address |
| query_origin | String | Origin type (dashboard, api, agent, sdk) |
| agent_framework | Option<String> | Agent framework (langchain, crewai, custom) |
| agent_id | Option<String> | Unique agent identifier |
| time_of_day | NaiveTime | Current server time |
| day_of_week | Weekday | Current day |

Rule Conditions

| Condition | Example |
|---|---|
| QueryOriginIs | Block queries not from dashboard or sdk |
| AgentFrameworkIs | Allow only langchain agents |
| TimeOfDay | Deny queries outside business hours |
| DayOfWeek | Read-only on weekends |
| AttributeEquals | Match custom key-value attributes |
| LicenseTierIs | Gate features by tier (Free, Cloud, Growth, Enterprise) |
| ScopeRequired | Require specific API key scope |

For full details, see Authorization.
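
To make the evaluation model concrete, here is a minimal sketch of how two of the conditions above (QueryOriginIs and TimeOfDay) might be checked against a reduced evaluation context. The class, the function names, the 09:00-17:00 window, and the verdict tuples are all assumptions made for illustration; the actual policy engine and its rule syntax are described in Authorization.

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional

@dataclass(frozen=True)
class EvalContext:
    """A subset of the evaluation context attributes listed above."""
    user_role: str
    query_origin: str               # dashboard, api, agent, sdk
    agent_framework: Optional[str]  # langchain, crewai, custom, or None
    time_of_day: time

def query_origin_is(ctx: EvalContext, allowed: set) -> bool:
    return ctx.query_origin in allowed

def within_hours(ctx: EvalContext, start: time, end: time) -> bool:
    return start <= ctx.time_of_day < end

def evaluate(ctx: EvalContext) -> tuple:
    """Hypothetical policy: dashboard or sdk only, between 09:00 and 17:00."""
    if not query_origin_is(ctx, {"dashboard", "sdk"}):
        return ("deny", "query origin not permitted")
    if not within_hours(ctx, time(9), time(17)):
        return ("deny", "outside business hours")
    return ("allow", "all conditions met")
```

Returning a reason alongside the verdict mirrors the allow/deny-with-reason entries recorded in the query audit log.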

API Key Scopes

API keys use the format hd_live_[32 alphanumeric] (production) and hd_test_[32 alphanumeric] (staging). Keys are hashed with Argon2id before storage -- the plaintext is shown only once at creation time.
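
The key format lends itself to a cheap client-side sanity check before a request is sent. The sketch below generates and validates strings of the documented shape; it assumes "alphanumeric" means ASCII letters and digits, and the function names are illustrative. It does not show the Argon2id hashing step, which requires a third-party library.

```python
import re
import secrets
import string

# Assumes "alphanumeric" means ASCII letters and digits.
KEY_PATTERN = re.compile(r"hd_(live|test)_[A-Za-z0-9]{32}")
_ALPHABET = string.ascii_letters + string.digits

def generate_key(env: str) -> str:
    """Produce a random key of the documented shape (illustration only)."""
    if env not in ("live", "test"):
        raise ValueError("env must be 'live' or 'test'")
    return f"hd_{env}_" + "".join(secrets.choice(_ALPHABET) for _ in range(32))

def is_well_formed(key: str) -> bool:
    """Check the shape only; this says nothing about whether the key is valid."""
    return KEY_PATTERN.fullmatch(key) is not None
```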

22 granular ApiScope variants are organized into scope bundles:

| Bundle | Included Scopes |
|---|---|
| ReadOnly | query:read, schema:read, audit:read |
| Developer | ReadOnly + query:write, schema:write, environment:read |
| Admin | Developer + policy:*, user:*, key:*, webhook:*, billing:read |
| Agent | query:read, query:write, schema:read, agent:* |
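
Because the bundles build on one another and use wildcards, it can help to see them expanded. The following sketch models the bundles as Python sets and treats `*` as a glob; the matching rule and the example scope `agent:run` are assumptions, since this page does not list the individual scopes behind each wildcard.

```python
from fnmatch import fnmatchcase

READ_ONLY = {"query:read", "schema:read", "audit:read"}
DEVELOPER = READ_ONLY | {"query:write", "schema:write", "environment:read"}
ADMIN = DEVELOPER | {"policy:*", "user:*", "key:*", "webhook:*", "billing:read"}
AGENT = {"query:read", "query:write", "schema:read", "agent:*"}

BUNDLES = {"ReadOnly": READ_ONLY, "Developer": DEVELOPER, "Admin": ADMIN, "Agent": AGENT}

def bundle_grants(bundle: str, scope: str) -> bool:
    """True if any pattern in the bundle matches the requested scope."""
    return any(fnmatchcase(scope, pattern) for pattern in BUNDLES.get(bundle, ()))
```

Note that Agent is not a superset of ReadOnly: it lacks `audit:read`, so `bundle_grants("Agent", "audit:read")` is false.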

Keys support IP allowlisting and automatic rotation with a 72-hour grace period.
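
A rough sketch of what these two checks involve, using only the Python standard library. The function names are hypothetical, and two behaviors are assumptions rather than documented facts: an empty allowlist is treated as "no restriction", and the grace period is measured from the moment of rotation.

```python
from datetime import datetime, timedelta, timezone
from ipaddress import ip_address, ip_network

GRACE = timedelta(hours=72)

def ip_allowed(source_ip: str, allowlist: list) -> bool:
    """Assumption: an empty allowlist means the key is not IP-restricted."""
    if not allowlist:
        return True
    addr = ip_address(source_ip)
    return any(addr in ip_network(cidr) for cidr in allowlist)

def old_key_still_valid(rotated_at: datetime, now: datetime) -> bool:
    """Assumption: the superseded key works until 72 hours after rotation."""
    return now < rotated_at + GRACE
```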

For details on key management, see API Keys.

Audit Logging

Every query and administrative action is logged to an immutable audit trail.

Query Audit

Each query audit entry captures:

  • Query ID, user, source IP
  • SQL text (PII-redacted)
  • Tables accessed, rows returned, columns masked
  • Execution time, cache hit status
  • Policy verdicts (allow/deny with reason)
  • Agent metadata (agent_id, framework)

Logs are stored in the customer's object storage bucket with Object Lock (7-year retention, Governance mode). Format: JSONL partitioned by date.

IAM Audit (Hash-Chained)

Administrative actions are recorded in a tamper-evident hash chain:

  • 27 event types covering policy CRUD, key rotation, user management, SSO configuration
  • SHA-256 chain verification -- each event references the hash of the previous event
  • Before/after values for change tracking
  • Chain integrity can be verified via the Audit API
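
The hash-chain idea can be illustrated in a few lines. This is a generic sketch of the technique, not HatiData's event schema: the field names, the all-zero genesis value, and the JSON canonicalization are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first event

def event_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

def append_event(chain: list, payload: dict) -> None:
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev_hash": prev, "payload": payload, "hash": event_hash(prev, payload)})

def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited or reordered event breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev or entry["hash"] != event_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash covers the previous one, changing an early event invalidates every later link, which is what makes the trail tamper-evident.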

Retention

| Tier | Hot Storage | Glacier | Deep Archive |
|---|---|---|---|
| Duration | 90 days | 90 days -- 1 year | 1 -- 7 years |

Row-Level Security

Row-level security (RLS) injects WHERE clauses at the SQL AST level before execution. Filters support agent-aware placeholders:

-- Policy: agents can only see their own data
WHERE agent_id = '{agent_id}'

-- Policy: department-scoped access
WHERE department = '{department}'

-- Policy: organization isolation
WHERE org_id = '{org_id}'

Placeholders are resolved from the authenticated session context. See Data Protection for details.
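
As a simplified illustration of placeholder resolution, the sketch below substitutes session values into a filter template. It works on strings, whereas HatiData injects filters at the SQL AST level, so treat it only as a model of the substitution step. The function name and the quoting strategy are assumptions.

```python
import re

# Matches a quoted placeholder such as '{agent_id}'.
_PLACEHOLDER = re.compile(r"'\{(\w+)\}'")

def resolve_filter(template: str, session: dict) -> str:
    """Replace each '{name}' with the session value as a quoted SQL literal."""
    def substitute(match):
        name = match.group(1)
        if name not in session:
            # Fail closed: never emit a filter with an unresolved placeholder.
            raise KeyError(f"no session value for placeholder {name!r}")
        value = str(session[name]).replace("'", "''")  # standard SQL quote doubling
        return f"'{value}'"
    return _PLACEHOLDER.sub(substitute, template)
```

Raising on a missing value is a deliberate choice in this sketch: a filter that silently drops a condition would widen access rather than narrow it.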

Column Masking

Dynamic column masking is applied at the proxy layer after query execution. Masking functions:

| Function | Output | Example |
|---|---|---|
| Full | *** | alice@example.com -> *** |
| Partial | Last N chars visible | 4111111111111111 -> ***1111 |
| Hash | SHA-256 digest | alice@example.com -> a1b2c3... |
| Null | NULL | 555-0100 -> NULL |

Masking rules are role-based with agent-specific overrides. The underlying data is never modified.
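
The four masking functions are simple enough to sketch directly. The Python below reproduces the behavior shown in the table; the function names and the default of four visible characters are assumptions for illustration.

```python
import hashlib

def mask_full(value: str) -> str:
    return "***"

def mask_partial(value: str, visible: int = 4) -> str:
    """Keep only the last `visible` characters."""
    # Guard against visible == 0, where value[-0:] would return the whole string.
    return "***" + value[-visible:] if visible > 0 else "***"

def mask_hash(value: str) -> str:
    """Hex-encoded SHA-256 digest of the value."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

def mask_null(value: str) -> None:
    return None
```

All four return a new value and leave the input untouched, matching the statement that the underlying data is never modified.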

JIT Access

Just-In-Time access provides time-bounded privilege escalation:

  • For humans: Temporary role elevation with configurable duration and automatic revocation
  • For agents: Structured AgentCapabilityGrant with table allowlists, query count limits, and expiration

All JIT grants are recorded in the IAM audit trail with full before/after tracking.
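
One way to picture an agent grant is as a small object that checks expiry, remaining query budget, and table allowlist on every request. The sketch below borrows the `AgentCapabilityGrant` name from the text, but its fields and the `authorize` method are hypothetical; the real structure may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class AgentCapabilityGrant:
    """Hypothetical shape of a time-bounded, table-scoped agent grant."""
    agent_id: str
    allowed_tables: frozenset
    max_queries: int
    expires_at: datetime
    queries_used: int = 0

    def authorize(self, tables: set, now: datetime) -> bool:
        if now >= self.expires_at:
            return False  # grant has expired
        if self.queries_used >= self.max_queries:
            return False  # query budget exhausted
        if not tables <= self.allowed_tables:
            return False  # query touches a table outside the allowlist
        self.queries_used += 1  # only authorized queries consume budget
        return True
```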

Tenant Isolation

Multi-tenant deployments enforce strict isolation:

  • Automatic WHERE org_id = '{org_id}' injection on every query
  • Cross-tenant JOIN prevention at the AST level
  • Parent-child organization hierarchy support
  • Per-tenant resource quotas

Federated Authentication

HatiData supports federation with cloud identity providers:

| Provider | Method |
|---|---|
| AWS | STS AssumeRoleWithWebIdentity |
| GCP | Workload Identity Federation |
| Azure | Managed Identity + Azure AD |

Federation tokens are cached in a DashMap with TTL-based expiration.
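
TTL-based expiration can be sketched as follows. DashMap is a concurrent hash map for Rust; this Python version uses a plain dictionary and is single-threaded, so it models only the expiry logic, not the concurrency. The class name and the injectable clock are assumptions made for illustration and testing.

```python
import time

class TtlCache:
    """Minimal TTL cache: entries expire a fixed number of seconds after insertion."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}

    def put(self, key, token) -> None:
        self._entries[key] = (token, self._clock() + self._ttl)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        token, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # evict lazily on read
            return None
        return token
```

A caller that receives `None` would fetch a fresh token from the identity provider and `put` it back.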
