
SOC 2 Architecture

HatiData's architecture is designed for SOC 2 Type II from the ground up. Because the data plane runs inside your VPC, the most sensitive controls — data access, encryption key custody, and network isolation — are owned and operated by you, not by HatiData. This page maps every Trust Service Criterion to a concrete architectural control and answers common CISO questions.


Trust Service Criteria Mapping

CC6 — Security (Logical and Physical Access)

| Control | HatiData Implementation |
| --- | --- |
| CC6.1 — Least-privilege access | 6 RBAC roles (ServiceAccount, Developer, Analyst, Auditor, Admin, Owner). Every role is scoped to the minimum permissions required. ServiceAccount cannot perform admin actions; Auditor has read-only audit log access only. |
| CC6.2 — User provisioning and de-provisioning | Users are managed via the control plane API. De-provisioned users lose access immediately. All provisioning events are recorded in the IAM audit trail with before/after values. |
| CC6.3 — Authentication mechanisms | JWT (RS256, 1-hour expiry), API keys (Argon2id-hashed, per-environment scope), federated auth (AWS STS, Azure AD, Google Cloud Identity). MFA enforcement is configurable organization-wide. |
| CC6.6 — Transmission encryption | TLS 1.3 only. No downgrade negotiation. Port 5439 (Postgres wire protocol) and the control plane API both enforce TLS 1.3 termination. |
| CC6.7 — Encryption at rest | AES-256 via CMEK (AWS KMS, GCP Cloud KMS, or Azure Key Vault). The local SSD query cache is encrypted with LUKS AES-256-XTS, keyed from the customer KMS at instance boot. |
| CC6.8 — Change management | Infrastructure is deployed via versioned Terraform modules. Policy changes are recorded in the IAM audit trail and are hash-chained for tamper detection. |
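The hash-chaining used for tamper detection (CC6.8, and the audit log more broadly) can be illustrated with a minimal sketch. This is an assumption-laden simplification, not HatiData's implementation: each entry's hash covers both the canonical record and the previous entry's hash, so editing any historical record invalidates every hash that follows it.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry

def chain_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with the canonical record."""
    payload = prev_hash + json.dumps(record, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def append(log: list, record: dict) -> None:
    """Append a record, linking it to the tail of the chain."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "hash": chain_hash(prev, record)})

def verify(log: list) -> bool:
    """Recompute every hash in order; any edited entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["hash"] != chain_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash depends on its predecessor, an attacker who modifies one audit entry would have to rewrite every subsequent entry to hide it, which an out-of-band copy of the latest hash defeats.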

CC7 — Availability

| Control | HatiData Implementation |
| --- | --- |
| CC7.1 — System monitoring | Prometheus metrics exported on port 9090. Default alerts for latency, error rate, cache saturation, and query quota utilization. |
| CC7.2 — Capacity management | Auto Scaling Groups scale between configured min/max instance counts based on CPU and connection metrics. |
| CC7.5 — Backup and recovery | Customer data lives in S3 (11 nines durability). The proxy is stateless — the Auto Scaling Group replaces failed instances within 5 minutes. The local SSD cache rebuilds from S3 on first access after replacement. |

CC8 — Processing Integrity

| Control | HatiData Implementation |
| --- | --- |
| CC8.1 — Inputs processed completely and accurately | The multi-stage query pipeline enforces: policy check → cost estimate → quota check → row filter → transpile → execute → column mask → meter → audit. Every step either succeeds or returns a structured error — partial results are never returned. |
| RLS injection | Row-level security WHERE clauses are injected at the SQL AST level before execution. Injection is mandatory and cannot be bypassed by query syntax. |
| Quota enforcement | Monthly credit limits are checked before execution. Queries that would exceed the quota are rejected with a QUOTA_EXCEEDED error before any compute is consumed. |
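The quota-check step can be sketched as a pre-execution gate. This is an illustrative simplification under assumed names (`QuotaExceeded`, `check_quota` are hypothetical, not HatiData's API): the cost estimate alone decides admission, so a rejected query consumes no compute.

```python
class QuotaExceeded(Exception):
    """Illustrative stand-in for the QUOTA_EXCEEDED error returned to clients."""

def check_quota(used_credits: float, estimated_cost: float, monthly_limit: float) -> None:
    """Raise before execution if the estimate would push usage past the limit."""
    if used_credits + estimated_cost > monthly_limit:
        raise QuotaExceeded(
            f"query estimated at {estimated_cost} credits would exceed "
            f"monthly limit {monthly_limit} (used: {used_credits})"
        )
    # Only after this check does the pipeline proceed to transpile and execute.
```

The key design property is ordering: the check runs after cost estimation but before any warehouse compute is dispatched.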

CC9 — Confidentiality

| Control | HatiData Implementation |
| --- | --- |
| CC9.1 — Confidential information protection | Column masking operates at the proxy layer after query execution. Four masking functions: full redaction (***), partial redaction (last N chars), cryptographic hash, and NULL replacement. Role-based exemptions are policy-controlled. |
| CC9.2 — Sub-processor management | Cloud infrastructure providers only (AWS, GCP, or Azure). HatiData has no access to customer data. No additional sub-processors. |
| Tenant isolation | All queries are automatically scoped to the authenticated organization via injected WHERE clauses. Cross-tenant JOIN queries are detected and blocked at the AST level. |
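The four masking functions named under CC9.1 can be sketched in a few lines. This is a minimal illustration of the semantics, not HatiData's code (function names and the SHA-256 choice for the cryptographic hash are assumptions):

```python
import hashlib
from typing import Optional

def mask_full(value: str) -> str:
    """Full redaction: every value becomes the same opaque token."""
    return "***"

def mask_partial(value: str, keep_last: int = 4) -> str:
    """Partial redaction: preserve only the last N characters."""
    if len(value) <= keep_last:
        return "*" * len(value)
    return "*" * (len(value) - keep_last) + value[-keep_last:]

def mask_hash(value: str) -> str:
    """Cryptographic hash: stable token that supports joins without exposure."""
    return hashlib.sha256(value.encode()).hexdigest()

def mask_null(value: str) -> Optional[str]:
    """NULL replacement: the column is present but carries no information."""
    return None
```

Each function trades off differently: hashing preserves equality (the same input always maps to the same token, so GROUP BY and JOIN still work), while full redaction and NULL replacement destroy it.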

P1–P8 — Privacy

| Control | HatiData Implementation |
| --- | --- |
| P1 — Privacy notice | Data Processing Agreement (DPA) defines HatiData's role as a data processor. Available on request. |
| P3 — Collection limitation | HatiData collects only anonymized usage metrics (query count, latency distributions, credit consumption). No query content, results, or schema data is collected. |
| P6 — Use, retention, and disposal | At contract termination, compute resources are destroyed, local SSD caches are cryptographically erased with cryptsetup luksErase, and no data is retained. Customer data remains in the customer's S3 bucket. |
| P8 — Monitoring and enforcement | PII patterns (email, SSN, credit card, phone) are automatically detected and redacted from audit log SQL text before writing to storage. Redaction uses compiled regular expressions for sub-millisecond throughput. |
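The redact-before-write step under P8 can be sketched with compiled regular expressions. The patterns below are simplified placeholders for illustration — a production deployment would use stricter, well-tested variants — but the structure (compile once, substitute a fixed token before the audit write) matches the description above:

```python
import re

# Simplified illustrative patterns; not HatiData's actual pattern set.
PII_PATTERNS = [
    re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),  # email
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),                           # SSN
    re.compile(r"\b(?:\d[ -]?){13,16}\b"),                          # credit card
    re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),                     # phone
]

def redact_sql(sql_text: str) -> str:
    """Replace any PII match with a fixed token before the audit log write."""
    for pattern in PII_PATTERNS:
        sql_text = pattern.sub("[REDACTED]", sql_text)
    return sql_text
```

Compiling the patterns once at module load, rather than per query, is what keeps redaction in the sub-millisecond range at audit-write time.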

GDPR Alignment

HatiData's architecture maps to GDPR's core requirements as follows:

| GDPR Requirement | Implementation |
| --- | --- |
| Data residency | Region is selected at provisioning time. All compute, caching, and audit storage remain within that region. No cross-region data transfer occurs unless explicitly configured (Enterprise only). |
| Data processor role | HatiData acts as a data processor per the DPA. Customers (controllers) retain full control over data subject rights because HatiData has no access to the underlying data. |
| Right to erasure | Because all data is in the customer's S3 bucket, customers execute erasure directly. HatiData imposes no technical obstacles to deletion. |
| Breach notification | Customer notification within 72 hours of HatiData becoming aware of any incident. Post-Incident Review shared within 72 hours for P1/P2 events. |
| Data minimization | Only anonymized billing metrics are transmitted to HatiData infrastructure. No query content or personal data crosses VPC boundaries. |

HIPAA Alignment

HatiData supports HIPAA-covered entity deployments with a Business Associate Agreement (BAA). Key controls:

| HIPAA Requirement | Implementation |
| --- | --- |
| PHI access controls | RBAC + ABAC restrict which users and agents can query PHI tables. JIT (just-in-time) access grants provide time-bounded elevation for specific users without permanent role promotion. |
| PHI field masking | Column masking rules can redact PHI columns (e.g., patient_name, dob, diagnosis_code) for all roles except explicitly exempted ones. |
| Audit controls (§164.312(b)) | Hash-chained, append-only audit log records every query with user, agent, tables accessed, rows returned, columns masked, and policy verdicts. Retention is configurable (minimum 7 years). |
| Encryption (§164.312(a)(2)(iv)) | CMEK at rest, TLS 1.3 in transit, LUKS AES-256-XTS for local SSD cache. Encryption is mandatory — there is no option to disable it. |
| Data location | PHI never leaves the customer's VPC. HatiData infrastructure has no network path to customer data. |
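The JIT access-grant model described above can be sketched as expiry-bounded role elevation. This is a hypothetical illustration (class and function names are invented for this sketch): the grant expires on its own, so the user's permanent role assignment never changes.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

class JITGrant:
    """Time-bounded elevation: expires on its own, no permanent role change."""
    def __init__(self, user: str, role: str, minutes: int):
        self.user = user
        self.role = role
        self.expires_at = datetime.now(timezone.utc) + timedelta(minutes=minutes)

    def is_active(self, now: Optional[datetime] = None) -> bool:
        now = now or datetime.now(timezone.utc)
        return now < self.expires_at

def effective_roles(base_roles: set, grants: list, user: str) -> set:
    """Base RBAC roles plus any unexpired JIT elevation for this user."""
    elevated = {g.role for g in grants if g.user == user and g.is_active()}
    return base_roles | elevated
```

Evaluating expiry at query time (rather than revoking via a scheduled job) means a lapsed grant can never be exercised, even if cleanup is delayed.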

CISO FAQ

Q: Is HatiData SOC 2 Type II certified?
A: HatiData's architecture is designed for SOC 2 Type II. Because the data plane runs in the customer's VPC, the customer's own cloud provider certifications (AWS, GCP, or Azure) apply to the infrastructure layer. Contact security@hatidata.com for the current compliance posture and available attestation documents.

Q: Can HatiData employees access my data?
A: No. The query engine runs in your VPC under your IAM roles. HatiData has no credentials, no cross-account access, and no SSH or SSM access to customer instances or storage.

Q: Does HatiData perform penetration testing?
A: The architecture is designed for customer-initiated penetration testing. All components run in the customer's VPC, so customers can test their own deployment without coordination with HatiData.

Q: Who are HatiData's sub-processors?
A: Cloud infrastructure providers only — AWS, GCP, or Azure, depending on the selected deployment. No other third parties have access to customer data or metadata.

Q: What is the patching cadence for security vulnerabilities?
A: Critical (CVSS 9+): 24 hours. High (CVSS 7–8.9): 7 days. Medium (CVSS 4–6.9): 30 days. Low: next scheduled release. Updates are distributed as new AMIs or container images via the release channel.

Q: What security questionnaire format does HatiData support?
A: A pre-filled questionnaire in SIG Lite / CAIQ v4 format is available on request. Contact security@hatidata.com.

Q: Is a DPA available?
A: Yes. A standard Data Processing Agreement is available on request, covering GDPR Article 28 requirements. Custom DPA terms are available for Enterprise customers.
