
SOC 2 Architecture

HatiData's architecture is designed for SOC 2 Type II from the ground up. Because the data plane runs inside your VPC, the most sensitive controls — data access, encryption key custody, and network isolation — are owned and operated by you, not by HatiData. This page maps every Trust Service Criterion to a concrete architectural control and answers common CISO questions.


Trust Service Criteria Mapping

CC6 — Security (Logical and Physical Access)

| Control | HatiData Implementation |
| --- | --- |
| CC6.1 — Least-privilege access | 6 RBAC roles (ServiceAccount, Developer, Analyst, Auditor, Admin, Owner). Every role is scoped to the minimum permissions required. ServiceAccount cannot perform admin actions; Auditor has read-only audit log access only. |
| CC6.2 — User provisioning and de-provisioning | Users are managed via the control plane API. De-provisioned users lose access immediately. All provisioning events are recorded in the IAM audit trail with before/after values. |
| CC6.3 — Authentication mechanisms | JWT (RS256, 1-hour expiry), API keys (Argon2id-hashed, per-environment scope), federated auth (AWS STS, Azure AD, Google Cloud Identity). MFA enforcement is configurable organization-wide. |
| CC6.6 — Transmission encryption | TLS 1.3 only. No downgrade negotiation. Port 5439 (Postgres wire protocol) and the control plane API both enforce TLS 1.3 termination. |
| CC6.7 — Encryption at rest | AES-256 via CMEK (AWS KMS, GCP Cloud KMS, or Azure Key Vault). The local SSD query cache is encrypted with LUKS AES-256-XTS, keyed from the customer KMS at instance boot. |
| CC6.8 — Change management | Infrastructure is deployed via versioned Terraform modules. Policy changes are recorded in the IAM audit trail and are hash-chained for tamper detection. |
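The hash-chaining used for tamper detection (CC6.8, and the audit log more broadly) can be illustrated with a minimal sketch. This is an assumption-laden simplification, not HatiData's implementation: each entry's hash covers both the canonical record and the previous entry's hash, so editing any historical record invalidates every hash that follows it.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry

def chain_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with the canonical record."""
    payload = prev_hash + json.dumps(record, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def append(log: list, record: dict) -> None:
    """Append a record, linking it to the tail of the chain."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "hash": chain_hash(prev, record)})

def verify(log: list) -> bool:
    """Recompute every hash in order; any edited entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["hash"] != chain_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash depends on its predecessor, an attacker who modifies one audit entry would have to rewrite every subsequent entry to hide it, which an out-of-band copy of the latest hash defeats.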

CC7 — Availability

| Control | HatiData Implementation |
| --- | --- |
| CC7.1 — System monitoring | Prometheus metrics exported on port 9090. Default alerts for latency, error rate, cache saturation, and query quota utilization. |
| CC7.2 — Capacity management | Auto Scaling Groups scale between configured min/max instance counts based on CPU and connection metrics. |
| CC7.5 — Backup and recovery | Customer data lives in S3 (11 nines durability). The proxy is stateless — the Auto Scaling Group replaces failed instances within 5 minutes. The local SSD cache rebuilds from S3 on first access after replacement. |

CC8 — Processing Integrity

| Control | HatiData Implementation |
| --- | --- |
| CC8.1 — Inputs processed completely and accurately | The multi-stage query pipeline enforces: policy check → cost estimate → quota check → row filter → transpile → execute → column mask → meter → audit. Every step either succeeds or returns a structured error — partial results are never returned. |
| RLS injection | Row-level security WHERE clauses are injected at the SQL AST level before execution. Injection is mandatory and cannot be bypassed by query syntax. |
| Quota enforcement | Monthly credit limits are checked before execution. Queries that would exceed the quota are rejected with a QUOTA_EXCEEDED error before any compute is consumed. |
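The quota-check step can be sketched as a pre-execution gate. This is an illustrative simplification under assumed names (`QuotaExceeded`, `check_quota` are hypothetical, not HatiData's API): the cost estimate alone decides admission, so a rejected query consumes no compute.

```python
class QuotaExceeded(Exception):
    """Illustrative stand-in for the QUOTA_EXCEEDED error returned to clients."""

def check_quota(used_credits: float, estimated_cost: float, monthly_limit: float) -> None:
    """Raise before execution if the estimate would push usage past the limit."""
    if used_credits + estimated_cost > monthly_limit:
        raise QuotaExceeded(
            f"query estimated at {estimated_cost} credits would exceed "
            f"monthly limit {monthly_limit} (used: {used_credits})"
        )
    # Only after this check does the pipeline proceed to transpile and execute.
```

The key design property is ordering: the check runs after cost estimation but before any warehouse compute is dispatched.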

CC9 — Confidentiality

| Control | HatiData Implementation |
| --- | --- |
| CC9.1 — Confidential information protection | Column masking operates at the proxy layer after query execution. Four masking functions: full redaction (***), partial redaction (last N chars), cryptographic hash, and NULL replacement. Role-based exemptions are policy-controlled. |
| CC9.2 — Sub-processor management | Cloud infrastructure providers only (AWS, GCP, or Azure). HatiData has no access to customer data. No additional sub-processors. |
| Tenant isolation | All queries are automatically scoped to the authenticated organization via injected WHERE clauses. Cross-tenant JOIN queries are detected and blocked at the AST level. |
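The four masking functions named under CC9.1 can be sketched in a few lines. This is a minimal illustration of the semantics, not HatiData's code (function names and the SHA-256 choice for the cryptographic hash are assumptions):

```python
import hashlib
from typing import Optional

def mask_full(value: str) -> str:
    """Full redaction: every value becomes the same opaque token."""
    return "***"

def mask_partial(value: str, keep_last: int = 4) -> str:
    """Partial redaction: preserve only the last N characters."""
    if len(value) <= keep_last:
        return "*" * len(value)
    return "*" * (len(value) - keep_last) + value[-keep_last:]

def mask_hash(value: str) -> str:
    """Cryptographic hash: stable token that supports joins without exposure."""
    return hashlib.sha256(value.encode()).hexdigest()

def mask_null(value: str) -> Optional[str]:
    """NULL replacement: the column is present but carries no information."""
    return None
```

Each function trades off differently: hashing preserves equality (the same input always maps to the same token, so GROUP BY and JOIN still work), while full redaction and NULL replacement destroy it.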

P1–P8 — Privacy

| Control | HatiData Implementation |
| --- | --- |
| P1 — Privacy notice | Data Processing Agreement (DPA) defines HatiData's role as a data processor. Available on request. |
| P3 — Collection limitation | HatiData collects only anonymized usage metrics (query count, latency distributions, credit consumption). No query content, results, or schema data is collected. |
| P6 — Use, retention, and disposal | At contract termination, compute resources are destroyed, local SSD caches are cryptographically erased with cryptsetup luksErase, and no data is retained. Customer data remains in the customer's S3 bucket. |
| P8 — Monitoring and enforcement | PII patterns (email, SSN, credit card, phone) are automatically detected and redacted from audit log SQL text before writing to storage. Redaction uses compiled regular expressions for sub-millisecond throughput. |
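The redact-before-write step under P8 can be sketched with compiled regular expressions. The patterns below are simplified placeholders for illustration — a production deployment would use stricter, well-tested variants — but the structure (compile once, substitute a fixed token before the audit write) matches the description above:

```python
import re

# Simplified illustrative patterns; not HatiData's actual pattern set.
PII_PATTERNS = [
    re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),  # email
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),                           # SSN
    re.compile(r"\b(?:\d[ -]?){13,16}\b"),                          # credit card
    re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),                     # phone
]

def redact_sql(sql_text: str) -> str:
    """Replace any PII match with a fixed token before the audit log write."""
    for pattern in PII_PATTERNS:
        sql_text = pattern.sub("[REDACTED]", sql_text)
    return sql_text
```

Compiling the patterns once at module load, rather than per query, is what keeps redaction in the sub-millisecond range at audit-write time.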

GDPR Alignment

HatiData's architecture maps to GDPR's core requirements as follows:

| GDPR Requirement | Implementation |
| --- | --- |
| Data residency | Region is selected at provisioning time. All compute, caching, and audit storage remain within that region. No cross-region data transfer occurs unless explicitly configured (Enterprise only). |
| Data processor role | HatiData acts as a data processor per the DPA. Customers (controllers) retain full control over data subject rights because HatiData has no access to the underlying data. |
| Right to erasure | Because all data is in the customer's S3 bucket, customers execute erasure directly. HatiData imposes no technical obstacles to deletion. |
| Breach notification | Customer notification within 72 hours of HatiData becoming aware of any incident. Post-Incident Review shared within 72 hours for P1/P2 events. |
| Data minimization | Only anonymized billing metrics are transmitted to HatiData infrastructure. No query content or personal data crosses VPC boundaries. |

HIPAA Alignment

HatiData supports HIPAA-covered entity deployments with a Business Associate Agreement (BAA). Key controls:

| HIPAA Requirement | Implementation |
| --- | --- |
| PHI access controls | RBAC + ABAC restrict which users and agents can query PHI tables. JIT (just-in-time) access grants provide time-bounded elevation for specific users without permanent role promotion. |
| PHI field masking | Column masking rules can redact PHI columns (e.g., patient_name, dob, diagnosis_code) for all roles except explicitly exempted ones. |
| Audit controls (§164.312(b)) | Hash-chained, append-only audit log records every query with user, agent, tables accessed, rows returned, columns masked, and policy verdicts. Retention is configurable (minimum 7 years). |
| Encryption (§164.312(a)(2)(iv)) | CMEK at rest, TLS 1.3 in transit, LUKS AES-256-XTS for local SSD cache. Encryption is mandatory — there is no option to disable it. |
| Data location | PHI never leaves the customer's VPC. HatiData infrastructure has no network path to customer data. |
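The JIT access-grant model described above can be sketched as expiry-bounded role elevation. This is a hypothetical illustration (class and function names are invented for this sketch): the grant expires on its own, so the user's permanent role assignment never changes.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

class JITGrant:
    """Time-bounded elevation: expires on its own, no permanent role change."""
    def __init__(self, user: str, role: str, minutes: int):
        self.user = user
        self.role = role
        self.expires_at = datetime.now(timezone.utc) + timedelta(minutes=minutes)

    def is_active(self, now: Optional[datetime] = None) -> bool:
        now = now or datetime.now(timezone.utc)
        return now < self.expires_at

def effective_roles(base_roles: set, grants: list, user: str) -> set:
    """Base RBAC roles plus any unexpired JIT elevation for this user."""
    elevated = {g.role for g in grants if g.user == user and g.is_active()}
    return base_roles | elevated
```

Evaluating expiry at query time (rather than revoking via a scheduled job) means a lapsed grant can never be exercised, even if cleanup is delayed.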

CISO FAQ

Q: Is HatiData SOC 2 Type II certified?
A: HatiData's architecture is designed for SOC 2 Type II. Because the data plane runs in the customer's VPC, the customer's own cloud provider certifications (AWS, GCP, or Azure) apply to the infrastructure layer. Contact security@hatidata.com for the current compliance posture and available attestation documents.

Q: Can HatiData employees access my data?
A: No. The query engine runs in your VPC under your IAM roles. HatiData has no credentials, no cross-account access, and no SSH or SSM access to customer instances or storage.

Q: Does HatiData perform penetration testing?
A: The architecture is designed for customer-initiated penetration testing. All components run in the customer's VPC, so customers can test their own deployment without coordination with HatiData.

Q: Who are HatiData's sub-processors?
A: Cloud infrastructure providers only — AWS, GCP, or Azure, depending on the selected deployment. No other third parties have access to customer data or metadata.

Q: What is the patching cadence for security vulnerabilities?
A: Critical (CVSS 9+): 24 hours. High (CVSS 7–8.9): 7 days. Medium (CVSS 4–6.9): 30 days. Low: next scheduled release. Updates are distributed as new AMIs or container images via the release channel.

Q: What security questionnaire format does HatiData support?
A: A pre-filled questionnaire in SIG Lite / CAIQ v4 format is available on request. Contact security@hatidata.com.

Q: Is a DPA available?
A: Yes. A standard Data Processing Agreement is available on request, covering GDPR Article 28 requirements. Custom DPA terms are available for Enterprise customers.
