SOC 2 Architecture
HatiData's architecture is designed for SOC 2 Type II from the ground up. Because the data plane runs inside your VPC, the most sensitive controls (data access, encryption key custody, and network isolation) are owned and operated by you, not by HatiData. This page maps the Trust Service Criteria to concrete architectural controls, covers GDPR and HIPAA alignment, and answers common CISO questions.
Trust Service Criteria Mapping
CC6 — Security (Logical and Physical Access)
| Control | HatiData Implementation |
|---|---|
| CC6.1 — Least-privilege access | Six RBAC roles (ServiceAccount, Developer, Analyst, Auditor, Admin, Owner), each scoped to the minimum permissions required. ServiceAccount cannot perform admin actions; Auditor is limited to read-only access to the audit log. |
| CC6.2 — User provisioning and de-provisioning | Users are managed via the control plane API. De-provisioned users lose access immediately. All provisioning events are recorded in the IAM audit trail with before/after values. |
| CC6.3 — Authentication mechanisms | JWT (RS256, 1-hour expiry), API keys (Argon2id-hashed, per-environment scope), federated auth (AWS STS, Azure AD, Google Cloud Identity). MFA enforcement is configurable organization-wide. |
| CC6.6 — Transmission encryption | TLS 1.3 only, with no downgrade negotiation. Both port 5439 (Postgres wire protocol) and the control plane API accept only TLS 1.3 connections. |
| CC6.7 — Encryption at rest | AES-256 via CMEK (AWS KMS, GCP Cloud KMS, or Azure Key Vault). Local SSD query cache is encrypted with LUKS AES-256-XTS, keyed from the customer KMS at instance boot. |
| CC6.8 — Change management | Infrastructure is deployed via versioned Terraform modules. Policy changes are recorded in the IAM audit trail and are hash-chained for tamper detection. |
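
The hash-chained audit trail in CC6.8 can be illustrated with a minimal sketch. It assumes SHA-256 over a canonical JSON encoding and an all-zero genesis value; the record layout and the function names (`entry_hash`, `append`, `verify`) are hypothetical and are not HatiData's actual format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the chain (hypothetical)

def entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical encoding of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

def append(chain: list, event: dict) -> None:
    """Append an event, linking it to the hash of the last entry."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev": prev, "hash": entry_hash(prev, event)})

def verify(chain: list) -> int:
    """Return the index of the first inconsistent entry, or -1 if the chain is intact."""
    prev = GENESIS
    for i, rec in enumerate(chain):
        if rec["prev"] != prev or rec["hash"] != entry_hash(prev, rec["event"]):
            return i
        prev = rec["hash"]
    return -1

log = []
append(log, {"actor": "admin@example.com", "action": "policy.create", "before": None, "after": "v1"})
append(log, {"actor": "admin@example.com", "action": "policy.update", "before": "v1", "after": "v2"})
append(log, {"actor": "owner@example.com", "action": "user.deprovision", "before": "active", "after": "disabled"})

intact = verify(log)                   # chain is untouched at this point
log[1]["event"]["after"] = "allow-all" # simulate tampering with a stored entry
first_bad = verify(log)                # verification now fails at the edited entry
print(intact, first_bad)
```

Because each hash covers the previous one, editing any stored entry invalidates that entry and, if its hash is recomputed, every entry after it.
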
CC7 — Availability
| Control | HatiData Implementation |
|---|---|
| CC7.1 — System monitoring | Prometheus metrics exported on port 9090. Default alerts for latency, error rate, cache saturation, and query quota utilization. |
| CC7.2 — Capacity management | Auto Scaling Groups scale between configured min/max instance counts based on CPU and connection metrics. |
| CC7.5 — Backup and recovery | Customer data lives in S3 (designed for 11 nines, 99.999999999%, durability). The proxy is stateless; the Auto Scaling Group replaces failed instances within 5 minutes. The local SSD cache rebuilds from S3 on first access after replacement. |
CC8 — Processing Integrity
| Control | HatiData Implementation |
|---|---|
| CC8.1 — Inputs processed completely and accurately | The multi-stage query pipeline enforces: policy check → cost estimate → quota check → row filter → transpile → execute → column mask → meter → audit. Every step either succeeds or returns a structured error — partial results are never returned. |
| RLS injection | Row-level security WHERE clauses are injected at the SQL AST level before execution. Injection is mandatory and cannot be bypassed by query syntax. |
| Quota enforcement | Monthly credit limits are checked before execution. Queries that would exceed the quota are rejected with a QUOTA_EXCEEDED error before any compute is consumed. |
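
The all-or-nothing behavior of the pipeline, with the quota check running before execution, can be sketched as follows. Only four of the stages are modeled; the `StageError` class, the context dictionary keys, and the cost model of one credit per 1,000 rows scanned are assumptions made for illustration.

```python
class StageError(Exception):
    """Structured error raised by a pipeline stage (hypothetical shape)."""
    def __init__(self, stage, code, message):
        super().__init__(message)
        self.stage, self.code, self.message = stage, code, message

def policy_check(q):
    if q["role"] not in q["allowed_roles"]:
        raise StageError("policy_check", "POLICY_DENIED", "role may not run this query")

def cost_estimate(q):
    # Assumed cost model: one credit per 1,000 rows scanned.
    q["estimated_credits"] = q["rows_to_scan"] / 1000

def quota_check(q):
    if q["credits_used"] + q["estimated_credits"] > q["monthly_limit"]:
        raise StageError("quota_check", "QUOTA_EXCEEDED", "monthly credit limit would be exceeded")

def execute(q):
    q["executed"] = True
    q["rows"] = [{"id": 1}, {"id": 2}]  # stand-in for real query results

STAGES = [policy_check, cost_estimate, quota_check, execute]

def run(query):
    """Run every stage in order; return full results or a structured error, never a partial result."""
    q = dict(query)
    try:
        for stage in STAGES:
            stage(q)
    except StageError as err:
        return {"ok": False, "executed": q.get("executed", False),
                "error": {"stage": err.stage, "code": err.code, "message": err.message}}
    return {"ok": True, "rows": q["rows"]}

base = {"role": "Analyst", "allowed_roles": {"Analyst", "Admin"},
        "rows_to_scan": 5000, "monthly_limit": 100}

over_quota = run({**base, "credits_used": 98})   # 98 + 5 credits exceeds the limit of 100
within_quota = run({**base, "credits_used": 10})
print(over_quota["error"]["code"], within_quota["ok"])
```

In the over-quota case the `execute` stage is never reached, which mirrors the claim that rejected queries consume no compute.
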
CC9 — Confidentiality
| Control | HatiData Implementation |
|---|---|
| CC9.1 — Confidential information protection | Column masking operates at the proxy layer after query execution. Four masking functions: full redaction (***), partial redaction (last N chars), cryptographic hash, and NULL replacement. Role-based exemptions are policy-controlled. |
| CC9.2 — Sub-processor management | Cloud infrastructure providers only (AWS, GCP, or Azure). HatiData has no access to customer data. No additional sub-processors. |
| Tenant isolation | All queries are automatically scoped to the authenticated organization via injected WHERE clauses. Cross-tenant JOIN queries are detected and blocked at the AST level. |
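
The four masking functions named in CC9.1 can be sketched in a few lines. This is an illustration under stated assumptions: "partial redaction" is taken to mean that only the last N characters stay visible, the hash is SHA-256, and the rule names, column names, and `apply_masks` helper are hypothetical.

```python
import hashlib

def mask_full(value):
    """Full redaction: replace the value with a fixed marker."""
    return "***"

def mask_partial(value, keep_last=4):
    """Partial redaction: keep only the last N characters visible (assumed interpretation)."""
    text = str(value)
    if keep_last <= 0 or len(text) <= keep_last:
        return "*" * len(text)
    return "*" * (len(text) - keep_last) + text[-keep_last:]

def mask_hash(value):
    """Cryptographic hash: SHA-256 hex digest of the value (assumed algorithm)."""
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()

def mask_null(value):
    """NULL replacement."""
    return None

MASKS = {"full": mask_full, "partial": mask_partial, "hash": mask_hash, "null": mask_null}

def apply_masks(row, rules, role, exempt_roles=frozenset()):
    """Mask a result row after execution unless the caller's role is exempt."""
    if role in exempt_roles:
        return dict(row)
    return {col: MASKS[rules[col]](val) if col in rules else val
            for col, val in row.items()}

row = {"patient_name": "Ada Lovelace", "ssn": "123-45-6789",
       "mrn": "MRN-001", "dob": "1815-12-10", "visits": 3}
rules = {"patient_name": "full", "ssn": "partial", "mrn": "hash", "dob": "null"}

masked = apply_masks(row, rules, role="Analyst", exempt_roles={"Owner"})
unmasked = apply_masks(row, rules, role="Owner", exempt_roles={"Owner"})
print(masked["patient_name"], masked["ssn"], masked["dob"], masked["visits"])
```

Columns without a rule pass through unchanged, and an exempt role receives the original row.
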
P1–P8 — Privacy
| Control | HatiData Implementation |
|---|---|
| P1 — Privacy notice | Data Processing Agreement (DPA) defines HatiData's role as a data processor. Available on request. |
| P3 — Collection limitation | HatiData collects only anonymized usage metrics (query count, latency distributions, credit consumption). No query content, results, or schema data is collected. |
| P6 — Use, retention, and disposal | At contract termination, compute resources are destroyed, local SSD caches are cryptographically erased with cryptsetup luksErase, and no data is retained. Customer data remains in the customer's S3 bucket. |
| P8 — Monitoring and enforcement | PII patterns (email, SSN, credit card, phone) are automatically detected and redacted from audit log SQL text before writing to storage. Redaction uses precompiled regular expressions and adds sub-millisecond latency. |
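
The PII redaction described under P8 can be approximated with precompiled regular expressions. The patterns below are simplified, US-centric assumptions rather than HatiData's actual rules, and the `[REDACTED:…]` placeholder format is hypothetical.

```python
import re

# Simplified illustrative patterns; production rules would need to be broader.
PII_PATTERNS = [
    ("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("CREDIT_CARD", re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b")),
    ("PHONE", re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")),
]

def redact(sql_text: str) -> str:
    """Replace each PII match with a labeled placeholder before the text is logged."""
    for label, pattern in PII_PATTERNS:
        sql_text = pattern.sub(f"[REDACTED:{label}]", sql_text)
    return sql_text

sql = ("SELECT * FROM users WHERE email = 'jane.doe@example.com' "
       "AND ssn = '123-45-6789' AND phone = '415-555-0100'")
redacted = redact(sql)
print(redacted)
```

Compiling the patterns once at import time keeps the per-statement cost to a handful of linear scans over the SQL text.
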
GDPR Alignment
HatiData's architecture maps to GDPR's core requirements as follows:
| GDPR Requirement | Implementation |
|---|---|
| Data residency | Region is selected at provisioning time. All compute, caching, and audit storage remain within that region. No cross-region data transfer occurs unless explicitly configured (Enterprise only). |
| Data processor role | HatiData acts as a data processor per the DPA. Customers (controllers) retain full control over data subject rights because HatiData has no access to the underlying data. |
| Right to erasure | Because all data is in the customer's S3 bucket, customers execute erasure directly. HatiData imposes no technical obstacles to deletion. |
| Breach notification | Customer notification within 72 hours of HatiData becoming aware of any incident. Post-Incident Review shared within 72 hours for P1/P2 events. |
| Data minimization | Only anonymized billing metrics are transmitted to HatiData infrastructure. No query content or personal data crosses VPC boundaries. |
HIPAA Alignment
HatiData supports HIPAA-covered entity deployments with a Business Associate Agreement (BAA). Key controls:
| HIPAA Requirement | Implementation |
|---|---|
| PHI access controls | RBAC + ABAC restrict which users and agents can query PHI tables. JIT (just-in-time) access grants provide time-bounded elevation for specific users without permanent role promotion. |
| PHI field masking | Column masking rules can redact PHI columns (e.g., patient_name, dob, diagnosis_code) for all roles except explicitly exempted ones. |
| Audit controls (§164.312(b)) | Hash-chained, append-only audit log records every query with user, agent, tables accessed, rows returned, columns masked, and policy verdicts. Retention is configurable (minimum 7 years). |
| Encryption (§164.312(a)(2)(iv)) | CMEK at rest, TLS 1.3 in transit, LUKS AES-256-XTS for local SSD cache. Encryption is mandatory — there is no option to disable it. |
| Data location | PHI never leaves the customer's VPC. HatiData infrastructure has no network path to customer data. |
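
The just-in-time access grants mentioned under PHI access controls amount to a time-bounded exception on top of a user's standing permissions. A minimal sketch follows; the `JitGrant` type, the `grant` and `can_query` helpers, and the user and table names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass(frozen=True)
class JitGrant:
    """A temporary grant on one table for one user (hypothetical shape)."""
    user: str
    table: str
    expires_at: datetime

def grant(user, table, minutes, now):
    """Issue a grant that expires a fixed number of minutes after `now`."""
    return JitGrant(user, table, now + timedelta(minutes=minutes))

def can_query(user, table, standing_tables, grants, now):
    """Allow access via the standing role, or via an unexpired grant for this user and table."""
    if table in standing_tables:
        return True
    return any(g.user == user and g.table == table and now < g.expires_at
               for g in grants)

t0 = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
grants = [grant("dr.lee", "patients", minutes=30, now=t0)]
standing = {"appointments"}  # tables the user's permanent role already covers

during = can_query("dr.lee", "patients", standing, grants, t0 + timedelta(minutes=10))
after = can_query("dr.lee", "patients", standing, grants, t0 + timedelta(minutes=31))
print(during, after)
```

The user's permanent role never changes; access to the PHI table simply stops once the grant expires.
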
CISO FAQ
Q: Is HatiData SOC 2 Type II certified? A: HatiData's architecture is designed for SOC 2 Type II. Because the data plane runs in the customer's VPC, the customer's own cloud provider certifications (AWS, GCP, or Azure) apply to the infrastructure layer. Contact security@hatidata.com for the current compliance posture and available attestation documents.
Q: Can HatiData employees access my data? A: No. The query engine runs in your VPC under your IAM roles. HatiData has no credentials, no cross-account access, and no SSH or SSM access to customer instances or storage.
Q: Does HatiData perform penetration testing? A: The architecture is designed for customer-initiated penetration testing. All components run in the customer's VPC, so customers can test their own deployment without coordination with HatiData.
Q: Who are HatiData's sub-processors? A: Cloud infrastructure providers only — AWS, GCP, or Azure, depending on the selected deployment. No other third parties have access to customer data or metadata.
Q: What is the patching cadence for security vulnerabilities? A: Critical (CVSS 9+): 24 hours. High (CVSS 7–8.9): 7 days. Medium (CVSS 4–6.9): 30 days. Low: next scheduled release. Updates are distributed as new AMIs or container images via the release channel.
Q: What security questionnaire format does HatiData support? A: A pre-filled questionnaire in SIG Lite / CAIQ v4 format is available on request. Contact security@hatidata.com.
Q: Is a DPA available? A: Yes. A standard Data Processing Agreement is available on request, covering GDPR Article 28 requirements. Custom DPA terms are available for Enterprise customers.
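
The patching cadence stated in the FAQ above can be expressed as a small lookup, which may be useful when checking vendor SLAs in a vulnerability tracker. The function name `patch_deadline` is hypothetical, and returning `None` for Low findings is an assumption that stands in for "next scheduled release".

```python
from datetime import timedelta
from typing import Optional

def patch_deadline(cvss: float) -> Optional[timedelta]:
    """Map a CVSS base score to the stated remediation window."""
    if not 0.0 <= cvss <= 10.0:
        raise ValueError("CVSS base scores range from 0.0 to 10.0")
    if cvss >= 9.0:
        return timedelta(hours=24)   # Critical
    if cvss >= 7.0:
        return timedelta(days=7)     # High
    if cvss >= 4.0:
        return timedelta(days=30)    # Medium
    return None                      # Low: next scheduled release, no fixed window

for score in (9.8, 7.5, 5.0, 2.0):
    print(score, patch_deadline(score))
```
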
Related Concepts
- Audit Guarantees — Hash-chained log details, schema, and export formats
- CMEK & Encryption — Key management, TLS 1.3, and local SSD LUKS
- Data Residency — Region selection and data sovereignty
- Security Model — RBAC roles, ABAC policy evaluation, column masking, and row-level security