Deployment & Monitoring
This guide covers HatiData's CI/CD pipeline, automated testing gates, and production health monitoring.
CI/CD Pipeline
Every push to main triggers the following stages:
- Lint — Rust `fmt --check` + `clippy` across all crates
- Rust tests — Full test suite run against 2 feature flag variants
- Dashboard build + lint — React production build and ESLint pass
- E2E Smoke tests — 23 Playwright tests against the dashboard with mocked APIs
- Docker build — Container images for proxy, control-plane, and dashboard
- Deploy to DevBox — Automated deploy to the dev environment
- Auto-deploy to Preprod — Promoted to preprod.hatidata.com on success
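The stage ordering above acts as a gate: each stage runs only if every previous stage succeeded. A minimal sketch of that gating logic (the stage names mirror the list above, but the runner callback and this structure are illustrative, not the actual CI configuration):

```python
# Illustrative sketch of the CI gate: stages run in order, and a failure
# stops the pipeline immediately, blocking all later stages.
STAGES = [
    "lint",             # rustfmt --check + clippy across all crates
    "rust-tests",       # full suite, 2 feature-flag variants
    "dashboard-build",  # React production build + ESLint
    "e2e-smoke",        # 23 Playwright tests against mocked APIs
    "docker-build",     # proxy, control-plane, dashboard images
    "deploy-devbox",
    "deploy-preprod",   # promoted to preprod.hatidata.com on success
]

def run_pipeline(run_stage):
    """run_stage(name) -> bool. Returns (completed stages, failed stage or None)."""
    completed = []
    for stage in STAGES:
        if not run_stage(stage):
            return completed, stage
        completed.append(stage)
    return completed, None
```

For example, a smoke-test failure leaves `docker-build` and both deploy stages unexecuted, which is exactly the blocking behaviour described in the E2E Smoke Gate section below.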
E2E Smoke Gate
23 Playwright tests exercise critical dashboard flows: login, navigation, SQL Editor execution, agent fleet rendering, memory search, and CoT replay verification. These tests run against the built dashboard with mocked API responses.
Deploys are blocked if any smoke test fails. The CI pipeline will not proceed to the Docker build or deploy stages until all 23 tests pass.
Post-Deploy Integration Tests
After every preprod deploy, 6 integration checks run against the live environment:
| Check | What it verifies |
|---|---|
| API key validation | Control Plane accepts a valid API key and rejects invalid ones |
| MCP Server | MCP endpoint responds and reports 20+ registered tools |
| Control Plane health | /health returns 200 with version info |
| Dashboard accessible | Dashboard loads and returns a valid HTML document |
| Clerk webhook endpoint | Webhook URL is reachable and returns the expected signature challenge |
| Data integrity | DuckDB table count matches or exceeds the pre-deploy snapshot |
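Two of the checks above can be sketched as small probe functions. This is a hedged illustration only: `/health` comes from the table, but the `/mcp/tools` path, the response shapes, and the `get` callback are assumptions, not the real API:

```python
# Hypothetical shapes for two post-deploy checks; only /health is documented,
# the MCP path and response fields are illustrative assumptions.

def check_control_plane_health(get):
    """get(path) -> (status_code, body_dict). Passes if /health is 200 with version info."""
    status, body = get("/health")
    return status == 200 and "version" in body

def check_mcp_tools(get):
    """Passes if the MCP endpoint responds and reports 20+ registered tools."""
    status, body = get("/mcp/tools")  # path is an assumption
    return status == 200 and len(body.get("tools", [])) >= 20
```

In practice each check would run against the live preprod environment over HTTP; the `get` callback is injected here so the logic can be exercised without a network.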
Health Monitor
A scheduled health check runs every 12 hours via cron. It verifies:
- Control Plane health endpoint responds within 5 seconds
- Dashboard returns a 200 status code
- Clerk JWKS endpoint is reachable (JWT verification depends on this)
- Clerk webhook endpoint responds correctly
On failure, an email alert is sent via SendGrid to the configured operations contact.
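The monitor's probe-and-alert loop can be sketched as follows. The cron expression in the comment and the target URLs are placeholders; the 5-second timeout matches the Control Plane requirement above:

```python
# Hypothetical health-monitor sketch. A cron entry such as
#   0 */12 * * *  /usr/local/bin/health-monitor
# would run this every 12 hours; the URLs below are placeholders.
import urllib.request
import urllib.error

def probe(url, timeout=5.0):
    """Return True if the URL answers 200 within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

def run_monitor(probe_fn, targets):
    """targets: list of (name, url). Return the names that failed, for alerting."""
    return [name for name, url in targets if not probe_fn(url)]
```

A non-empty return value from `run_monitor` would trigger the SendGrid email alert to the operations contact; the probe is injected so the failure path can be tested without live endpoints.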
Data Integrity
Data integrity is verified on every deploy using a before/after table count comparison:
- Pre-deploy: Snapshot the current DuckDB table count.
- Deploy: Run the standard deploy pipeline.
- Post-deploy: Query the table count again and compare.
If the post-deploy count is less than the pre-deploy count, the deploy is marked FAILED with a DATA LOSS DETECTED warning. The operations team is alerted immediately and the deploy is flagged for manual investigation.
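The comparison rule above is simple enough to state directly in code. A minimal sketch, assuming the only inputs are the two table counts:

```python
# Sketch of the before/after table-count comparison described above:
# a post-deploy count below the pre-deploy snapshot marks the deploy FAILED.
def verify_data_integrity(pre_deploy_count, post_deploy_count):
    """Return (ok, message) for the deploy status report."""
    if post_deploy_count < pre_deploy_count:
        return False, (
            f"DATA LOSS DETECTED: table count dropped "
            f"{pre_deploy_count} -> {post_deploy_count}"
        )
    return True, f"OK: {post_deploy_count} tables (was {pre_deploy_count})"
```

Note that the check only fails on a *decrease*: deploys that create new tables (count increases) pass, matching the "matches or exceeds" wording in the integration-check table.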