
dbt Adapter

HatiData provides a native dbt adapter (dbt-hatidata) that lets you run dbt models directly against your agent data layer. Models are compiled by dbt and executed through HatiData's full query pipeline — including policy evaluation, audit logging, and SQL auto-transpilation. You can also use the standard dbt-postgres adapter since HatiData exposes a PostgreSQL wire-protocol interface.

Installation

Option A: dbt-hatidata Adapter

The dbt-hatidata adapter adds HatiData-specific features: automatic SQL transpilation hints, cost estimation in dbt logs, and agent-aware audit attribution.

pip install dbt-hatidata

Requirements: dbt-core >= 1.5.0, Python >= 3.9

Option B: dbt-postgres Adapter

If you prefer the standard PostgreSQL adapter:

pip install dbt-postgres

Both adapters use identical profiles.yml syntax. Switch between them by changing the type field.


profiles.yml Configuration

Add a HatiData output to your ~/.dbt/profiles.yml:

~/.dbt/profiles.yml
hatidata:
  target: dev
  outputs:
    dev:
      type: hatidata # or "postgres" for dbt-postgres adapter
      host: localhost
      port: 5439
      user: admin
      password: "{{ env_var('HATIDATA_API_KEY') }}"
      dbname: hatidata
      schema: public
      threads: 4
      connect_timeout: 30
      sslmode: disable # local dev — no SSL needed

    prod:
      type: hatidata
      host: "{{ env_var('HATIDATA_HOST') }}"
      port: 5439
      user: "{{ env_var('HATIDATA_USER') }}"
      password: "{{ env_var('HATIDATA_API_KEY') }}"
      dbname: hatidata
      schema: "{{ env_var('DBT_SCHEMA', 'analytics') }}"
      threads: 8
      connect_timeout: 30
      sslmode: require
      keepalives_idle: 0

Set environment variables:

export HATIDATA_API_KEY=hd_live_your_api_key
export HATIDATA_HOST=your-org.proxy.hatidata.com
export HATIDATA_USER=dbt-service-account

Project Setup

Initialize a new dbt project pointing to HatiData:

dbt init my_project
# When prompted for database type, enter: hatidata (or postgres)

Or configure an existing project's dbt_project.yml:

dbt_project.yml
name: my_project
version: 1.0.0
profile: hatidata

models:
  my_project:
    +materialized: view # views are created in HatiData's schema
    staging:
      +schema: staging
    marts:
      +schema: analytics
      +materialized: table
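
Note that by default dbt prefixes custom schemas with the target schema (e.g. `public_staging` rather than `staging`). This is standard dbt behavior, not HatiData-specific. If you want the schema names used literally, you can override dbt's built-in `generate_schema_name` macro; a minimal sketch (the file path is conventional, not required):

macros/generate_schema_name.sql

```sql
{% macro generate_schema_name(custom_schema_name, node) -%}
    {%- if custom_schema_name is none -%}
        {{ target.schema }}
    {%- else -%}
        {# Use the custom schema name as-is instead of prefixing it #}
        {{ custom_schema_name | trim }}
    {%- endif -%}
{%- endmacro %}
```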

Running Models

# Test connection
dbt debug

# Run all models
dbt run

# Run a specific model
dbt run --select orders_summary

# Run a model and all its downstream dependencies
dbt run --select orders_summary+

# Run models in a specific schema
dbt run --select staging.*

# Build (run + test)
dbt build

# Run tests only
dbt test

# Generate docs
dbt docs generate
dbt docs serve

Writing Models

dbt models compile to SQL that runs through HatiData's query pipeline. Standard SQL runs unchanged, and legacy dialect functions (NVL, IFF, DATEDIFF, DATEADD) are auto-transpiled:

models/staging/stg_orders.sql
-- HatiData auto-transpiles NVL, IFF, and DATEDIFF
SELECT
    id AS order_id,
    customer_id,
    NVL(discount_code, 'NONE') AS discount_code,
    IFF(total > 1000, 'high', 'standard') AS value_tier,
    DATEDIFF('day', created_at, shipped_at) AS days_to_ship,
    created_at
FROM {{ source('raw', 'orders') }}
WHERE created_at >= DATEADD('month', -3, CURRENT_DATE)

models/marts/customer_ltv.sql

WITH orders AS (
    SELECT * FROM {{ ref('stg_orders') }}
),

customers AS (
    SELECT * FROM {{ source('raw', 'customers') }}
)

SELECT
    c.customer_id,
    c.name,
    c.segment,
    COUNT(o.order_id) AS total_orders,
    SUM(o.total) AS lifetime_value,
    AVG(o.total) AS avg_order_value,
    MIN(o.created_at) AS first_order_at,
    MAX(o.created_at) AS last_order_at
FROM customers c
LEFT JOIN orders o USING (customer_id)
GROUP BY 1, 2, 3

Auto-Transpile Option

The native dbt-hatidata adapter exposes an auto_transpile connection option that controls SQL transpilation behavior:

~/.dbt/profiles.yml
hatidata:
  outputs:
    prod:
      type: hatidata
      host: your-org.proxy.hatidata.com
      port: 5439
      # ...
      auto_transpile: true # default: true
      transpile_dialect: snowflake # hint for transpiler: snowflake | bigquery | redshift

When auto_transpile: true, the HatiData proxy automatically detects and transpiles legacy dialect SQL. Set transpile_dialect to help the transpiler when queries use dialect-specific syntax that is ambiguous.
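
For example, with auto_transpile enabled, a Snowflake-style expression is rewritten into PostgreSQL-compatible SQL before execution. An illustrative sketch (the exact rewritten form the transpiler emits may differ):

```sql
-- Input (Snowflake-style):
SELECT NVL(discount_code, 'NONE') AS discount_code,
       IFF(total > 1000, 'high', 'standard') AS value_tier
FROM orders;

-- Approximate transpiled form (PostgreSQL-compatible):
SELECT COALESCE(discount_code, 'NONE') AS discount_code,
       CASE WHEN total > 1000 THEN 'high' ELSE 'standard' END AS value_tier
FROM orders;
```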


Source Declarations

Define raw tables as dbt sources:

models/staging/schema.yml
version: 2

sources:
  - name: raw
    database: hatidata
    schema: public
    tables:
      - name: orders
        description: "Raw orders table loaded by the ingestion pipeline"
        columns:
          - name: id
            description: "Primary key"
            tests:
              - unique
              - not_null
          - name: customer_id
            tests:
              - not_null
          - name: total
            tests:
              - not_null

models:
  - name: stg_orders
    description: "Cleaned and standardized orders"
    columns:
      - name: order_id
        tests:
          - unique
          - not_null

Testing with Local Mode

For local development and CI, run dbt against a local HatiData instance:

# Start local proxy (no cloud credentials needed)
hati dev

# Use the dev profile
dbt run --target dev
dbt test --target dev

Local mode runs entirely on your machine with no cloud dependencies. The same SQL models and tests run identically in local and production environments.

For CI pipelines:

.github/workflows/dbt-ci.yml
- name: Run dbt tests
  env:
    HATIDATA_API_KEY: ${{ secrets.HATIDATA_TEST_KEY }}
    HATIDATA_HOST: ci.proxy.hatidata.com
  run: |
    pip install dbt-hatidata
    dbt deps
    dbt run --target prod
    dbt test --target prod

Materializations

HatiData supports dbt's standard materializations:

| Materialization | Behavior |
| --- | --- |
| view | Creates a view in the data layer (default) |
| table | Creates a table via CREATE TABLE AS SELECT |
| incremental | Appends or updates rows based on a unique key or timestamp |
| ephemeral | Inlined as a CTE; no object is created in the data layer |

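An incremental model uses dbt's standard `is_incremental()` pattern to process only new rows on each run. A minimal sketch (the model name and columns are illustrative, building on the stg_orders model above):

models/marts/orders_incremental.sql

```sql
{{ config(
    materialized='incremental',
    unique_key='order_id'
) }}

SELECT
    order_id,
    customer_id,
    total,
    created_at
FROM {{ ref('stg_orders') }}

{% if is_incremental() %}
  -- On incremental runs, only pull rows newer than what's already loaded
  WHERE created_at > (SELECT MAX(created_at) FROM {{ this }})
{% endif %}
```

On the first run (or with `dbt run --full-refresh`), dbt builds the full table; subsequent runs apply only the filtered rows against the existing table.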
CREATE TABLE Support

HatiData supports CREATE TABLE AS SELECT for dbt table and incremental materializations. Standard BI tools using HatiData as a read-only query endpoint do not require this capability.

