dbt Adapter
The dbt-hatidata adapter lets you run dbt models against HatiData. It extends dbt's Postgres adapter, connects over the standard PostgreSQL wire protocol, and automatically transpiles Snowflake SQL in your models.
Installation
pip install dbt-hatidata
Requirements: Python 3.9+, dbt-core 1.7+.
Configuration
profiles.yml
Add a HatiData target to your dbt profiles:
my_project:
  target: dev
  outputs:
    dev:
      type: hatidata
      host: "localhost"  # Or your-org.proxy.hatidata.com
      port: 5439
      user: "analyst"
      password: "{{ env_var('HATIDATA_API_KEY') }}"
      database: iceberg_catalog
      schema: analytics
      environment: development
      api_key: "{{ env_var('HATIDATA_API_KEY') }}"
      auto_transpile: true  # Auto-convert Snowflake SQL
      threads: 4
      connect_timeout: 30
    prod:
      type: hatidata
      host: "{{ env_var('HATIDATA_PROD_HOST') }}"
      port: 5439
      user: "{{ env_var('HATIDATA_PROD_USER') }}"
      password: "{{ env_var('HATIDATA_PROD_API_KEY') }}"
      database: iceberg_catalog
      schema: analytics
      environment: production
      api_key: "{{ env_var('HATIDATA_PROD_API_KEY') }}"
      auto_transpile: true
      threads: 8
      connect_timeout: 30
Connection Parameters
| Parameter | Required | Default | Description |
|---|---|---|---|
| type | Yes | -- | Must be hatidata |
| host | Yes | -- | Proxy hostname |
| port | No | 5439 | Proxy port |
| user | Yes | -- | Username |
| password | Yes | -- | Password or API key |
| database | Yes | -- | Database name |
| schema | Yes | -- | Default schema |
| environment | No | production | HatiData environment name |
| api_key | No | -- | HatiData API key (for agent identification) |
| auto_transpile | No | true | Enable Snowflake SQL transpilation |
| threads | No | 4 | Number of parallel threads |
| connect_timeout | No | 30 | Connection timeout in seconds |
Running dbt Models
Once configured, use dbt as usual:
# Test connection
dbt debug
# Run all models
dbt run
# Run specific model
dbt run --select my_model
# Test
dbt test
# Generate docs
dbt docs generate
dbt docs serve
Example Model
{{ config(materialized='table') }}

SELECT
    customer_id,
    NVL(name, 'Unknown') AS customer_name,              -- Snowflake NVL -> COALESCE
    IFF(revenue > 10000, 'enterprise', 'smb') AS tier,  -- Snowflake IFF -> IF
    DATEDIFF(day, first_order, CURRENT_DATE) AS days_since_first_order
FROM {{ ref('raw_customers') }}
WHERE status = 'active'
With auto_transpile: true, HatiData automatically converts Snowflake functions (NVL, IFF, DATEDIFF) to their DuckDB equivalents. See the Function Reference for the complete list of supported function mappings.
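For illustration, the query that reaches the DuckDB engine is roughly equivalent to the following. The exact output of the transpiler may differ, and the resolved table name shown here is only an example:

```sql
-- Illustrative DuckDB-dialect equivalent of the model above
-- (exact transpiler output and resolved relation names may differ)
SELECT
    customer_id,
    COALESCE(name, 'Unknown') AS customer_name,            -- NVL -> COALESCE
    IF(revenue > 10000, 'enterprise', 'smb') AS tier,      -- IFF -> IF
    DATE_DIFF('day', first_order, CURRENT_DATE) AS days_since_first_order
FROM analytics.raw_customers                               -- resolved ref()
WHERE status = 'active'
```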
Supported Materializations
| Materialization | Status |
|---|---|
| table | Fully supported |
| view | Fully supported |
| incremental | Supported (append and merge strategies) |
| ephemeral | Fully supported |
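Materializations are selected with standard dbt config syntax. For example, a sketch of an ephemeral model (model and column names are hypothetical), which dbt inlines as a CTE into downstream models instead of creating a relation:

```sql
-- Hypothetical intermediate model materialized as ephemeral:
-- dbt interpolates it as a CTE in models that ref() it
{{ config(materialized='ephemeral') }}

SELECT
    customer_id,
    SUM(total_amount) AS lifetime_revenue
FROM {{ ref('raw_orders') }}
GROUP BY customer_id
```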
Custom Macros
The adapter includes HatiData-specific macro overrides:
| Macro | Description |
|---|---|
| datediff | Maps to DuckDB DATE_DIFF |
| dateadd | Maps to DuckDB interval arithmetic |
| listagg | Maps to DuckDB STRING_AGG |
| safe_cast | Maps to DuckDB TRY_CAST |
| data_types | Maps Snowflake types to DuckDB types |
These macros ensure that dbt's built-in cross-database macros produce correct DuckDB SQL when running against HatiData.
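For example, a model written with dbt-core's standard cross-database macros compiles to DuckDB SQL through these overrides. A sketch using hypothetical column names:

```sql
-- dbt's built-in cross-database macros; with the overrides above they
-- compile to DATE_DIFF, interval arithmetic, and TRY_CAST for DuckDB
SELECT
    customer_id,
    {{ dbt.datediff("first_order", "last_order", "day") }} AS tenure_days,
    {{ dbt.dateadd("day", 30, "last_order") }} AS churn_check_date,
    {{ dbt.safe_cast("legacy_score", dbt.type_int()) }} AS score
FROM {{ ref('raw_customers') }}
```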
Migrating from dbt-snowflake
If you are migrating from dbt-snowflake, most models work without changes when auto_transpile is enabled. Follow these steps:
- Change `type` in `profiles.yml` from `snowflake` to `hatidata`
- Update connection details -- host, port, credentials
- Review unsupported constructs -- `COPY INTO`, staged file references, `STREAMS`, `TASKS` (see SQL Constructs for alternatives)
- Test with `dbt run` -- the transpiler handles function and type mappings automatically
For a comprehensive compatibility check before migrating production models, use Shadow Mode (available in Enterprise deployments). Shadow Mode runs queries against both your existing warehouse and HatiData in parallel, comparing results without affecting production.
Common Migration Patterns
Snowflake-specific functions in models
Models using Snowflake functions like NVL, IFF, DATEDIFF, LISTAGG, and ARRAY_AGG work automatically with auto_transpile: true. No model changes needed.
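For example, a model like the following (hypothetical table and column names) can be kept verbatim from a Snowflake project; the transpiler rewrites LISTAGG and NVL on the way to the engine:

```sql
-- Hypothetical model kept unchanged from dbt-snowflake:
-- LISTAGG -> STRING_AGG and NVL -> COALESCE are handled by auto_transpile
SELECT
    customer_id,
    LISTAGG(product_name, ', ') AS products_purchased,
    NVL(MAX(coupon_code), 'none') AS last_coupon
FROM {{ ref('raw_order_items') }}
GROUP BY customer_id
```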
Semi-structured data access
Models using Snowflake's colon notation for VARIANT columns are transpiled automatically:
{{ config(materialized='table') }}

-- Colon notation is automatically rewritten to json_extract_string()
SELECT
    event_id,
    payload:user:name AS user_name,
    payload:action AS action_type
FROM {{ ref('raw_events') }}
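For illustration, the query above is rewritten to roughly the following DuckDB SQL; the exact transpiler output may differ, and the resolved table name is only an example:

```sql
-- Illustrative DuckDB-dialect equivalent of the colon-notation model above
SELECT
    event_id,
    json_extract_string(payload, '$.user.name') AS user_name,
    json_extract_string(payload, '$.action') AS action_type
FROM analytics.raw_events                      -- resolved ref()
```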
Custom materializations
If you have custom materializations that use Snowflake-specific DDL (e.g., CREATE TABLE ... CLONE, ALTER WAREHOUSE), these will need manual conversion.
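For example, a clone step can usually be rewritten as a plain CREATE TABLE AS SELECT, keeping in mind that this copies data rather than creating a zero-copy clone (table names here are placeholders):

```sql
-- Snowflake-specific DDL in a custom materialization...
-- CREATE TABLE analytics.orders_backup CLONE analytics.orders;

-- ...rewritten as a plain CTAS for the DuckDB engine
-- (no zero-copy clone semantics; this copies the data)
CREATE TABLE analytics.orders_backup AS
SELECT * FROM analytics.orders;
```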
Incremental Models
HatiData supports dbt incremental models with both append and merge strategies:
{{ config(
    materialized='incremental',
    unique_key='order_id'
) }}

SELECT
    order_id,
    customer_id,
    order_date,
    total_amount
FROM {{ ref('raw_orders') }}
{% if is_incremental() %}
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}
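The strategy can also be chosen explicitly, assuming the standard dbt incremental_strategy config key; a sketch, with the model body unchanged from the example above:

```sql
-- Hypothetical explicit strategy selection: 'merge' upserts rows matching
-- unique_key, 'append' only inserts new rows
{{ config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='order_id'
) }}
```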
Seeds and Sources
dbt seeds and sources work normally with HatiData:
# Load seed data
dbt seed
# Run models that reference sources
dbt run --select source:my_source+
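For example, a model can select from a source in the usual way. In this sketch the source, table, and column names are hypothetical, and the source must be declared in a sources YAML file as in any dbt project:

```sql
-- Hypothetical staging model reading from a declared dbt source
SELECT
    id AS customer_id,
    created_at
FROM {{ source('my_source', 'customers') }}
```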
SSL Configuration
For cloud and enterprise deployments, SSL is enabled automatically when the host is not localhost or 127.0.0.1. No additional configuration is needed.
For local development with TLS enabled, set:
ssl_mode: require
Source Code
Next Steps
- SQL Compatibility -- Full list of supported functions and types
- Function Reference -- Detailed function mapping with examples
- Local Mode -- Run HatiData locally for development
- Cloud Mode -- Connect dbt to HatiData Cloud
- Enterprise -- In-VPC deployment for production