
Loading Data

HatiData accepts data through multiple paths — from raw SQL statements to bulk file imports, SDK methods, and dbt models. Every method goes through the full query pipeline, so access control, audit logging, and metering apply automatically.

SQL Editor (Dashboard)

The fastest way to create tables and insert data is through the SQL Editor in the HatiData Dashboard.

-- Create a table
CREATE TABLE analytics.events (
  id INTEGER,
  event_type VARCHAR,
  payload JSON,
  created_at TIMESTAMP
);

-- Insert rows
INSERT INTO analytics.events VALUES
(1, 'page_view', '{"url": "/home"}', '2025-01-15 10:30:00'),
(2, 'click', '{"button": "signup"}', '2025-01-15 10:31:00'),
(3, 'page_view', '{"url": "/pricing"}', '2025-01-15 10:32:00');

Snowflake-compatible types are transpiled automatically — VARIANT becomes JSON, TIMESTAMP_NTZ becomes TIMESTAMP, NUMBER(38,0) becomes BIGINT, etc. See SQL Functions for the full compatibility reference.
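To make the mapping concrete, here is an illustrative Python sketch of the transpilation rules named above. This is a lookup over a small subset only; the actual transpiler also handles parameterized types, and `transpile_type` is a hypothetical helper name, not part of any HatiData API.

```python
# Illustrative subset of the Snowflake -> DuckDB type mapping described above.
SNOWFLAKE_TO_DUCKDB = {
    "VARIANT": "JSON",
    "TIMESTAMP_NTZ": "TIMESTAMP",
    "NUMBER(38,0)": "BIGINT",
}

def transpile_type(snowflake_type: str) -> str:
    # Types with no Snowflake-specific meaning pass through unchanged.
    return SNOWFLAKE_TO_DUCKDB.get(snowflake_type.upper(), snowflake_type)
```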

File Imports (Parquet, CSV, JSON)

DuckDB's native file-reading functions let you load data directly from local or remote files:

Parquet

-- Load from a local Parquet file
CREATE TABLE orders AS SELECT * FROM read_parquet('/data/orders.parquet');

-- Load from S3 (when configured)
CREATE TABLE orders AS SELECT * FROM read_parquet('s3://my-bucket/orders/*.parquet');

-- Append to an existing table
INSERT INTO orders SELECT * FROM read_parquet('/data/new_orders.parquet');

CSV

-- Auto-detect schema from CSV
CREATE TABLE customers AS SELECT * FROM read_csv('/data/customers.csv');

-- Specify options
CREATE TABLE customers AS
SELECT * FROM read_csv('/data/customers.csv',
  header = true,
  delim = ',',
  dateformat = '%Y-%m-%d'
);

-- Append to an existing table
INSERT INTO customers SELECT * FROM read_csv('/data/new_customers.csv');
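If you are generating CSV input yourself, a minimal Python sketch that produces a file matching the options above (header row, comma delimiter, `%Y-%m-%d` dates). The path and column names are illustrative, not a fixed schema:

```python
import csv

def write_customers_csv(path, rows):
    """Write a header row plus comma-delimited rows with ISO (%Y-%m-%d)
    dates, matching the read_csv options shown above."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "name", "signup_date"])
        writer.writerows(rows)

# Example: write_customers_csv("/data/new_customers.csv",
#                              [(1, "Alice", "2025-01-15")])
```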

JSON

-- Load newline-delimited JSON
CREATE TABLE logs AS SELECT * FROM read_json('/data/logs.ndjson');

-- Load a JSON array
CREATE TABLE logs AS SELECT * FROM read_json('/data/logs.json', format = 'array');
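For reference, the newline-delimited format that `read_json` expects by default is one JSON object per line. A minimal Python sketch that produces it (the helper name and path are illustrative):

```python
import json

def write_ndjson(path, records):
    """Write one JSON object per line -- the newline-delimited format
    read_json consumes by default."""
    with open(path, "w") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")

# Example: write_ndjson("/data/logs.ndjson",
#                       [{"level": "info", "msg": "started"}])
```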

CREATE TABLE AS SELECT (CTAS)

Transform and load data in a single statement:

-- Derive a summary table from existing data
CREATE TABLE daily_revenue AS
SELECT
  DATE_TRUNC('day', created_at) AS day,
  SUM(amount) AS total_revenue,
  COUNT(*) AS order_count
FROM orders
GROUP BY 1;

-- Combine multiple sources
CREATE TABLE enriched_events AS
SELECT e.*, u.name AS user_name, u.plan
FROM events e
JOIN users u ON e.user_id = u.id;

Python SDK

The Python SDK provides programmatic data loading:

from hatidata_agent import HatiDataAgent

agent = HatiDataAgent(
    host="localhost",
    port=5439,
    agent_id="data-loader",
    framework="custom",
)

# Create a table
agent.query("""
    CREATE TABLE IF NOT EXISTS products (
        id INTEGER,
        name VARCHAR,
        price DECIMAL(10,2),
        category VARCHAR
    )
""")

# Insert rows
agent.query("""
    INSERT INTO products VALUES
        (1, 'Widget A', 9.99, 'hardware'),
        (2, 'Widget B', 19.99, 'hardware'),
        (3, 'Service C', 49.99, 'software')
""")

# Load from a DataFrame (via a temporary Parquet file)
import pandas as pd

df = pd.DataFrame({
    "id": [4, 5, 6],
    "name": ["Widget D", "Widget E", "Service F"],
    "price": [29.99, 39.99, 99.99],
    "category": ["hardware", "hardware", "software"],
})

# Write DataFrame to a temp file, then load
path = "/tmp/products_batch.parquet"
df.to_parquet(path, index=False)
agent.query(f"INSERT INTO products SELECT * FROM read_parquet('{path}')")
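The stage-and-load step above can be wrapped in a small helper that also cleans up the temporary file. A sketch under the same assumptions as the example (an `agent.query` method and a DataFrame with `to_parquet`); `load_dataframe` is our own name, not an SDK method:

```python
import os
import tempfile

def load_dataframe(agent, df, table):
    """Stage a DataFrame as a temporary Parquet file, ingest it with
    read_parquet, then remove the file."""
    fd, path = tempfile.mkstemp(suffix=".parquet")
    os.close(fd)
    try:
        df.to_parquet(path, index=False)
        agent.query(f"INSERT INTO {table} SELECT * FROM read_parquet('{path}')")
    finally:
        os.unlink(path)

# Example: load_dataframe(agent, df, "products")
```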

TypeScript SDK

The TypeScript SDK uses standard Postgres drivers:

import { HatiDataClient } from "@hatidata/sdk";

const client = new HatiDataClient({
  host: "localhost",
  port: 5439,
  agentId: "data-loader",
});

await client.query(`
  CREATE TABLE IF NOT EXISTS metrics (
    ts TIMESTAMP,
    name VARCHAR,
    value DOUBLE
  )
`);

await client.query(`
  INSERT INTO metrics VALUES
    ('2025-01-15 10:00:00', 'cpu_usage', 42.5),
    ('2025-01-15 10:01:00', 'cpu_usage', 38.2)
`);

CLI

The hati CLI supports local data operations and cloud sync:

# Run SQL directly
hati query "CREATE TABLE users (id INT, name VARCHAR, email VARCHAR)"
hati query "INSERT INTO users VALUES (1, 'Alice', 'alice@example.com')"

# Push local data to cloud
hati push --target cloud

# Pull cloud data to local
hati pull --source cloud

See the Quickstart for a full walkthrough.

dbt Adapter

Use dbt-hatidata to load and transform data with dbt models:

~/.dbt/profiles.yml
hatidata:
  target: dev
  outputs:
    dev:
      type: hatidata
      host: localhost
      port: 5439
      user: admin
      password: "{{ env_var('HATIDATA_API_KEY') }}"
      dbname: hatidata
      schema: analytics
      threads: 4

models/daily_revenue.sql
{{ config(materialized='table') }}

SELECT
  DATE_TRUNC('day', created_at) AS day,
  SUM(amount) AS total_revenue,
  COUNT(*) AS order_count
FROM {{ ref('orders') }}
GROUP BY 1

Run the model:

dbt run --models daily_revenue

psql / Any Postgres Client

Since HatiData speaks the Postgres wire protocol, any Postgres-compatible client works:

# psql
psql -h localhost -p 5439 -U admin -d hatidata

# Then run any SQL
CREATE TABLE test (id INT, value TEXT);
INSERT INTO test VALUES (1, 'hello'), (2, 'world');
\copy test FROM '/data/test.csv' WITH CSV HEADER

This includes GUI tools like DBeaver, DataGrip, pgAdmin, Metabase, and Grafana. See Postgres Drivers for connection details.
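Because the endpoint speaks plain Postgres, any driver's standard connection parameters apply. A small sketch, assuming the localhost defaults used throughout this page, that builds a libpq-style URI accepted by psql, psycopg, asyncpg, and similar clients (`hatidata_uri` is our own helper name):

```python
from urllib.parse import quote

def hatidata_uri(host="localhost", port=5439, user="admin",
                 password=None, dbname="hatidata"):
    """Build a postgresql:// connection URI, percent-encoding credentials."""
    auth = quote(user, safe="")
    if password:
        auth += ":" + quote(password, safe="")
    return f"postgresql://{auth}@{host}:{port}/{dbname}"

# Example: psql "$(python -c 'print(hatidata_uri())')"
```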
