Data Types
HatiData automatically maps Snowflake data types to their DuckDB equivalents during transpilation. Type rewrites are applied in Stage 4 of the transpilation pipeline and affect CREATE TABLE column definitions, ALTER TABLE ADD COLUMN statements, and CAST expressions.
Type Mapping Table
Rewritten Types
These Snowflake-specific types are automatically converted to DuckDB equivalents:
| Snowflake Type | DuckDB Type | Category | Description |
|---|---|---|---|
| VARIANT | JSON | Semi-structured | Snowflake's universal semi-structured type. Holds any JSON-compatible value (object, array, or scalar). Mapped to DuckDB's native JSON type. |
| OBJECT | JSON | Semi-structured | Snowflake's key-value object type: a JSON object with string keys and VARIANT values. Mapped to JSON since DuckDB handles JSON objects natively. |
| ARRAY (untyped) | JSON | Semi-structured | Snowflake's untyped array of VARIANT elements. Mapped to JSON because DuckDB's typed LIST requires a known element type. Typed arrays (e.g., ARRAY(INTEGER)) are preserved as-is. |
| NUMBER | NUMERIC | Numeric | Snowflake's fixed-point decimal type. NUMBER(p, s) precision and scale parameters are dropped, leaving NUMERIC without explicit parameters (DuckDB then applies its default precision and scale). |
| TIMESTAMP_NTZ | TIMESTAMP | Date/Time | Timestamp without timezone. The _NTZ (no timezone) suffix is Snowflake-specific; DuckDB's TIMESTAMP is equivalent. |
| TIMESTAMP_LTZ | TIMESTAMPTZ | Date/Time | Timestamp with local timezone. The _LTZ (local timezone) suffix is Snowflake-specific; DuckDB's TIMESTAMPTZ stores timezone-aware timestamps. |
Passthrough Types
These types are identical in both Snowflake and DuckDB and pass through without modification:
| Type | Category | Description |
|---|---|---|
| INTEGER | Numeric | 32-bit signed integer. Also accepts the aliases INT and INT4. |
| BIGINT | Numeric | 64-bit signed integer. Also accepts the alias INT8. |
| SMALLINT | Numeric | 16-bit signed integer. Also accepts the alias INT2. |
| TINYINT | Numeric | 8-bit signed integer. Also accepts the alias INT1. |
| FLOAT | Numeric | IEEE 754 floating-point number. Also accepts FLOAT4. Note that Snowflake stores all floating-point values in 64 bits, while DuckDB's FLOAT is 32-bit; use DOUBLE where full precision matters. |
| DOUBLE | Numeric | 64-bit IEEE 754 floating-point number. Also accepts FLOAT8 and DOUBLE PRECISION. |
| DECIMAL | Numeric | Fixed-point decimal with specified precision and scale. DECIMAL(p, s) syntax is identical in both engines. |
| VARCHAR | String | Variable-length character string. VARCHAR(n) length constraints are preserved. |
| CHAR | String | Fixed-length character string. |
| TEXT | String | Alias for VARCHAR in both engines. |
| BOOLEAN | Boolean | True/false value. |
| DATE | Date/Time | Calendar date (year, month, day) without a time component. |
| TIME | Date/Time | Time of day without date or timezone. |
| TIMESTAMP | Date/Time | Timestamp without timezone (equivalent to TIMESTAMP_NTZ). |
| BLOB | Binary | Binary large object. Also accepts BYTEA. |
Type Rewrites in CREATE TABLE
When a CREATE TABLE statement contains Snowflake-specific types, HatiData rewrites the column definitions:
```sql
-- Snowflake
CREATE TABLE events (
    id INTEGER,
    name VARCHAR(255),
    payload VARIANT,
    metadata OBJECT,
    tags ARRAY,
    amount NUMBER,
    created_at TIMESTAMP_NTZ,
    updated_at TIMESTAMP_LTZ,
    active BOOLEAN
);

-- HatiData output (DuckDB)
CREATE TABLE events (
    id INTEGER,
    name VARCHAR(255),
    payload JSON,
    metadata JSON,
    tags JSON,
    amount NUMERIC,
    created_at TIMESTAMP,
    updated_at TIMESTAMPTZ,
    active BOOLEAN
);
```
Type Rewrites in CAST Expressions
CAST expressions within SELECT queries are also rewritten:
```sql
-- Snowflake
SELECT
    CAST(raw_data AS VARIANT) AS data,
    CAST(config AS OBJECT) AS config_json,
    CAST(created AS TIMESTAMP_NTZ) AS created_ts
FROM staging;

-- HatiData output (DuckDB)
SELECT
    CAST(raw_data AS JSON) AS data,
    CAST(config AS JSON) AS config_json,
    CAST(created AS TIMESTAMP) AS created_ts
FROM staging;
```
Type Rewrites in ALTER TABLE
Adding columns with Snowflake types is rewritten as well:
```sql
-- Snowflake
ALTER TABLE events ADD COLUMN properties VARIANT;

-- HatiData output (DuckDB)
ALTER TABLE events ADD COLUMN properties JSON;
```
Precision Handling
NUMBER / NUMERIC
Snowflake's NUMBER(p, s) supports precision up to 38 digits and scale from 0 up to the specified precision. HatiData maps NUMBER to DuckDB's NUMERIC, and explicit NUMBER(p, s) precision/scale parameters are dropped during transpilation, so DuckDB applies its own defaults to the unparameterized NUMERIC (DECIMAL(18, 3) in current DuckDB versions). If your application depends on exact precision enforcement, validate the behavior with your specific queries.
```sql
-- Snowflake
CREATE TABLE financials (amount NUMBER(18, 2));

-- HatiData output
CREATE TABLE financials (amount NUMERIC);
```
If you need exact precision enforcement in DuckDB, declare DECIMAL(p, s) directly in your schema definitions; it passes through without modification.
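A minimal sketch of the DECIMAL passthrough (table and column names are illustrative):

```sql
-- Snowflake
CREATE TABLE financials (amount DECIMAL(18, 2));

-- HatiData output: DECIMAL passes through unchanged
CREATE TABLE financials (amount DECIMAL(18, 2));
```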
Timestamp Precision
Snowflake supports optional fractional-second precision for timestamps (e.g., TIMESTAMP_NTZ(9) for nanoseconds). DuckDB's TIMESTAMP provides microsecond precision by default. A precision parameter on a plain TIMESTAMP is preserved in the output, but the TIMESTAMP_NTZ-to-TIMESTAMP rewrite drops any custom precision.
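A sketch of the behavior described above (table and column names are illustrative):

```sql
-- Snowflake: nanosecond precision requested
CREATE TABLE logs (ts TIMESTAMP_NTZ(9));

-- HatiData output: the _NTZ rewrite drops the precision parameter
CREATE TABLE logs (ts TIMESTAMP);

-- Snowflake: plain TIMESTAMP with explicit precision
CREATE TABLE logs2 (ts TIMESTAMP(6));

-- HatiData output: precision on plain TIMESTAMP is preserved
CREATE TABLE logs2 (ts TIMESTAMP(6));
```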
VARCHAR Length
VARCHAR(n) length constraints are preserved during transpilation. DuckDB supports VARCHAR with optional length limits, matching Snowflake's behavior.
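A minimal sketch of the passthrough (column name and length are illustrative):

```sql
-- Snowflake
CREATE TABLE users (email VARCHAR(320));

-- HatiData output: length constraint preserved
CREATE TABLE users (email VARCHAR(320));
```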
Semi-Structured Data
The three semi-structured types (VARIANT, OBJECT, ARRAY) all map to DuckDB's JSON type. This means:
- Querying: Use DuckDB's JSON functions (`json_extract`, `json_extract_string`, `json_type`, `json_array_length`) or the arrow operators (`->`, `->>`) to access nested data.
- Colon notation: Snowflake's `col:field` syntax is automatically rewritten to `json_extract_string(col, 'field')` in Stage 1 of the pipeline (see the example after the code below).
- Insertion: Insert JSON strings directly. DuckDB validates JSON syntax on insertion into `JSON` columns.
```sql
-- Insert semi-structured data
INSERT INTO events (payload) VALUES ('{"user": "alice", "action": "login"}');

-- Query with DuckDB JSON functions
SELECT json_extract_string(payload, 'user') AS user_name FROM events;

-- Or use the arrow operator
SELECT payload->>'user' AS user_name FROM events;
```
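The colon-notation rewrite mentioned in the list above works the same way (a sketch of the Stage 1 rule, reusing the example table):

```sql
-- Snowflake colon notation
SELECT payload:user AS user_name FROM events;

-- HatiData output (DuckDB)
SELECT json_extract_string(payload, 'user') AS user_name FROM events;
```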
Edge Cases
Typed Arrays
Snowflake's typed arrays (e.g., ARRAY(INTEGER)) are distinct from untyped ARRAY. The transpiler only converts untyped ARRAY to JSON. Typed arrays with an inner type definition are left as-is, relying on DuckDB's native LIST type support.
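A sketch of the rule (table and column names are illustrative):

```sql
-- Snowflake
CREATE TABLE t (
    a ARRAY,            -- untyped: rewritten to JSON
    b ARRAY(INTEGER)    -- typed: left as-is
);

-- HatiData output (DuckDB)
CREATE TABLE t (
    a JSON,
    b ARRAY(INTEGER)
);
```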
NULL Handling
Both Snowflake and DuckDB follow SQL-standard NULL semantics. Type mapping does not alter NULL behavior -- NULL values in any column type remain NULL after transpilation.
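For example (a minimal sketch):

```sql
-- Snowflake
SELECT CAST(NULL AS VARIANT) AS v;

-- HatiData output: the rewritten cast still yields NULL
SELECT CAST(NULL AS JSON) AS v;
```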
Custom Types
Any Snowflake type not listed in the mapping table (e.g., GEOGRAPHY, GEOMETRY) is left as-is. If DuckDB does not support the type, execution will fail with a DuckDB error rather than a transpilation error.
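For example, a statement using an unmapped type is emitted unchanged (a sketch; GEOGRAPHY support depends on your DuckDB build and extensions):

```sql
-- Snowflake
CREATE TABLE places (location GEOGRAPHY);

-- HatiData output: passed through unchanged; DuckDB raises an
-- execution-time error if the type is unavailable
CREATE TABLE places (location GEOGRAPHY);
```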
Next Steps
- SQL Compatibility -- Transpilation pipeline overview
- Function Reference -- Function mapping with examples
- SQL Constructs -- Snowflake construct transformations