SQL Constructs

HatiData transforms Snowflake-specific SQL constructs to their DuckDB equivalents during Stage 5 of the transpilation pipeline. These are structural transformations that go beyond simple function or type renaming -- they involve rewriting parts of the AST or rejecting constructs that have no DuckDB equivalent.

FLATTEN (to UNNEST)

Snowflake's FLATTEN explodes an array or object into multiple rows. HatiData rewrites this to DuckDB's UNNEST.

Basic Usage

-- Snowflake
SELECT value
FROM TABLE(FLATTEN(input => ARRAY_CONSTRUCT(1, 2, 3)));

-- HatiData output (DuckDB)
SELECT value
FROM UNNEST(ARRAY_CONSTRUCT(1, 2, 3));

With a Table Join

-- Snowflake
SELECT
e.event_id,
f.value AS item
FROM events e,
TABLE(FLATTEN(input => e.items)) f;

-- HatiData output (DuckDB)
SELECT
e.event_id,
f.value AS item
FROM events e,
UNNEST(e.items) AS f;

With JSON Data

When combined with colon notation (which is rewritten in Stage 1), the full transformation looks like this:

-- Snowflake
SELECT
e.event_id,
f.value:name::VARCHAR AS item_name
FROM events e,
TABLE(FLATTEN(input => e.payload:items)) f;

-- HatiData output (DuckDB)
SELECT
e.event_id,
CAST(json_extract_string(f.value, 'name') AS VARCHAR) AS item_name
FROM events e,
UNNEST(json_extract(e.payload, 'items')) AS f;

FLATTEN Parameters

Snowflake's FLATTEN supports these named parameters:

| Parameter | Snowflake | HatiData Behavior |
|-----------|-----------|-------------------|
| input | Required array/object | Extracted and passed to UNNEST |
| path | Dot-separated path into nested structure | Not directly supported; pre-extract with GET_PATH |
| outer | Include rows with empty arrays (boolean) | Not supported; use LEFT JOIN UNNEST instead |
| recursive | Recursively flatten nested structures | Not supported; flatten each level explicitly |
| mode | OBJECT, ARRAY, or BOTH | Not supported; DuckDB unnests arrays by default |

For complex FLATTEN usage with path, outer, recursive, or mode parameters, you may need to restructure the query manually.
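
For example, the outer => TRUE case (keeping rows whose array is empty or NULL) can usually be approximated with a LEFT JOIN, as suggested in the table above. The following is a sketch of one possible manual rewrite, assuming an events table with an items list column; it is not output HatiData produces automatically:

-- Snowflake: outer => TRUE keeps events with empty or NULL item arrays
SELECT
  e.event_id,
  f.value AS item
FROM events e,
  TABLE(FLATTEN(input => e.items, outer => TRUE)) f;

-- Possible manual rewrite (DuckDB): a left lateral join preserves those rows
SELECT
  e.event_id,
  f.item
FROM events e
LEFT JOIN LATERAL (SELECT UNNEST(e.items) AS item) AS f ON TRUE;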

QUALIFY (to Subquery)

Snowflake's QUALIFY clause filters the result of window functions without needing a subquery. DuckDB does support QUALIFY natively, so this construct passes through in most cases.

-- Works in both Snowflake and DuckDB
SELECT
customer_id,
order_date,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS rn
FROM orders
QUALIFY rn = 1;

If you encounter compatibility issues with a specific QUALIFY usage, the manual equivalent is a subquery:

-- Manual subquery equivalent
SELECT * FROM (
SELECT
customer_id,
order_date,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS rn
FROM orders
) sub
WHERE rn = 1;

LATERAL Joins

Snowflake's LATERAL keyword allows subqueries in the FROM clause to reference columns from preceding table expressions. DuckDB supports LATERAL joins natively, so these pass through without modification.

-- Works in both Snowflake and DuckDB
SELECT
c.customer_id,
c.name,
recent.order_date,
recent.total
FROM customers c,
LATERAL (
SELECT order_date, total
FROM orders o
WHERE o.customer_id = c.customer_id
ORDER BY order_date DESC
LIMIT 3
) recent;

OBJECT_CONSTRUCT (to json_object)

Snowflake's OBJECT_CONSTRUCT creates a JSON object from alternating key-value pairs. HatiData renames this to DuckDB's json_object.

-- Snowflake
SELECT OBJECT_CONSTRUCT(
'name', name,
'email', email,
'created', created_at
) AS user_json
FROM users;

-- HatiData output (DuckDB)
SELECT json_object(
'name', name,
'email', email,
'created', created_at
) AS user_json
FROM users;

Nested Objects

-- Snowflake
SELECT OBJECT_CONSTRUCT(
'user', OBJECT_CONSTRUCT('name', name, 'id', id),
'metadata', OBJECT_CONSTRUCT('source', 'api', 'version', 2)
) AS payload
FROM users;

-- HatiData output (DuckDB)
SELECT json_object(
'user', json_object('name', name, 'id', id),
'metadata', json_object('source', 'api', 'version', 2)
) AS payload
FROM users;

Colon Notation (to json_extract_string)

Snowflake's colon notation provides concise access to semi-structured data fields. This is rewritten in Stage 1 (pre-parse) before the SQL parser runs, since sqlparser does not understand this syntax.

Single-Level Access

-- Snowflake
SELECT payload:action FROM events;

-- HatiData output (DuckDB)
SELECT json_extract_string(payload, 'action') FROM events;

Multi-Level Access

-- Snowflake
SELECT payload:user:address:city FROM events;

-- HatiData output (DuckDB)
SELECT json_extract_string(json_extract_string(json_extract_string(payload, 'user'), 'address'), 'city') FROM events;

With Type Casting

Snowflake's :: cast operator is a standard PostgreSQL feature and passes through:

-- Snowflake
SELECT payload:amount::FLOAT AS amount FROM events;

-- HatiData output (DuckDB)
SELECT CAST(json_extract_string(payload, 'amount') AS FLOAT) AS amount FROM events;
caution

The pre-parser uses regex matching to distinguish single colons (JSON access) from double colons (type casting). In rare cases involving complex string literals containing colons, the pre-parser may produce unexpected results. If you encounter issues, use json_extract_string() explicitly instead of colon notation.
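
For instance, a query whose string literals contain colons can bypass the pre-parser entirely by calling json_extract_string() directly (the note column and its '09:30' literal below are hypothetical):

-- Explicit extraction; no colon notation for the pre-parser to rewrite
SELECT json_extract_string(payload, 'action') AS action
FROM events
WHERE note = 'started at 09:30';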

GET_PATH (to json_extract)

Snowflake's GET_PATH (and its alias GET) extracts values from semi-structured data using a path expression.

-- Snowflake
SELECT GET_PATH(payload, 'user.preferences.theme') AS theme FROM events;

-- HatiData output (DuckDB)
SELECT json_extract(payload, 'user.preferences.theme') AS theme FROM events;
note

DuckDB's json_extract uses a different path syntax than Snowflake's GET_PATH. Snowflake uses dot notation ('a.b.c'), while DuckDB's json_extract expects JSONPath ('$.a.b.c') or JSON Pointer ('/a/b/c') syntax. Simple dot-separated paths work in both, but complex paths with array indexing may need manual adjustment.
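
For example, a path that indexes into an array can be adjusted manually to DuckDB's JSONPath form (the items[0].sku path below is illustrative, not taken from a real schema):

-- Snowflake: path with array indexing
SELECT GET_PATH(payload, 'items[0].sku') AS first_sku FROM events;

-- Manual DuckDB equivalent using JSONPath syntax
SELECT json_extract(payload, '$.items[0].sku') AS first_sku FROM events;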

CLONE (to CREATE TABLE AS SELECT)

Snowflake's CLONE creates a zero-copy snapshot of a table, schema, or database. DuckDB does not support zero-copy cloning. HatiData detects CLONE statements and logs a warning; manual conversion to CREATE TABLE AS SELECT is recommended.

-- Snowflake
CREATE TABLE orders_backup CLONE orders;

-- Manual equivalent in HatiData (DuckDB)
CREATE TABLE orders_backup AS SELECT * FROM orders;
warning

Unlike Snowflake's CLONE, CREATE TABLE AS SELECT creates a full copy of the data. For large tables, this will consume additional storage and take longer to execute.

SAMPLE / TABLESAMPLE

Both Snowflake and DuckDB support TABLESAMPLE for returning a random subset of rows, though with slightly different syntax options.

-- Snowflake (percentage-based)
SELECT * FROM orders SAMPLE (10);

-- DuckDB equivalent
SELECT * FROM orders TABLESAMPLE 10 PERCENT;

-- Snowflake (row-count-based)
SELECT * FROM orders SAMPLE (100 ROWS);

-- DuckDB equivalent
SELECT * FROM orders USING SAMPLE 100 ROWS;
note

Snowflake's SAMPLE and DuckDB's TABLESAMPLE / USING SAMPLE produce statistically random subsets, but the exact rows returned will differ between engines even with the same seed, due to different sampling algorithms.
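
Both engines also accept an explicit seed for repeatable sampling, though, as noted above, the selected rows will still differ between engines. A sketch of the seeded forms, based on each engine's documented sampling options:

-- Snowflake (seeded, percentage-based)
SELECT * FROM orders SAMPLE (10) SEED (42);

-- DuckDB (seeded, percentage-based)
SELECT * FROM orders USING SAMPLE 10% (bernoulli, 42);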

Unsupported Constructs

The following Snowflake-specific constructs are rejected during transpilation with descriptive error messages. The transpiler returns a TranspilerError::UnsupportedConstruct error rather than producing incorrect SQL.

COPY INTO

Snowflake's bulk data loading command has no direct DuckDB equivalent.

-- Snowflake (NOT SUPPORTED)
COPY INTO my_table FROM @my_stage/data.csv FILE_FORMAT = (TYPE = 'CSV');

-- Error: "COPY INTO is a Snowflake-specific data loading command..."

Alternative: Use DuckDB's native file-reading functions:

-- DuckDB equivalents
COPY my_table FROM 'data.csv' (HEADER, DELIMITER ',');
INSERT INTO my_table SELECT * FROM read_csv('data.csv', header=true);
INSERT INTO my_table SELECT * FROM read_parquet('data.parquet');

Staged File References

Snowflake stage references (@stage_name, $1, $2) are not supported.

-- Snowflake (NOT SUPPORTED)
SELECT $1, $2 FROM @my_stage/file.csv;

-- Error: "Snowflake staged file references (e.g., @stage_name, $1, $2) are not supported..."

Alternative: Use DuckDB's file-reading functions:

-- DuckDB equivalents
SELECT * FROM read_csv('file.csv');
SELECT * FROM read_parquet('file.parquet');
SELECT * FROM read_json('file.json');

STREAMS

Snowflake Streams provide change data capture (CDC) functionality. This is a Snowflake-specific feature with no DuckDB equivalent.

Alternative: Use HatiData's audit log to track data changes, or implement CDC at the application layer.

TASKS

Snowflake Tasks provide scheduled SQL execution. This is a Snowflake-specific feature with no DuckDB equivalent.

Alternative: Use external schedulers such as cron, Apache Airflow, or Prefect to run scheduled queries against HatiData.

PIPES

Snowflake Pipes provide continuous data loading from stages. This is a Snowflake-specific feature with no DuckDB equivalent.

Alternative: Use INSERT INTO ... SELECT from DuckDB file-reading functions, triggered by an external scheduler or event-driven pipeline.
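
A minimal sketch of such a load step, assuming incoming Parquet files land in a local incoming/ directory (the table name and path are hypothetical):

-- Load step a scheduler or event-driven pipeline could run periodically
INSERT INTO raw_events
SELECT * FROM read_parquet('incoming/*.parquet');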
