Function Reference
HatiData automatically transpiles Snowflake functions to their DuckDB equivalents. This page provides the complete function mapping organized by category, with Snowflake syntax, DuckDB output, and usage examples.
Function rewrites are applied during Stage 3 of the transpilation pipeline. The transpiler walks the SQL abstract syntax tree and replaces function calls in SELECT projections, WHERE clauses, HAVING clauses, GROUP BY expressions, and ORDER BY clauses.
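For example, a query that uses Snowflake functions in its SELECT list, WHERE clause, and ORDER BY clause (the table and column names below are illustrative) has every call rewritten, not only those in the projection:
-- Snowflake
SELECT IFF(total > 0, 'paid', 'free') AS plan
FROM accounts
WHERE NVL(status, 'inactive') = 'active'
ORDER BY DATEDIFF(day, created_at, CURRENT_DATE);
-- HatiData output
SELECT IF(total > 0, 'paid', 'free') AS plan
FROM accounts
WHERE COALESCE(status, 'inactive') = 'active'
ORDER BY DATE_DIFF(day, created_at, CURRENT_DATE);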
Conditional Functions
IFF
Evaluates a condition and returns one of two values.
| Snowflake | IFF(condition, true_value, false_value) |
| DuckDB | IF(condition, true_value, false_value) |
| Rewrite | Direct function rename |
-- Snowflake
SELECT IFF(revenue > 10000, 'enterprise', 'smb') AS tier FROM customers;
-- HatiData output
SELECT IF(revenue > 10000, 'enterprise', 'smb') AS tier FROM customers;
NVL
Returns the first argument if it is not NULL, otherwise returns the second argument.
| Snowflake | NVL(expr, default_value) |
| DuckDB | COALESCE(expr, default_value) |
| Rewrite | Direct function rename |
-- Snowflake
SELECT NVL(middle_name, '') AS middle_name FROM users;
-- HatiData output
SELECT COALESCE(middle_name, '') AS middle_name FROM users;
NVL2
Returns the second argument if the first is not NULL, otherwise returns the third argument.
| Snowflake | NVL2(expr, not_null_value, null_value) |
| DuckDB | CASE WHEN expr IS NOT NULL THEN not_null_value ELSE null_value END |
| Rewrite | Expanded to CASE expression |
-- Snowflake
SELECT NVL2(phone, 'has phone', 'no phone') AS phone_status FROM contacts;
-- HatiData output
SELECT CASE WHEN phone IS NOT NULL THEN 'has phone' ELSE 'no phone' END AS phone_status FROM contacts;
DECODE
Compares a value against a series of search values and returns the corresponding result. The transpiler does not rewrite DECODE; DuckDB supports simple forms of it natively.
-- Works in both Snowflake and DuckDB
SELECT DECODE(status, 'A', 'Active', 'I', 'Inactive', 'Unknown') FROM users;
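For forms DuckDB does not accept, DECODE can be rewritten by hand as the equivalent simple CASE expression; note that, unlike DECODE, CASE does not treat two NULLs as equal:
-- Manual CASE equivalent of the example above
SELECT CASE status WHEN 'A' THEN 'Active' WHEN 'I' THEN 'Inactive' ELSE 'Unknown' END FROM users;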
COALESCE
Returns the first non-NULL argument. This function is identical in both Snowflake and DuckDB and passes through without modification.
-- Passthrough (same in both)
SELECT COALESCE(preferred_name, first_name, 'Anonymous') FROM users;
String Functions
SPLIT_PART
Splits a string by a delimiter and returns the specified part. This function is identical in both Snowflake and DuckDB.
| Snowflake | SPLIT_PART(string, delimiter, part_number) |
| DuckDB | SPLIT_PART(string, delimiter, part_number) |
| Rewrite | Passthrough (no change) |
-- Passthrough (same in both)
SELECT SPLIT_PART(email, '@', 2) AS domain FROM users;
REGEXP_REPLACE
Replaces substrings matching a regular expression. This function is natively supported in DuckDB with compatible syntax.
-- Works in both Snowflake and DuckDB
SELECT REGEXP_REPLACE(phone, '[^0-9]', '') AS digits_only FROM contacts;
Date/Time Functions
DATEDIFF
Calculates the difference between two dates or timestamps in the specified unit.
| Snowflake | DATEDIFF(date_part, start_date, end_date) |
| DuckDB | DATE_DIFF(date_part, start_date, end_date) |
| Rewrite | Direct function rename (underscore added) |
-- Snowflake
SELECT DATEDIFF(day, created_at, CURRENT_DATE) AS days_since_creation FROM orders;
-- HatiData output
SELECT DATE_DIFF(day, created_at, CURRENT_DATE) AS days_since_creation FROM orders;
DATEADD
Adds a specified number of units to a date or timestamp.
| Snowflake | DATEADD(date_part, amount, date_expr) |
| DuckDB | date_expr + INTERVAL 'amount' date_part |
| Rewrite | Rewritten to interval arithmetic |
-- Snowflake
SELECT DATEADD(day, 30, order_date) AS due_date FROM orders;
-- HatiData output
SELECT order_date + INTERVAL '30' DAY AS due_date FROM orders;
TO_TIMESTAMP_NTZ
Converts a value to a timestamp without timezone.
| Snowflake | TO_TIMESTAMP_NTZ(expr) |
| DuckDB | CAST(expr AS TIMESTAMP) |
| Rewrite | Rewritten to CAST expression |
-- Snowflake
SELECT TO_TIMESTAMP_NTZ('2024-01-15 10:30:00') AS ts;
-- HatiData output
SELECT CAST('2024-01-15 10:30:00' AS TIMESTAMP) AS ts;
DATE_TRUNC
Truncates a date or timestamp to the specified precision. This function is natively supported in DuckDB with compatible syntax.
-- Works in both Snowflake and DuckDB
SELECT DATE_TRUNC('month', created_at) AS month_start FROM orders;
Aggregate Functions
LISTAGG
Concatenates values from multiple rows into a single string with a delimiter.
| Snowflake | LISTAGG(expr, delimiter) |
| DuckDB | STRING_AGG(expr, delimiter) |
| Rewrite | Direct function rename |
-- Snowflake
SELECT department, LISTAGG(name, ', ') WITHIN GROUP (ORDER BY name) AS members
FROM employees
GROUP BY department;
-- HatiData output
SELECT department, STRING_AGG(name, ', ' ORDER BY name) AS members
FROM employees
GROUP BY department;
In sqlparser v0.45.0, LISTAGG is parsed as a dedicated Expr::ListAgg AST node, not as Expr::Function. The transpiler handles both the AST node form and the function-call form.
ARRAY_AGG
Collects values from multiple rows into an array (a DuckDB list).
| Snowflake | ARRAY_AGG(expr) |
| DuckDB | LIST(expr) |
| Rewrite | Direct function rename |
-- Snowflake
SELECT customer_id, ARRAY_AGG(DISTINCT product_id) AS products
FROM orders
GROUP BY customer_id;
-- HatiData output
SELECT customer_id, LIST(DISTINCT product_id) AS products
FROM orders
GROUP BY customer_id;
Like LISTAGG, ARRAY_AGG is parsed as a dedicated Expr::ArrayAgg AST node in sqlparser v0.45.0. The transpiler handles both forms.
JSON Functions
PARSE_JSON
Parses a JSON string into a semi-structured value.
| Snowflake | PARSE_JSON(string_expr) |
| DuckDB | CAST(string_expr AS JSON) |
| Rewrite | Rewritten to CAST expression |
-- Snowflake
SELECT PARSE_JSON('{"name": "Alice", "age": 30}') AS data;
-- HatiData output
SELECT CAST('{"name": "Alice", "age": 30}' AS JSON) AS data;
OBJECT_CONSTRUCT
Constructs a JSON object from key-value pairs.
| Snowflake | OBJECT_CONSTRUCT('key1', val1, 'key2', val2, ...) |
| DuckDB | json_object('key1', val1, 'key2', val2, ...) |
| Rewrite | Direct function rename |
-- Snowflake
SELECT OBJECT_CONSTRUCT('name', name, 'email', email) AS user_json FROM users;
-- HatiData output
SELECT json_object('name', name, 'email', email) AS user_json FROM users;
GET_PATH
Extracts a value from a semi-structured object using a dot-separated path.
| Snowflake | GET_PATH(expr, 'path') or GET(expr, 'key') |
| DuckDB | json_extract(expr, 'path') |
| Rewrite | Direct function rename |
-- Snowflake
SELECT GET_PATH(payload, 'user.address.city') AS city FROM events;
-- HatiData output
SELECT json_extract(payload, 'user.address.city') AS city FROM events;
FLATTEN
Explodes an array or object into rows. See SQL Constructs for the full FLATTEN transformation reference.
| Snowflake | FLATTEN(input => expr) |
| DuckDB | UNNEST(expr) |
| Rewrite | Extracts input parameter, renames to UNNEST |
-- Snowflake
SELECT f.value FROM events, TABLE(FLATTEN(input => items)) f;
-- HatiData output
SELECT f.value FROM events, UNNEST(items) AS f;
TO_VARIANT
Converts a value to Snowflake's VARIANT semi-structured type.
| Snowflake | TO_VARIANT(expr) |
| DuckDB | CAST(expr AS JSON) |
| Rewrite | Rewritten to CAST expression |
-- Snowflake
SELECT TO_VARIANT(42) AS json_val;
-- HatiData output
SELECT CAST(42 AS JSON) AS json_val;
Type Conversion Functions
TRY_CAST
Attempts to cast a value to a target type, returning NULL on failure instead of raising an error. This function is natively supported in DuckDB with compatible syntax.
-- Works in both Snowflake and DuckDB
SELECT TRY_CAST('not_a_number' AS INTEGER) AS val;
-- Returns NULL instead of error
TO_NUMBER
Converts a string to a numeric value. In DuckDB, use CAST or TRY_CAST:
-- Snowflake
SELECT TO_NUMBER('123.45') AS num;
-- DuckDB equivalent
SELECT CAST('123.45' AS NUMERIC) AS num;
TO_VARCHAR
Converts a value to a string. In DuckDB, use CAST:
-- Snowflake
SELECT TO_VARCHAR(42) AS str;
-- DuckDB equivalent
SELECT CAST(42 AS VARCHAR) AS str;
Function Nesting
The transpiler recursively walks all expression nodes, so nested Snowflake functions are rewritten correctly:
-- Snowflake: nested functions
SELECT IFF(NVL(status, 'unknown') = 'active', DATEDIFF(day, created_at, CURRENT_DATE), 0)
FROM users;
-- HatiData output: all three functions rewritten
SELECT IF(COALESCE(status, 'unknown') = 'active', DATE_DIFF(day, created_at, CURRENT_DATE), 0)
FROM users;
Vector Functions
HatiData extends SQL with vector-aware functions for hybrid search over structured data and agent memory. These functions are resolved at query time by the proxy, which embeds the search text and rewrites the function calls to list_cosine_similarity() for DuckDB execution.
semantic_match
Returns TRUE if the cosine similarity between a column's embedding and the embedding of the search text exceeds a threshold (default 0.7). Use in WHERE clauses and JOIN ON conditions.
SELECT * FROM support_tickets
WHERE semantic_match(description, 'billing dispute refund request', 0.8);
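semantic_match can also be combined with an ordinary join key inside an ON clause; the tables and columns in this sketch are illustrative:
-- Equality join plus a semantic filter (illustrative schema)
SELECT o.order_id, t.id
FROM orders o
JOIN support_tickets t
  ON o.customer_id = t.customer_id
 AND semantic_match(t.description, 'shipping damage claim');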
vector_match
Alias for semantic_match. The two are interchangeable.
SELECT * FROM docs WHERE vector_match(content, 'deployment guide');
semantic_rank
Returns the cosine similarity score as a FLOAT (0.0 to 1.0). Use in SELECT and ORDER BY to rank results by relevance.
SELECT content, semantic_rank(content, 'revenue growth') AS relevance
FROM _hatidata_agent_memory
ORDER BY relevance DESC
LIMIT 10;
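As noted above, the proxy embeds the search text and rewrites these calls to list_cosine_similarity() before the query reaches DuckDB. The sketch below is only an illustration of that shape; the embedding column name and the truncated vector literal are placeholders, not the proxy's actual output:
-- Rough sketch of the rewritten query (placeholder column and vector)
SELECT content, list_cosine_similarity(content_embedding, [0.018, -0.042, 0.097]) AS relevance
FROM _hatidata_agent_memory
ORDER BY relevance DESC
LIMIT 10;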
For complete syntax, threshold tuning, JOIN_VECTOR, and real-world examples, see the Hybrid Search (Vector SQL) reference.
Next Steps
- Hybrid Search (Vector SQL) -- Full vector SQL reference
- SQL Compatibility -- Transpilation pipeline overview
- Data Types -- Complete type mapping reference
- SQL Constructs -- Snowflake construct transformations