
Function Reference

HatiData automatically transpiles Snowflake functions to their DuckDB equivalents. This page provides the complete function mapping organized by category, with Snowflake syntax, DuckDB output, and usage examples.

Function rewrites are applied during Stage 3 of the transpilation pipeline. The transpiler walks the SQL abstract syntax tree and replaces function calls in SELECT projections, WHERE clauses, HAVING clauses, GROUP BY expressions, and ORDER BY clauses.
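As an illustration only (HatiData's transpiler is not written in Python, and the AST shape and function names below are invented for clarity), the Stage 3 walk can be sketched as a recursive rewrite over expression nodes:

```python
# Illustrative sketch: a toy recursive function-rename pass, NOT HatiData's
# actual implementation. The (name, args) tuple AST is invented for clarity.

# Direct renames applied wherever a function call appears in the tree.
RENAMES = {
    "IFF": "IF",
    "NVL": "COALESCE",
    "DATEDIFF": "DATE_DIFF",
    "LISTAGG": "STRING_AGG",
    "ARRAY_AGG": "LIST",
}

def rewrite(node):
    """Recursively rewrite a toy AST of (func_name, [args]) tuples.

    Literals and column references (plain strings) pass through unchanged,
    so nested function calls are renamed at every depth.
    """
    if isinstance(node, tuple):
        name, args = node
        return (RENAMES.get(name.upper(), name), [rewrite(a) for a in args])
    return node  # literal or column reference

# Nested calls are rewritten at every depth, e.g. IFF(NVL(x, 'a'), 1, 0)
tree = ("IFF", [("NVL", ["x", "'a'"]), "1", "0"])
rewritten = rewrite(tree)  # ("IF", [("COALESCE", ["x", "'a'"]), "1", "0"])
```

The same recursion is why the nested example in the Function Nesting section below rewrites all three functions in one pass.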

Conditional Functions

IFF

Evaluates a condition and returns one of two values.

Snowflake: IFF(condition, true_value, false_value)
DuckDB: IF(condition, true_value, false_value)
Rewrite: Direct function rename
-- Snowflake
SELECT IFF(revenue > 10000, 'enterprise', 'smb') AS tier FROM customers;

-- HatiData output
SELECT IF(revenue > 10000, 'enterprise', 'smb') AS tier FROM customers;

NVL

Returns the first argument if it is not NULL, otherwise returns the second argument.

Snowflake: NVL(expr, default_value)
DuckDB: COALESCE(expr, default_value)
Rewrite: Direct function rename
-- Snowflake
SELECT NVL(middle_name, '') AS middle_name FROM users;

-- HatiData output
SELECT COALESCE(middle_name, '') AS middle_name FROM users;

NVL2

Returns the second argument if the first is not NULL, otherwise returns the third argument.

Snowflake: NVL2(expr, not_null_value, null_value)
DuckDB: CASE WHEN expr IS NOT NULL THEN not_null_value ELSE null_value END
Rewrite: Expanded to CASE expression
-- Snowflake
SELECT NVL2(phone, 'has phone', 'no phone') AS phone_status FROM contacts;

-- HatiData output
SELECT CASE WHEN phone IS NOT NULL THEN 'has phone' ELSE 'no phone' END AS phone_status FROM contacts;

DECODE

Compares a value against a series of search values and returns the corresponding result. DECODE is not directly rewritten by the transpiler but is supported by DuckDB natively when used in simple forms.

-- Works in both Snowflake and DuckDB
SELECT DECODE(status, 'A', 'Active', 'I', 'Inactive', 'Unknown') FROM users;

COALESCE

Returns the first non-NULL argument. This function is identical in both Snowflake and DuckDB and passes through without modification.

-- Passthrough (same in both)
SELECT COALESCE(preferred_name, first_name, 'Anonymous') FROM users;

String Functions

SPLIT_PART

Splits a string by a delimiter and returns the specified part. This function is identical in both Snowflake and DuckDB.

Snowflake: SPLIT_PART(string, delimiter, part_number)
DuckDB: SPLIT_PART(string, delimiter, part_number)
Rewrite: Passthrough (no change)
-- Passthrough (same in both)
SELECT SPLIT_PART(email, '@', 2) AS domain FROM users;

REGEXP_REPLACE

Replaces substrings matching a regular expression. This function is natively supported in DuckDB with compatible syntax.

-- Works in both Snowflake and DuckDB
SELECT REGEXP_REPLACE(phone, '[^0-9]', '') AS digits_only FROM contacts;

Date/Time Functions

DATEDIFF

Calculates the difference between two dates or timestamps in the specified unit.

Snowflake: DATEDIFF(date_part, start_date, end_date)
DuckDB: DATE_DIFF(date_part, start_date, end_date)
Rewrite: Direct function rename (underscore added)
-- Snowflake
SELECT DATEDIFF(day, created_at, CURRENT_DATE) AS days_since_creation FROM orders;

-- HatiData output
SELECT DATE_DIFF(day, created_at, CURRENT_DATE) AS days_since_creation FROM orders;

DATEADD

Adds a specified number of units to a date or timestamp.

Snowflake: DATEADD(date_part, amount, date_expr)
DuckDB: date_expr + INTERVAL 'amount' date_part
Rewrite: Rewritten to interval arithmetic
-- Snowflake
SELECT DATEADD(day, 30, order_date) AS due_date FROM orders;

-- HatiData output
SELECT order_date + INTERVAL '30' DAY AS due_date FROM orders;

TO_TIMESTAMP_NTZ

Converts a value to a timestamp without timezone.

Snowflake: TO_TIMESTAMP_NTZ(expr)
DuckDB: CAST(expr AS TIMESTAMP)
Rewrite: Rewritten to CAST expression
-- Snowflake
SELECT TO_TIMESTAMP_NTZ('2024-01-15 10:30:00') AS ts;

-- HatiData output
SELECT CAST('2024-01-15 10:30:00' AS TIMESTAMP) AS ts;

DATE_TRUNC

Truncates a date or timestamp to the specified precision. This function is natively supported in DuckDB with compatible syntax.

-- Works in both Snowflake and DuckDB
SELECT DATE_TRUNC('month', created_at) AS month_start FROM orders;

Aggregate Functions

LISTAGG

Concatenates values from multiple rows into a single string with a delimiter.

Snowflake: LISTAGG(expr, delimiter)
DuckDB: STRING_AGG(expr, delimiter)
Rewrite: Direct function rename
-- Snowflake
SELECT department, LISTAGG(name, ', ') WITHIN GROUP (ORDER BY name) AS members
FROM employees
GROUP BY department;

-- HatiData output
SELECT department, STRING_AGG(name, ', ' ORDER BY name) AS members
FROM employees
GROUP BY department;
Info: In sqlparser v0.45.0, LISTAGG is parsed as a dedicated Expr::ListAgg AST node, not as Expr::Function. The transpiler handles both the AST node form and the function-call form.

ARRAY_AGG

Collects values into an array/list.

Snowflake: ARRAY_AGG(expr)
DuckDB: LIST(expr)
Rewrite: Direct function rename
-- Snowflake
SELECT customer_id, ARRAY_AGG(DISTINCT product_id) AS products
FROM orders
GROUP BY customer_id;

-- HatiData output
SELECT customer_id, LIST(DISTINCT product_id) AS products
FROM orders
GROUP BY customer_id;
Info: Like LISTAGG, ARRAY_AGG is parsed as a dedicated Expr::ArrayAgg AST node in sqlparser v0.45.0. The transpiler handles both forms.

JSON Functions

PARSE_JSON

Parses a JSON string into a semi-structured value.

Snowflake: PARSE_JSON(string_expr)
DuckDB: CAST(string_expr AS JSON)
Rewrite: Rewritten to CAST expression
-- Snowflake
SELECT PARSE_JSON('{"name": "Alice", "age": 30}') AS data;

-- HatiData output
SELECT CAST('{"name": "Alice", "age": 30}' AS JSON) AS data;

OBJECT_CONSTRUCT

Constructs a JSON object from key-value pairs.

Snowflake: OBJECT_CONSTRUCT('key1', val1, 'key2', val2, ...)
DuckDB: json_object('key1', val1, 'key2', val2, ...)
Rewrite: Direct function rename
-- Snowflake
SELECT OBJECT_CONSTRUCT('name', name, 'email', email) AS user_json FROM users;

-- HatiData output
SELECT json_object('name', name, 'email', email) AS user_json FROM users;

GET_PATH

Extracts a value from a semi-structured object using a dot-separated path.

Snowflake: GET_PATH(expr, 'path') or GET(expr, 'key')
DuckDB: json_extract(expr, 'path')
Rewrite: Direct function rename
-- Snowflake
SELECT GET_PATH(payload, 'user.address.city') AS city FROM events;

-- HatiData output
SELECT json_extract(payload, 'user.address.city') AS city FROM events;

FLATTEN

Explodes an array or object into rows. See SQL Constructs for the full FLATTEN transformation reference.

Snowflake: FLATTEN(input => expr)
DuckDB: UNNEST(expr)
Rewrite: Extracts input parameter, renames to UNNEST
-- Snowflake
SELECT f.value FROM events, TABLE(FLATTEN(input => items)) f;

-- HatiData output
SELECT f.value FROM events, UNNEST(items) AS f;

TO_VARIANT

Converts a value to Snowflake's VARIANT semi-structured type.

Snowflake: TO_VARIANT(expr)
DuckDB: CAST(expr AS JSON)
Rewrite: Rewritten to CAST expression
-- Snowflake
SELECT TO_VARIANT(42) AS json_val;

-- HatiData output
SELECT CAST(42 AS JSON) AS json_val;

Type Conversion Functions

TRY_CAST

Attempts to cast a value to a target type, returning NULL on failure instead of raising an error. This function is natively supported in DuckDB with compatible syntax.

-- Works in both Snowflake and DuckDB
SELECT TRY_CAST('not_a_number' AS INTEGER) AS val;
-- Returns NULL instead of error

TO_NUMBER

Converts a string to a numeric value. In DuckDB, use CAST or TRY_CAST:

-- Snowflake
SELECT TO_NUMBER('123.45') AS num;

-- DuckDB equivalent
SELECT CAST('123.45' AS NUMERIC) AS num;

TO_VARCHAR

Converts a value to a string. In DuckDB, use CAST:

-- Snowflake
SELECT TO_VARCHAR(42) AS str;

-- DuckDB equivalent
SELECT CAST(42 AS VARCHAR) AS str;

Function Nesting

The transpiler recursively walks all expression nodes, so nested Snowflake functions are rewritten correctly:

-- Snowflake: nested functions
SELECT IFF(NVL(status, 'unknown') = 'active', DATEDIFF(day, created_at, CURRENT_DATE), 0)
FROM users;

-- HatiData output: all three functions rewritten
SELECT IF(COALESCE(status, 'unknown') = 'active', DATE_DIFF(day, created_at, CURRENT_DATE), 0)
FROM users;

Vector Functions

HatiData extends SQL with vector-aware functions for hybrid search over structured data and agent memory. These functions are resolved at query time by the proxy, which embeds the search text and rewrites the function calls to list_cosine_similarity() for DuckDB execution.
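For intuition about what the rewritten query computes, the following sketch shows the standard cosine-similarity formula that DuckDB's list_cosine_similarity applies to two float lists. This is a generic illustration, not HatiData's internals; the vectors are made up, and real embeddings produced by the proxy have many more dimensions.

```python
# Generic cosine-similarity sketch. The vectors below are invented for
# illustration; real embedding vectors have hundreds of dimensions.
import math

def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Vectors pointing in the same direction score 1.0; orthogonal vectors
# score 0.0. A semantic_match passes when the score exceeds the
# threshold (default 0.7).
score = cosine_similarity([0.1, 0.2, 0.3], [0.2, 0.4, 0.6])  # -> 1.0
```

A row survives the WHERE clause when this score, computed between the column's stored embedding and the embedded search text, exceeds the given threshold.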

semantic_match

Returns TRUE if the cosine similarity between a column's embedding and the search text exceeds a threshold (default 0.7). Use in WHERE clauses and JOIN ON conditions.

SELECT * FROM support_tickets
WHERE semantic_match(description, 'billing dispute refund request', 0.8);

vector_match

Alias for semantic_match. The two are interchangeable.

SELECT * FROM docs WHERE vector_match(content, 'deployment guide');

semantic_rank

Returns the cosine similarity score as a FLOAT (0.0 to 1.0). Use in SELECT and ORDER BY to rank results by relevance.

SELECT content, semantic_rank(content, 'revenue growth') AS relevance
FROM _hatidata_agent_memory
ORDER BY relevance DESC
LIMIT 10;

For complete syntax, threshold tuning, JOIN_VECTOR, and real-world examples, see the Hybrid Search (Vector SQL) reference.
