
SQL Compatibility Guide

HatiData exposes a PostgreSQL v3 wire protocol on port 5439, backed by DuckDB as the execution engine. This guide documents exactly what is supported, what is not, and what behaves differently from a standard PostgreSQL server.


Supported Data Types

HatiData supports the following data types through its PostgreSQL wire protocol. Types are mapped from Postgres conventions to DuckDB's internal representation.

| Postgres Type | DuckDB Equivalent | Notes |
|---|---|---|
| INTEGER / INT / INT4 | INTEGER | 32-bit signed integer |
| BIGINT / INT8 | BIGINT | 64-bit signed integer |
| SMALLINT / INT2 | SMALLINT | 16-bit signed integer |
| BOOLEAN | BOOLEAN | true / false |
| REAL / FLOAT4 | FLOAT | 32-bit IEEE 754 float |
| DOUBLE PRECISION / FLOAT8 | DOUBLE | 64-bit IEEE 754 float |
| NUMERIC / DECIMAL | NUMERIC / DECIMAL | Arbitrary precision. DECIMAL(p,s) preserves precision/scale. |
| TEXT | VARCHAR | Unlimited-length string |
| VARCHAR(n) | VARCHAR(n) | Length constraint preserved |
| CHAR(n) | VARCHAR | Fixed-width char (stored as VARCHAR internally) |
| UUID | UUID | 128-bit universally unique identifier |
| DATE | DATE | Calendar date (YYYY-MM-DD) |
| TIMESTAMP | TIMESTAMP | Microsecond precision, no timezone |
| TIMESTAMPTZ | TIMESTAMPTZ | Microsecond precision, stored as UTC |
| INTERVAL | INTERVAL | Duration type |
| JSON | JSON | JSON text storage |
| JSONB | JSON | Stored identically to JSON (no separate binary format) |
| BYTEA | BLOB | Binary data |
| INTEGER[] | LIST(INTEGER) | Array of integers (Postgres array text format) |
| TEXT[] | LIST(VARCHAR) | Array of strings |
| FLOAT4[] | LIST(FLOAT) | Array of 32-bit floats |
| FLOAT8[] | LIST(DOUBLE) | Array of 64-bit floats |
| BIGINT[] | LIST(BIGINT) | Array of 64-bit integers |
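The mapping above is effectively a lookup from Postgres type names to DuckDB type names. As an illustration only (the real mapping is internal to the proxy; `PG_TO_DUCKDB` and `map_type` are hypothetical names), a minimal sketch:

```python
# Hypothetical sketch of a Postgres -> DuckDB type lookup, covering the
# renamed types from the table above. Types not listed map to themselves.
PG_TO_DUCKDB = {
    "INT": "INTEGER", "INT4": "INTEGER",
    "INT8": "BIGINT", "INT2": "SMALLINT",
    "REAL": "FLOAT", "FLOAT4": "FLOAT",
    "DOUBLE PRECISION": "DOUBLE", "FLOAT8": "DOUBLE",
    "TEXT": "VARCHAR",
    "JSONB": "JSON",   # no separate binary JSON format
    "BYTEA": "BLOB",
}

def map_type(pg_type: str) -> str:
    """Map a Postgres type name to its DuckDB equivalent (identity if unlisted)."""
    name = pg_type.strip().upper()
    return PG_TO_DUCKDB.get(name, name)

print(map_type("int8"))   # BIGINT
print(map_type("jsonb"))  # JSON
print(map_type("uuid"))   # UUID (unchanged)
```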

Unsupported Types

These Postgres types are not supported. Use the recommended alternatives:

| Postgres Type | Alternative | Reason |
|---|---|---|
| SERIAL / BIGSERIAL | INTEGER + application-generated UUIDs | DuckDB has no auto-increment sequences. The proxy returns a clear error if SERIAL is used in CREATE TABLE. |
| MONEY | NUMERIC(12,2) | Not a standard SQL type; use explicit decimal precision. |
| CIDR / INET | TEXT | Network address types are Postgres-specific. Store as text and parse in application code. |
| MACADDR | TEXT | Hardware address type is Postgres-specific. |
| XML | TEXT or JSON | XML type is not supported by DuckDB. |
| HSTORE | JSON | Key-value type is Postgres-specific. JSON provides the same capability. |
| TSVECTOR / TSQUERY | TEXT + semantic_match() | Full-text search types are Postgres-specific. Use HatiData's vector search functions instead. |
| Custom domains / composites | Not supported | DuckDB does not support user-defined composite types. |
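Replacing SERIAL with application-generated UUIDs is straightforward: generate the key client-side and send it as an ordinary parameter. A minimal sketch (the `users` table and its columns are hypothetical):

```python
import uuid

# Without SERIAL, the application generates the primary key itself.
# Assumed schema (hypothetical): CREATE TABLE users (id UUID, name TEXT)
new_id = str(uuid.uuid4())                       # canonical 36-char UUID string
sql = "INSERT INTO users (id, name) VALUES ($1, $2)"
params = (new_id, "alice")

print(len(new_id))  # 36
```

UUIDs avoid the coordination a sequence would require: any number of writers can generate keys concurrently without contacting the server.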

Query Protocol Support

HatiData implements the PostgreSQL v3 wire protocol and supports both of its query modes.

Simple Query Protocol

Status: Fully supported

Used by psql, simple scripts, and most BI tools. Sends SQL as text and receives text-formatted results.

Extended Query Protocol (Prepared Statements)

Status: Supported with limitations

Used by application drivers (node-postgres, psycopg2, SQLAlchemy, tokio-postgres, sqlx, asyncpg). Supports the full Parse/Bind/Describe/Execute message flow.

Binary Parameter Binding

Binary-format parameters in the Bind message are supported for:

| Type | OID | Binary Support |
|---|---|---|
| BOOL | 16 | Supported |
| INT2 | 21 | Supported |
| INT4 | 23 | Supported |
| INT8 | 20 | Supported |
| FLOAT4 | 700 | Supported |
| FLOAT8 | 701 | Supported |
| TEXT | 25 | Supported |
| VARCHAR | 1043 | Supported |
| BYTEA | 17 | Supported |
| JSON | 114 | Supported |
| JSONB | 3802 | Supported |
| DATE | 1082 | Supported |
| TIMESTAMP | 1114 | Supported |
| TIMESTAMPTZ | 1184 | Supported |
| UUID | 2950 | Supported |

Binary Result Columns

The same 15 types listed above support binary encoding in result columns. When a client requests binary format via the Bind message's result format codes, HatiData encodes values in Postgres-compatible binary format.

Types that fall back to text format:

| Type | OID | Reason |
|---|---|---|
| NUMERIC | 1700 | Postgres NUMERIC binary format is BCD-encoded with a complex header structure. Text fallback avoids encoding errors. |
| INTERVAL | 1186 | Binary interval encoding not yet implemented. |
| Array types | 1007, 1009, etc. | Binary array encoding not yet implemented. |

The fallback is spec-compliant: the Describe response advertises format=0 (text) for unsupported types, so clients always know what to expect. All major drivers (tokio-postgres, sqlx, asyncpg, node-postgres) handle this correctly.
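For the fixed-width types, the Postgres binary wire format is simply the value in network byte order (big-endian), which is why those types are easy to support while NUMERIC's variable-length digit-group layout is not. A sketch of the INT8 and FLOAT8 encodings:

```python
import struct

# Postgres binary wire format for fixed-width types: big-endian (network order).
def encode_int8(v: int) -> bytes:
    return struct.pack(">q", v)   # INT8 (OID 20): 8-byte signed integer

def encode_float8(v: float) -> bytes:
    return struct.pack(">d", v)   # FLOAT8 (OID 701): IEEE 754 double

def decode_int8(buf: bytes) -> int:
    return struct.unpack(">q", buf)[0]

wire = encode_int8(42)
print(wire.hex())          # 000000000000002a
print(decode_int8(wire))   # 42
```

This is the encoding a driver like tokio-postgres or asyncpg produces in a Bind message when it binds an INT8 parameter in binary format.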

Tip: If you encounter issues with the extended query protocol, most drivers allow forcing simple query mode. For example, in psycopg2, use cursor.execute() with %s parameters instead of server-side prepared statements.


Supported SQL Features

| Feature | Status |
|---|---|
| SELECT (with subqueries, aliases, expressions) | Fully supported |
| INSERT INTO ... VALUES | Fully supported |
| INSERT INTO ... SELECT | Fully supported |
| UPDATE | Fully supported |
| DELETE | Fully supported |
| CREATE TABLE (with type mapping) | Fully supported |
| CREATE TABLE AS SELECT | Fully supported |
| ALTER TABLE ADD COLUMN (with type mapping) | Fully supported |
| DROP TABLE | Fully supported |
| CREATE VIEW | Fully supported |
| JOIN (INNER, LEFT, RIGHT, FULL, CROSS) | Fully supported |
| LATERAL joins | Fully supported |
| Subqueries (scalar, correlated, IN/EXISTS) | Fully supported |
| Window functions (ROW_NUMBER, RANK, DENSE_RANK, LAG, LEAD, NTILE, SUM OVER, etc.) | Fully supported |
| QUALIFY clause | Fully supported (native DuckDB) |
| Common Table Expressions (WITH) | Fully supported |
| UNION / INTERSECT / EXCEPT | Fully supported |
| GROUP BY, HAVING | Fully supported |
| ORDER BY, LIMIT, OFFSET | Fully supported |
| CASE WHEN ... THEN ... ELSE ... END | Fully supported |
| CAST(expr AS type) / ::type | Fully supported |
| LIKE, ILIKE | Fully supported |
| BETWEEN, IN, EXISTS | Fully supported |
| EXPLAIN | Fully supported |
| TABLESAMPLE / SAMPLE | Supported (syntax may differ slightly) |

Unsupported SQL Features

| Feature | Error Message | Alternative |
|---|---|---|
| SERIAL / BIGSERIAL / CREATE SEQUENCE | "SERIAL/SEQUENCE types are not supported. Use INTEGER with application-generated IDs (UUID recommended)." | Use INTEGER or BIGINT columns with application-generated UUIDs or IDs. |
| COPY IN / COPY OUT | "COPY is not supported via the wire protocol. Use INSERT...SELECT for bulk data loading, or the HatiData Python/TypeScript SDK for file imports." | Use INSERT...SELECT, the SDK, or DuckDB's read_csv() / read_parquet() functions. |
| DECLARE CURSOR / FETCH / CLOSE | "Server-side cursors (DECLARE/FETCH/CLOSE) are not supported. Use LIMIT/OFFSET for pagination." | Use LIMIT / OFFSET for pagination. |
| LISTEN / NOTIFY | "LISTEN/NOTIFY is not supported. Use the HatiData Semantic Triggers feature for event-driven workflows." | Use Semantic Triggers for event-driven agent workflows. |
| BEGIN / COMMIT / ROLLBACK | Accepted (no error), but a NOTICE is sent on BEGIN: "HatiData operates in auto-commit mode. BEGIN/COMMIT/ROLLBACK are accepted for compatibility but provide no transaction isolation." | Each statement auto-commits. Design for idempotent operations. |
| SAVEPOINT | Not supported | Not available in auto-commit mode. |
| Stored procedures (CREATE FUNCTION) | Not supported | Implement logic in application code or use the MCP server's tool pipeline. |
| CREATE TRIGGER | Not supported | Use Semantic Triggers for event-driven workflows. |
| FOREIGN KEY constraints | Accepted in DDL (parsed) | Foreign keys are not enforced at query time. They are syntactically accepted for migration compatibility. |
| MATERIALIZED VIEW | Not supported | Use regular CREATE VIEW with HatiData's query cache for performance. |
| ON CONFLICT DO UPDATE (UPSERT) | Supported (DuckDB native) | DuckDB supports INSERT ... ON CONFLICT DO UPDATE. Works as expected. |

Snowflake SQL Transpilation

HatiData automatically transpiles Snowflake-dialect SQL to DuckDB-compatible SQL. This happens transparently in the query pipeline. Transpilation results are cached by SQL hash, so repeated queries hit sub-millisecond cache lookups.

Function Mappings

| Snowflake Function | HatiData/DuckDB Equivalent | Notes |
|---|---|---|
| IFF(cond, true_val, false_val) | IF(cond, true_val, false_val) | Direct rename |
| NVL(expr, default) | COALESCE(expr, default) | Direct rename |
| NVL2(expr, not_null, null_val) | CASE WHEN expr IS NOT NULL THEN not_null ELSE null_val END | Expanded to CASE expression |
| DATEDIFF(part, start, end) | DATE_DIFF(part, start, end) | Underscore added |
| DATEADD(part, n, date) | date + INTERVAL 'n' part | Rewritten to interval arithmetic |
| TO_TIMESTAMP_NTZ(expr) | CAST(expr AS TIMESTAMP) | Rewritten to CAST |
| PARSE_JSON(expr) | CAST(expr AS JSON) | Rewritten to CAST |
| TO_VARIANT(expr) | CAST(expr AS JSON) | Rewritten to CAST |
| OBJECT_CONSTRUCT(k, v, ...) | json_object(k, v, ...) | Direct rename |
| GET_PATH(expr, 'path') | json_extract(expr, 'path') | Direct rename |
| FLATTEN(input => expr) | UNNEST(expr) | Extracts input parameter |
| LISTAGG(expr, delim) | STRING_AGG(expr, delim) | Handles both Expr::Function and Expr::ListAgg AST forms |
| ARRAY_AGG(expr) | LIST(expr) | Handles both Expr::Function and Expr::ArrayAgg AST forms |
| SPLIT_PART(str, delim, n) | SPLIT_PART(str, delim, n) | Passthrough (native DuckDB) |
| REGEXP_REPLACE(str, pat, rep) | REGEXP_REPLACE(str, pat, rep) | Passthrough (native DuckDB) |
| COALESCE(...) | COALESCE(...) | Passthrough |
| DECODE(expr, s1, r1, ...) | DECODE(...) | Passthrough (native DuckDB) |
| DATE_TRUNC('part', expr) | DATE_TRUNC('part', expr) | Passthrough |
| TRY_CAST(expr AS type) | TRY_CAST(expr AS type) | Passthrough |

Type Mappings

Snowflake-specific types in CREATE TABLE, ALTER TABLE, and CAST expressions are automatically rewritten:

| Snowflake Type | DuckDB Type | Notes |
|---|---|---|
| VARIANT | JSON | Universal semi-structured type |
| OBJECT | JSON | Key-value object type |
| ARRAY (untyped) | JSON | Typed ARRAY(T) is preserved |
| NUMBER / NUMBER(p,s) | NUMERIC | Precision/scale simplified |
| TIMESTAMP_NTZ | TIMESTAMP | Timestamp without timezone |
| TIMESTAMP_LTZ | TIMESTAMPTZ | Timestamp with local timezone |

Snowflake Syntax Transpilation

| Snowflake Syntax | HatiData Equivalent | Notes |
|---|---|---|
| col:field (colon notation) | json_extract_string(col, 'field') | Pre-parse normalization. Multi-level access chains are recursively expanded. |
| col:field::TYPE | CAST(json_extract_string(col, 'field') AS TYPE) | Colon notation with type cast |
| TABLE(FLATTEN(input => expr)) | UNNEST(expr) | FLATTEN parameters path, outer, recursive, mode are not supported. |
| CLONE | Not supported | Use CREATE TABLE ... AS SELECT * FROM ... for a full data copy. |

For detailed examples and edge cases, see SQL Functions & Types.
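To make the colon-notation rewrite concrete, here is a deliberately simplified sketch of the pre-parse normalization step. The real transpiler handles quoting, multi-level chains, and ::TYPE casts; this illustrative version (with hypothetical names `COLON_REF` and `rewrite_colon_notation`) covers single-level references only:

```python
import re

# Simplified sketch of the col:field -> json_extract_string(col, 'field')
# rewrite, applied to raw SQL text before parsing. Illustrative only.
COLON_REF = re.compile(r"\b(\w+):(\w+)\b")

def rewrite_colon_notation(sql: str) -> str:
    return COLON_REF.sub(r"json_extract_string(\1, '\2')", sql)

print(rewrite_colon_notation("SELECT payload:status FROM events"))
# SELECT json_extract_string(payload, 'status') FROM events
```

Doing this rewrite before AST construction is what lets the downstream parser treat the result as ordinary function-call syntax.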


Vector Search Functions (HatiData-Specific)

HatiData extends standard SQL with vector-aware functions for semantic search. These are resolved at query time: the proxy embeds the search text and evaluates cosine similarity against the vector index.

semantic_match()

Returns TRUE if cosine similarity exceeds a threshold. Use in WHERE and JOIN ON clauses.

```sql
-- Default threshold (0.7)
SELECT * FROM documents
WHERE semantic_match(content, 'billing dispute resolution');

-- Custom threshold
SELECT * FROM support_tickets
WHERE semantic_match(description, 'refund request', 0.85)
  AND status = 'open';
```

Signature: semantic_match(column, 'search text' [, threshold FLOAT = 0.7])

vector_match() is an alias for semantic_match().

semantic_rank()

Returns the cosine similarity score as a FLOAT (0.0 to 1.0). Use in SELECT and ORDER BY.

```sql
SELECT content,
       semantic_rank(content, 'revenue growth Q4') AS relevance
FROM _hatidata_agent_memory
WHERE agent_id = 'analyst'
ORDER BY relevance DESC
LIMIT 10;
```
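Conceptually, semantic_rank() embeds both arguments and returns their cosine similarity. A minimal sketch of that computation (the vectors here are made up; real embeddings come from the configured provider):

```python
import math

# What semantic_rank() computes conceptually: cosine similarity between
# the embedding of the column value and the embedding of the search text.
def cosine_similarity(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0 (same direction)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal)
```

semantic_match(col, text, t) is then just `cosine_similarity(...) > t` applied row by row against the vector index.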

JOIN_VECTOR

Syntactic sugar combining a JOIN with semantic_match for readable intent:

```sql
SELECT t.ticket_id, m.content AS related_memory
FROM support_tickets t
JOIN_VECTOR _hatidata_agent_memory m
  ON semantic_match(m.content, t.subject, 0.7)
ORDER BY semantic_rank(m.content, t.subject) DESC;
```

Info: Vector functions require an embedding provider configured on the proxy. Without one, queries using these functions return an error.


Text Search Support

| Feature | Status | Notes |
|---|---|---|
| LIKE / ILIKE | Supported | Standard pattern matching with % and _ wildcards. |
| contains(col, 'text') | Supported | DuckDB built-in substring search. |
| regexp_matches(col, 'pattern') | Supported | DuckDB built-in regex matching. No lookbehind support (Rust regex crate limitation). |
| similarity() | Not available | Use semantic_rank() for vector-based similarity scoring. |
| pg_trgm extension | Not available | PostgreSQL-specific extension. Use ILIKE or semantic_match() instead. |

System Catalog Support

HatiData presents itself as PostgreSQL 15.0 to connected clients and implements a compatibility layer for system catalog queries.

| Catalog | Status | Details |
|---|---|---|
| information_schema.tables | Real data | Passes through to DuckDB. Returns actual table metadata. |
| information_schema.columns | Real data | Passes through to DuckDB. Returns actual column metadata. |
| information_schema.schemata | Real data | Passes through to DuckDB. |
| pg_class | Real data | Passes through to DuckDB's pg_catalog. Supports \dt and BI tool table discovery. |
| pg_attribute | Real data | Passes through to DuckDB's pg_catalog. Supports \d <table> column inspection. |
| pg_tables | Real data | Passes through to DuckDB's pg_catalog. |
| pg_type | Mocked (36 types) | Returns 36 common Postgres type definitions. Covers all types that BI tools (Tableau, Metabase, DBeaver, dbt) query. |
| pg_namespace | Mocked | Returns pg_catalog, public, and information_schema schemas. |
| pg_database | Mocked | Returns a single hatidata database entry. |
| pg_roles / pg_user | Mocked | Returns a single hatidata superuser. |
| pg_settings | Mocked | Returns server_version (15.0) and max_connections (100). |
| pg_proc | Limited | DuckDB's built-in functions, not the Postgres function catalog. |
| pg_description | Empty | Returns empty result set. |
| pg_constraint | Empty | Returns empty result set (foreign keys are not enforced). |
| pg_index | Empty | Returns empty result set. |
| pg_am | Empty | Returns empty result set. |
| pg_stat_* | Not available | Statistics views are not implemented. Returns empty result set. |

Supported Session Commands

| Command | Behavior |
|---|---|
| SET <parameter> = <value> | Acknowledged (returns SET). All SET commands are accepted but most are no-ops. |
| SHOW server_version | Returns 15.0 |
| SHOW server_encoding | Returns UTF8 |
| SHOW client_encoding | Returns UTF8 |
| SHOW timezone | Returns UTC |
| SHOW search_path | Returns main |
| SHOW datestyle | Returns ISO, MDY |
| SHOW standard_conforming_strings | Returns on |
| SHOW transaction_isolation | Returns read committed |
| SHOW max_identifier_length | Returns 63 |
| SHOW ALL | Returns a table of key server settings |
| SELECT version() | Returns PostgreSQL 15.0 (HatiData Proxy) |
| SELECT current_database() | Returns hatidata |
| SELECT current_schema() | Returns public |
| SELECT current_user | Returns hatidata |
| DEALLOCATE / DEALLOCATE ALL | Acknowledged (no-op, returns DEALLOCATE) |

Known Differences from PostgreSQL

  1. Auto-commit mode --- Each SQL statement is its own transaction. BEGIN, COMMIT, and ROLLBACK are accepted for compatibility but provide no transaction isolation. A NOTICE is sent when BEGIN is issued.

  2. TIMESTAMP is always UTC internally --- DuckDB stores timestamps in UTC. TIMESTAMPTZ values are converted to UTC on ingestion. Timezone-aware queries should use explicit AT TIME ZONE conversions.

  3. Foreign keys are parsed but not enforced --- CREATE TABLE with FOREIGN KEY constraints is accepted syntactically (for migration compatibility), but referential integrity is not checked at insert/update time.

  4. NUMERIC uses text encoding --- Binary wire format for NUMERIC is not implemented (Postgres NUMERIC binary is BCD-encoded with a complex header). NUMERIC columns always use text format in result rows. This is transparent to most clients.

  5. JSON and JSONB are equivalent --- Both are stored as JSON text internally. There is no separate binary JSON storage format. JSONB columns are reported with OID 3802 for wire protocol compatibility, but storage and query behavior is identical.

  6. ARRAY columns use Postgres text format --- Array values are formatted as Postgres array text literals (e.g., {1,2,3}) but may exhibit behavioral differences in edge cases (NULL handling, multi-dimensional arrays).

  7. String concatenation --- || works for string concatenation. + does not concatenate strings (it is arithmetic only).

  8. No sequences or auto-increment --- SERIAL, BIGSERIAL, CREATE SEQUENCE, nextval(), and currval() are not supported. Use application-generated UUIDs or IDs.

  9. Microsecond timestamp precision --- Snowflake's TIMESTAMP_NTZ(9) nanosecond precision is not preserved. Timestamps have microsecond precision.

  10. DuckDB-specific functions available --- DuckDB built-in functions (list_aggregate, list_sort, regexp_extract, strftime, read_csv, read_parquet, etc.) are available alongside standard SQL and transpiled Snowflake functions.
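Difference 2 in practice: a client that inserts a TIMESTAMPTZ value in a local timezone should expect the UTC-normalized value back. A sketch of the conversion the client can expect (stdlib only):

```python
from datetime import datetime, timezone, timedelta

# A TIMESTAMPTZ sent as 2024-03-01 09:00 at UTC-5 is stored,
# and returned, as its UTC equivalent: 14:00 UTC.
local = datetime(2024, 3, 1, 9, 0, tzinfo=timezone(timedelta(hours=-5)))
as_utc = local.astimezone(timezone.utc)

print(as_utc.isoformat())  # 2024-03-01T14:00:00+00:00
```

To render results back in a local zone, use AT TIME ZONE in the query or convert client-side as above.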


Connection String Reference

postgres://USERNAME:API_KEY@HOST:5439/DATABASE?sslmode=require

| Parameter | Description | Example |
|---|---|---|
| USERNAME | Any string (not validated; API key is the authentication mechanism) | admin |
| API_KEY | HatiData API key. Exactly 40 characters: prefix (hd_live_ or hd_test_) + 32 alphanumeric chars. Agent keys use the hd_agent_ prefix. | hd_live_abc123def456ghi789jkl012mno345pq |
| HOST | Your HatiData proxy endpoint | data.yourcompany.com |
| PORT | Always 5439 | 5439 |
| DATABASE | Database name. Use hatidata for the default catalog. | hatidata |
| sslmode | require for production, disable for local development | require |
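A client-side sanity check for the documented key shape can catch typos before a connection attempt. This sketch covers only live/test keys (the total length of hd_agent_ keys is not restated here, so they are excluded); `KEY_RE` and `looks_like_api_key` are hypothetical names:

```python
import re

# Documented shape: 8-char prefix (hd_live_ or hd_test_) + 32 alphanumeric
# characters, 40 characters total.
KEY_RE = re.compile(r"^(hd_live_|hd_test_)[A-Za-z0-9]{32}$")

def looks_like_api_key(key: str) -> bool:
    return len(key) == 40 and KEY_RE.fullmatch(key) is not None

print(looks_like_api_key("hd_live_abc123def456ghi789jkl012mno345pq"))  # True
print(looks_like_api_key("hd_live_tooshort"))                          # False
```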

Examples:

```shell
# Production (cloud)
psql "postgres://admin:hd_live_your_api_key@data.yourcompany.com:5439/hatidata?sslmode=require"

# Local development
psql -h localhost -p 5439 -U admin -d hatidata

# Environment variables
export PGHOST=data.yourcompany.com
export PGPORT=5439
export PGDATABASE=hatidata
export PGUSER=admin
export PGPASSWORD=hd_live_your_api_key
export PGSSLMODE=require
psql
```

Known Unsupported Snowflake Features

The following Snowflake-specific features are not supported by HatiData's transpiler. Each has a recommended workaround that achieves the same result using standard SQL or HatiData-native features.

| Feature | Status | Workaround |
|---|---|---|
| MERGE statement | Not supported | Use INSERT ... ON CONFLICT DO UPDATE for upserts, or separate INSERT and UPDATE statements. |
| LATERAL FLATTEN with nested arrays | Partial | Single-level FLATTEN (transpiled to UNNEST) works. Nested/recursive FLATTEN with path, outer, or recursive parameters is not supported. Flatten one level at a time with chained UNNEST calls. |
| CONNECT BY / PRIOR | Not supported | Use recursive CTEs (WITH RECURSIVE), which are fully supported by DuckDB. |
| PIVOT / UNPIVOT | Not supported | Rewrite as CASE expressions with aggregation. For example: SUM(CASE WHEN category = 'A' THEN amount END) AS a_total. |
| Stored procedures (CREATE PROCEDURE) | Not applicable | HatiData is a query engine, not a procedural runtime. Implement logic in your agent code or orchestration layer. |
| Streams and Tasks | Not supported | Use Semantic Triggers for event-driven, content-aware workflows that replace Snowflake's polling-based Streams/Tasks model. |
| CREATE STAGE / PUT / GET | Not supported | Load data via INSERT ... SELECT, the Python/TypeScript SDK, or DuckDB's built-in read_csv() / read_parquet() functions. |
Tip: If you are migrating from Snowflake and encounter an unsupported feature not listed here, the proxy returns a descriptive error message identifying the unsupported syntax. See the Snowflake Migration Guide for a step-by-step migration walkthrough.


Transpilation Performance

HatiData's Snowflake-to-DuckDB transpiler operates entirely in memory on a parsed AST (abstract syntax tree). Transpilation results are cached by SQL hash, so repeated queries skip transpilation entirely and resolve in sub-millisecond cache lookups.

Benchmarks by Query Complexity

| Complexity | Example | Avg Transpilation Time | Cache Lookup |
|---|---|---|---|
| Simple | SELECT col FROM table WHERE x = 1 with IFF, NVL rewrites | < 0.1 ms | < 0.01 ms |
| Medium | Multi-table JOINs with DATEDIFF, FLATTEN, colon notation, type mappings | < 0.5 ms | < 0.01 ms |
| Complex | Deeply nested subqueries, multiple window functions with QUALIFY, chained CTEs, LISTAGG + ARRAY_AGG | < 2 ms | < 0.01 ms |

What Makes a Query "Complex"

Transpilation time scales with AST depth and the number of rewrite rules that fire. The following patterns increase transpilation cost:

  • Deeply nested subqueries — Each nesting level requires a full recursive AST traversal for function and type rewrites.
  • Multiple JOINs combined with QUALIFY — The transpiler must rewrite QUALIFY into a window-function subquery wrapper when targeting older SQL dialects, though DuckDB supports QUALIFY natively (passthrough).
  • Chained colon notation — Expressions like col:a:b:c::INT require recursive pre-parse expansion before AST construction.
  • Mixed Snowflake functions — Queries using many distinct Snowflake functions (e.g., IFF, NVL2, DATEDIFF, FLATTEN, OBJECT_CONSTRUCT in the same query) trigger more rewrite passes.
  • Large IN lists or UNION ALL chains — These increase AST node count but do not require complex rewrites. Transpilation is still fast, but parsing dominates.

In practice, transpilation is never the bottleneck. DuckDB query execution and network round-trips dominate end-to-end latency by 10-100x.

