MCP Tools Reference

HatiData's MCP server exposes 24 tools organized into five categories: SQL, Memory, Chain-of-Thought, Triggers, and Branches. Every tool call goes through HatiData's full security pipeline (authentication, ABAC policy evaluation, audit logging, metering).

For MCP server setup and configuration, see MCP Server.

SQL Tools (7)

Tools for querying the data warehouse and exploring schemas.


query

Execute a SQL query against the warehouse and return results as JSON.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| sql | string | Yes | SQL query to execute |

Returns: Array of row objects.

Example:

// Input
{
"sql": "SELECT customer_id, SUM(total) as revenue FROM orders GROUP BY 1 ORDER BY 2 DESC LIMIT 5"
}

// Output
[
{"customer_id": 42, "revenue": 125000.00},
{"customer_id": 17, "revenue": 98500.00},
{"customer_id": 8, "revenue": 87200.00},
{"customer_id": 103, "revenue": 76400.00},
{"customer_id": 55, "revenue": 71100.00}
]

Supports both standard SQL and legacy warehouse SQL syntax (NVL, IFF, DATEDIFF, etc.), which is auto-transpiled. Queries pass through the full 13-step pipeline, including policy evaluation, cost estimation, quota checks, row-level security, transpilation, DuckDB execution, AI healing on failure, column masking, and audit logging.


list_tables

List all tables the authenticated agent has permission to access.

Parameters: None.

Returns: Array of table name strings.

Example:

// Input
{}

// Output
["customers", "orders", "products", "events", "knowledge_base"]

Results are filtered by ABAC policies -- agents only see tables they are authorized to access.


describe_table

Get column names, data types, and nullability for a specific table.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| table_name | string | Yes | Name of the table to describe |

Returns: Array of column descriptor objects.

Example:

// Input
{"table_name": "orders"}

// Output
[
{"column_name": "id", "data_type": "INTEGER", "is_nullable": "NO"},
{"column_name": "customer_id", "data_type": "INTEGER", "is_nullable": "NO"},
{"column_name": "total", "data_type": "DECIMAL(10,2)", "is_nullable": "YES"},
{"column_name": "status", "data_type": "VARCHAR", "is_nullable": "YES"},
{"column_name": "created_at", "data_type": "TIMESTAMP", "is_nullable": "YES"}
]

Queries information_schema.columns under the hood. Columns masked by policy are excluded from the result.


list_schemas

List all schemas in the database.

Parameters: None.

Returns: Array of schema name strings.

Example:

// Input
{}

// Output
["main", "staging", "analytics"]

read_query

Execute a read-only SQL query against the warehouse. Identical to query but enforces that the statement is a SELECT (no INSERT, UPDATE, DELETE, or DDL). Useful for agents that should only have read access.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| sql | string | Yes | SELECT query to execute |

Returns: Array of row objects.

Example:

// Input
{"sql": "SELECT segment, COUNT(*) as customer_count FROM customers GROUP BY segment"}

// Output
[
{"segment": "enterprise", "customer_count": 142},
{"segment": "mid_market", "customer_count": 891},
{"segment": "smb", "customer_count": 3204}
]

Non-SELECT statements are rejected with a POLICY_VIOLATION error. This tool is intended for agents configured with read-only access.
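
The statement check can be pictured with a simplified guard like the one below. This is an illustrative sketch, not HatiData's actual enforcement, which parses the full statement rather than inspecting the first keyword:

```python
import re

def is_read_only(sql: str) -> bool:
    """Rudimentary read-only check: strip comments, then require the
    statement to begin with SELECT or WITH. A production guard must
    parse the whole statement (e.g. to catch WITH ... INSERT)."""
    # Remove line and block comments before inspecting the first keyword
    stripped = re.sub(r"--[^\n]*|/\*.*?\*/", " ", sql, flags=re.DOTALL)
    first = stripped.strip().split(None, 1)
    return bool(first) and first[0].upper() in ("SELECT", "WITH")

print(is_read_only("SELECT segment FROM customers"))  # True
print(is_read_only("DELETE FROM customers"))          # False
```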


get_usage_stats

Get query usage statistics and credit consumption for the current agent.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| period | string | No | Time period: "today", "week", "month" (default: "month") |

Returns: Usage summary object.

Example:

// Input
{"period": "month"}

// Output
{
"agent_id": "analyst-agent",
"period": "2025-12",
"query_count": 1247,
"credits_used": 3891.5,
"credits_remaining": 96108.5,
"avg_duration_ms": 38,
"tables_accessed": ["customers", "orders", "products", "events"]
}

Returns usage metrics scoped to the authenticated agent. Organization-level usage requires audit:read scope.


run_sql_arrow

Execute a read-only SQL query and return results as base64-encoded Apache Arrow IPC. Designed for efficient data transfer to pandas, polars, and other columnar data frameworks. Requires QueryRead scope.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| sql | string | Yes | SELECT query to execute |
| max_rows | integer | No | Maximum rows to return (default: 10000) |

Returns: Object containing base64-encoded Arrow IPC stream data.

Example:

// Input
{
"sql": "SELECT customer_id, SUM(total) as revenue, COUNT(*) as order_count FROM orders GROUP BY 1 ORDER BY 2 DESC LIMIT 1000"
}

// Output
{
"format": "arrow_ipc_base64",
"data": "QVJST1cxAAAAAAoAAAAAAAD...",
"rows": 1000,
"execution_time_ms": 12
}

The data field contains a base64-encoded Apache Arrow IPC stream. Decode and load directly into pandas or polars:

import base64
import pyarrow.ipc as ipc
import io

# Decode the Arrow IPC response
ipc_bytes = base64.b64decode(result["data"])
reader = ipc.open_stream(io.BytesIO(ipc_bytes))
table = reader.read_all()

# Convert to pandas or polars
df_pandas = table.to_pandas()

import polars as pl
df_polars = pl.from_arrow(table)

Arrow IPC preserves full type fidelity (integers, floats, timestamps, booleans, nulls) without the string parsing overhead of JSON results. This makes run_sql_arrow significantly faster than query for large result sets consumed by data analysis code.

For the HTTP endpoint equivalent, see Arrow Queries.


Memory Tools (5)

Tools for storing, searching, and managing long-term agent memory. See Agent Memory for architecture details.


store_memory

Store a new memory entry with optional metadata and importance scoring.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| content | string | Yes | The memory content (natural language or structured data) |
| memory_type | string | No | Category: "fact", "episode", "preference", "context" (default: "fact") |
| metadata | object | No | Arbitrary key-value metadata |
| importance | float | No | Importance score from 0.0 to 1.0 (default: 0.5) |

Returns: Memory creation confirmation with ID.

Example:

// Input
{
"content": "Enterprise customers have a 95% renewal rate in Q4",
"memory_type": "fact",
"metadata": {
"source": "quarterly-review",
"quarter": "Q4-2025"
},
"importance": 0.9
}

// Output
{
"memory_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"status": "stored",
"has_embedding": false
}

The has_embedding field becomes true once the async embedding worker processes the entry (typically within seconds).


search_memory

Search memories using natural language or keyword queries with optional filters.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| query | string | Yes | Natural language search query |
| top_k | integer | No | Maximum results to return (default: 10) |
| memory_type | string | No | Filter by memory type |
| min_importance | float | No | Minimum importance score filter |
| agent_id | string | No | Filter by agent (default: current agent) |

Returns: Array of memory objects ranked by similarity.

Example:

// Input
{
"query": "customer renewal rates",
"top_k": 5,
"memory_type": "fact",
"min_importance": 0.5
}

// Output
[
{
"memory_id": "a1b2c3d4-...",
"content": "Enterprise customers have a 95% renewal rate in Q4",
"memory_type": "fact",
"importance": 0.9,
"similarity_score": 0.94,
"metadata": {"source": "quarterly-review", "quarter": "Q4-2025"},
"created_at": "2025-12-15T10:30:00Z"
},
{
"memory_id": "b2c3d4e5-...",
"content": "SMB segment renewal rate dropped to 72% in Q3",
"memory_type": "fact",
"importance": 0.8,
"similarity_score": 0.87,
"metadata": {"source": "churn-analysis"},
"created_at": "2025-10-01T14:20:00Z"
}
]

When vector search is available, similarity_score is cosine similarity. In fallback mode, it is a heuristic relevance score.
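
The ranking can be pictured as cosine similarity between a query embedding and each stored embedding. The sketch below uses hand-rolled 3-dimensional vectors purely for illustration; real embeddings come from the embedding worker and have far more dimensions:

```python
import math

def cosine(a, b):
    """Cosine similarity: dot product over the product of magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy "embeddings" standing in for real model output
memories = {
    "enterprise renewal fact": [0.9, 0.1, 0.2],
    "SMB churn fact":          [0.8, 0.3, 0.1],
    "office seating chart":    [0.0, 0.1, 0.9],
}
query_vec = [0.85, 0.2, 0.15]

# Rank memories by similarity to the query, keep top_k = 2
top_k = sorted(memories, key=lambda m: cosine(query_vec, memories[m]),
               reverse=True)[:2]
print(top_k)  # the two renewal/churn memories outrank the unrelated one
```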


get_agent_state

Retrieve a named state value for the current agent.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| key | string | Yes | State key name |

Returns: State key-value pair with timestamp.

Example:

// Input
{"key": "current_task"}

// Output
{
"key": "current_task",
"value": "Analyzing Q4 revenue data for the board presentation",
"updated_at": "2025-12-15T14:20:00Z"
}

Agent state is stored in the _hatidata_agent_state table, scoped by agent_id.


set_agent_state

Set or update a named state value for the current agent.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| key | string | Yes | State key name |
| value | string | Yes | State value |

Returns: Confirmation.

Example:

// Input
{
"key": "current_task",
"value": "Generating the final report"
}

// Output
{"status": "ok", "key": "current_task"}

delete_memory

Delete a specific memory entry by ID. This operation is irreversible -- it removes both the metadata record and the vector embedding.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| memory_id | string | Yes | UUID of the memory to delete |

Returns: Deletion confirmation.

Example:

// Input
{"memory_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"}

// Output
{"status": "deleted", "memory_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"}

Chain-of-Thought Tools (3)

Tools for recording and replaying agent reasoning traces. See Chain-of-Thought Ledger for architecture details.


log_reasoning_step

Log a single step in an agent's reasoning chain. Steps are hash-chained (SHA-256) per session for tamper detection.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| session_id | string | Yes | Session identifier for the reasoning chain |
| step_type | string | Yes | One of: observation, thought, action, result, reflection, planning, tool_call, tool_result, decision, error, correction, summary |
| content | string | Yes | Content of the reasoning step |
| metadata | object | No | Additional context (tool names, confidence scores, etc.) |
| importance | float | No | Step importance for embedding sampling (default: 0.5) |

Returns: Step confirmation with hash chain data.

Example:

// Input
{
"session_id": "analysis-2025-12-15",
"step_type": "thought",
"content": "Revenue is up 15% but concentrated in enterprise segment. Need to check SMB trends.",
"metadata": {"confidence": 0.85},
"importance": 0.7
}

// Output
{
"trace_id": "f1e2d3c4-...",
"step_number": 3,
"hash": "a7b8c9d0e1f2...",
"prev_hash": "1234abcd5678...",
"session_id": "analysis-2025-12-15"
}

Steps with step_type of decision, error, or correction are always embedded for semantic search regardless of the sampling rate.


replay_decision

Replay the full reasoning chain for a specific session, including hash verification.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| session_id | string | Yes | Session identifier to replay |
| verify_chain | boolean | No | Verify SHA-256 hash chain integrity (default: true) |

Returns: Array of reasoning steps with chain verification status.

Example:

// Input
{"session_id": "analysis-2025-12-15", "verify_chain": true}

// Output
{
"session_id": "analysis-2025-12-15",
"chain_valid": true,
"step_count": 8,
"steps": [
{
"step_number": 0,
"step_type": "observation",
"content": "User asked about Q4 revenue trends",
"hash": "1234abcd...",
"prev_hash": null,
"timestamp": "2025-12-15T10:00:00Z"
},
{
"step_number": 1,
"step_type": "planning",
"content": "Plan: 1) Query revenue by segment 2) Compare Q3 vs Q4 3) Identify drivers",
"hash": "5678efgh...",
"prev_hash": "1234abcd...",
"timestamp": "2025-12-15T10:00:01Z"
}
]
}

If chain_valid is false, the hash chain has been tampered with. The response will indicate which step has an invalid hash.
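
The exact hash construction is internal to HatiData, but the verification idea can be sketched as follows: each step's hash covers its content plus the previous step's hash, so altering any earlier step invalidates every hash after it. The concatenation scheme here is an assumption for illustration:

```python
import hashlib

def step_hash(prev_hash, content):
    """Hash a step's content chained to its predecessor's hash."""
    data = (prev_hash or "") + content
    return hashlib.sha256(data.encode()).hexdigest()

def verify_chain(steps):
    """steps: list of dicts with content, hash, prev_hash.
    Returns (valid, index_of_first_invalid_step)."""
    prev = None
    for i, step in enumerate(steps):
        if step["prev_hash"] != prev or step["hash"] != step_hash(prev, step["content"]):
            return False, i
        prev = step["hash"]
    return True, None

# Build a valid 2-step chain, then tamper with step 0
h0 = step_hash(None, "User asked about Q4 revenue trends")
h1 = step_hash(h0, "Plan: query revenue by segment")
chain = [
    {"content": "User asked about Q4 revenue trends", "hash": h0, "prev_hash": None},
    {"content": "Plan: query revenue by segment", "hash": h1, "prev_hash": h0},
]
print(verify_chain(chain))  # (True, None)
chain[0]["content"] = "tampered"
print(verify_chain(chain))  # (False, 0)
```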


get_session_history

List reasoning sessions for the current agent, with optional filtering.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| limit | integer | No | Maximum sessions to return (default: 20) |
| agent_id | string | No | Filter by agent (default: current agent) |
| since | string | No | ISO 8601 timestamp to filter sessions after |

Returns: Array of session summaries.

Example:

// Input
{"limit": 5}

// Output
[
{
"session_id": "analysis-2025-12-15",
"step_count": 8,
"first_step_at": "2025-12-15T10:00:00Z",
"last_step_at": "2025-12-15T10:05:30Z",
"step_types": ["observation", "planning", "tool_call", "tool_result", "thought", "decision", "summary"]
},
{
"session_id": "report-2025-12-14",
"step_count": 12,
"first_step_at": "2025-12-14T15:30:00Z",
"last_step_at": "2025-12-14T15:42:00Z",
"step_types": ["observation", "thought", "action", "result", "reflection", "summary"]
}
]

Trigger Tools (4)

Tools for managing semantic triggers -- event-driven automations that fire when content matches a concept. See Semantic Triggers for architecture details.


register_trigger

Register a new semantic trigger that fires when content exceeds a similarity threshold to a concept.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| name | string | Yes | Human-readable trigger name |
| concept | string | Yes | Natural language concept to match against |
| threshold | float | No | Cosine similarity threshold (default: 0.8) |
| action | object | Yes | Action to take when triggered |
| cooldown_seconds | integer | No | Minimum seconds between firings (default: 60) |

Action types:

| Action Type | Fields | Description |
|-------------|--------|-------------|
| webhook | url, headers | Send HMAC-SHA256 signed POST to URL |
| agent_notify | agent_id, message | Add to agent's offline notification inbox |
| write_event | event_type, payload | Write a structured event to the audit log |
| flag_for_review | reason, priority | Flag content for human review in the dashboard |

Returns: Trigger registration confirmation.

Example:

// Input
{
"name": "fraud-pattern-detector",
"concept": "unusual transaction pattern indicative of fraud or money laundering",
"threshold": 0.85,
"action": {
"type": "webhook",
"url": "https://alerts.example.com/fraud",
"headers": {"X-Source": "hatidata-trigger"}
},
"cooldown_seconds": 300
}

// Output
{
"trigger_id": "trig_a1b2c3d4",
"name": "fraud-pattern-detector",
"status": "active"
}

list_triggers

List all registered triggers for the current organization.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| status | string | No | Filter by status: "active", "paused", "all" (default: "active") |

Returns: Array of trigger objects.

Example:

// Input
{"status": "active"}

// Output
[
{
"trigger_id": "trig_a1b2c3d4",
"name": "fraud-pattern-detector",
"concept": "unusual transaction pattern indicative of fraud or money laundering",
"threshold": 0.85,
"action": {"type": "webhook", "url": "https://alerts.example.com/fraud"},
"cooldown_seconds": 300,
"status": "active",
"last_fired_at": "2025-12-14T08:30:00Z",
"fire_count": 7
}
]

delete_trigger

Delete a registered trigger.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| trigger_id | string | Yes | ID of the trigger to delete |

Returns: Deletion confirmation.

Example:

// Input
{"trigger_id": "trig_a1b2c3d4"}

// Output
{"status": "deleted", "trigger_id": "trig_a1b2c3d4"}

test_trigger

Test a trigger against sample content without actually firing the action.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| trigger_id | string | Yes | ID of the trigger to test |
| content | string | Yes | Sample content to evaluate |

Returns: Evaluation result with similarity score.

Example:

// Input
{
"trigger_id": "trig_a1b2c3d4",
"content": "Multiple high-value wire transfers to newly created offshore accounts within 24 hours"
}

// Output
{
"trigger_id": "trig_a1b2c3d4",
"would_fire": true,
"similarity_score": 0.92,
"threshold": 0.85,
"action_type": "webhook"
}
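
The would_fire decision combines two gates: the similarity score must clear the threshold, and the cooldown since the last firing must have elapsed. A minimal sketch of that logic (illustrative only, not HatiData's implementation):

```python
import time

def should_fire(similarity, threshold, last_fired_at, cooldown_seconds, now=None):
    """A trigger fires only if similarity clears the threshold AND the
    cooldown window since the last firing has passed."""
    now = time.time() if now is None else now
    if similarity < threshold:
        return False
    if last_fired_at is not None and now - last_fired_at < cooldown_seconds:
        return False
    return True

print(should_fire(0.92, 0.85, last_fired_at=None, cooldown_seconds=300))            # True
print(should_fire(0.92, 0.85, last_fired_at=1000, cooldown_seconds=300, now=1100))  # False: in cooldown
print(should_fire(0.70, 0.85, last_fired_at=None, cooldown_seconds=300))            # False: below threshold
```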

Branch Tools (5)

Tools for creating isolated data branches for speculative agent operations. See State Branching for architecture details.


branch_create

Create a new isolated branch from the main data. Branches use DuckDB schema-based isolation with zero-copy views on creation.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| name | string | No | Human-readable branch name |
| description | string | No | Purpose of the branch |
| ttl_seconds | integer | No | Auto-expire after N seconds (default: 3600) |

Returns: Branch creation confirmation.

Example:

// Input
{
"name": "what-if-pricing-change",
"description": "Simulate the effect of a 10% price increase on enterprise tier",
"ttl_seconds": 7200
}

// Output
{
"branch_id": "br_a1b2c3d4",
"name": "what-if-pricing-change",
"schema": "branch_a1b2c3d4",
"status": "active",
"expires_at": "2025-12-15T12:00:00Z"
}

branch_query

Execute a SQL query within a branch's isolated schema.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| branch_id | string | Yes | Branch to query |
| sql | string | Yes | SQL query to execute |

Returns: Array of row objects (same format as query).

Example:

// Input
{
"branch_id": "br_a1b2c3d4",
"sql": "UPDATE products SET price = price * 1.10 WHERE tier = 'enterprise'; SELECT tier, AVG(price) as avg_price FROM products GROUP BY tier;"
}

// Output
[
{"tier": "enterprise", "avg_price": 1100.00},
{"tier": "growth", "avg_price": 500.00},
{"tier": "free", "avg_price": 0.00}
]

Writes within a branch are copy-on-write -- the first write to a table materializes a copy from main. Subsequent queries see the modified data without affecting the main schema.
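
The copy-on-write behavior can be modeled in a few lines. This is a toy in-memory sketch of the semantics, not HatiData's DuckDB schema implementation:

```python
class Branch:
    """Toy model of copy-on-write branching: reads fall through to main
    until the first write materializes a private copy of the table."""
    def __init__(self, main):
        self.main = main          # shared data, treated as read-only
        self.materialized = {}    # per-branch table copies

    def read(self, table):
        if table in self.materialized:
            return self.materialized[table]
        return self.main[table]   # zero-copy: nothing duplicated yet

    def write(self, table, rows):
        if table not in self.materialized:
            # First write: copy the table from main into the branch
            self.materialized[table] = list(self.main[table])
        self.materialized[table].extend(rows)

main = {"products": [{"sku": "A", "price": 1000}]}
br = Branch(main)
br.write("products", [{"sku": "B", "price": 500}])
print(len(br.read("products")))  # 2: branch sees its modified copy
print(len(main["products"]))     # 1: main is untouched
```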


branch_merge

Merge a branch's changes back into the main schema.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| branch_id | string | Yes | Branch to merge |
| strategy | string | No | Conflict resolution: "branch_wins", "main_wins", "abort" (default: "branch_wins") |

Returns: Merge result with conflict details.

Example:

// Input
{
"branch_id": "br_a1b2c3d4",
"strategy": "branch_wins"
}

// Output
{
"status": "merged",
"branch_id": "br_a1b2c3d4",
"tables_merged": ["products"],
"conflicts": 0
}

If conflicts are detected and the strategy is "abort", no changes are applied and the conflicts are returned for review.
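
The three strategies can be sketched over keyed rows. This toy version treats any row that differs between main and branch as a conflict, a simplification of real three-way merging, and the strategy semantics are inferred from their names:

```python
def merge_tables(main_rows, branch_rows, strategy):
    """Merge keyed rows under a conflict-resolution strategy.
    Returns (merged_rows_or_None, list_of_conflicting_keys)."""
    conflicts = [k for k in main_rows
                 if k in branch_rows and main_rows[k] != branch_rows[k]]
    if conflicts and strategy == "abort":
        return None, conflicts       # no changes applied
    merged = dict(main_rows)
    for k, v in branch_rows.items():
        if k in conflicts and strategy == "main_wins":
            continue                 # keep main's version
        merged[k] = v                # branch_wins, plus non-conflicting rows
    return merged, conflicts

main   = {"A": 1000, "B": 200}
branch = {"A": 1100, "B": 200}
print(merge_tables(main, branch, "branch_wins"))  # A takes the branch value
print(merge_tables(main, branch, "abort"))        # aborted, conflict on A
```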


branch_discard

Discard a branch and clean up its isolated schema.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| branch_id | string | Yes | Branch to discard |

Returns: Discard confirmation.

Example:

// Input
{"branch_id": "br_a1b2c3d4"}

// Output
{"status": "discarded", "branch_id": "br_a1b2c3d4"}

branch_list

List all active branches for the current agent/organization.

Parameters:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| status | string | No | Filter: "active", "merged", "discarded", "all" (default: "active") |

Returns: Array of branch summaries.

Example:

// Input
{"status": "active"}

// Output
[
{
"branch_id": "br_a1b2c3d4",
"name": "what-if-pricing-change",
"description": "Simulate the effect of a 10% price increase on enterprise tier",
"status": "active",
"materialized_tables": ["products"],
"created_at": "2025-12-15T10:00:00Z",
"expires_at": "2025-12-15T12:00:00Z"
}
]

Error Handling

All tools return errors in a consistent format:

{
"error": {
"code": "POLICY_VIOLATION",
"message": "Agent 'analyst' is not permitted to execute DDL statements",
"request_id": "req_a1b2c3d4"
}
}

Common error codes:

| Code | Description |
|------|-------------|
| QUERY_ERROR | SQL syntax error or execution failure |
| POLICY_VIOLATION | ABAC policy blocked the operation |
| QUOTA_EXCEEDED | Agent has exceeded its credit quota |
| NOT_FOUND | Requested resource (memory, branch, trigger) does not exist |
| BRANCH_EXPIRED | Branch TTL has expired |
| CHAIN_INVALID | CoT hash chain verification failed |
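
Client code can branch on the error envelope and code. The handler below is an illustrative sketch; which codes an agent treats as recoverable is a design choice, not HatiData guidance:

```python
def handle_tool_result(result):
    """Dispatch on the consistent error envelope described above."""
    if "error" not in result:
        return result                      # success: pass rows through
    err = result["error"]
    if err["code"] == "QUOTA_EXCEEDED":
        # Hard stop: no point retrying without more credits
        raise RuntimeError(f"Out of credits (request {err['request_id']})")
    if err["code"] in ("QUERY_ERROR", "NOT_FOUND"):
        # Often recoverable by the agent rephrasing its request
        return {"retryable": True, "message": err["message"]}
    # POLICY_VIOLATION, BRANCH_EXPIRED, CHAIN_INVALID: surface as-is
    return {"retryable": False, "message": err["message"]}

resp = {"error": {"code": "POLICY_VIOLATION",
                  "message": "Agent 'analyst' is not permitted to execute DDL statements",
                  "request_id": "req_a1b2c3d4"}}
print(handle_tool_result(resp))  # not retryable; message surfaced to caller
```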
