MCP Tools Reference
HatiData's MCP server exposes 24 tools organized into five categories: SQL, Memory, Chain-of-Thought, Triggers, and Branches. Every tool call goes through HatiData's full security pipeline (authentication, ABAC policy evaluation, audit logging, metering).
For MCP server setup and configuration, see MCP Server.
SQL Tools (7)
Tools for querying the data warehouse and exploring schemas.
query
Execute a SQL query against the warehouse and return results as JSON.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| sql | string | Yes | SQL query to execute |
Returns: Array of row objects.
Example:
// Input
{
"sql": "SELECT customer_id, SUM(total) as revenue FROM orders GROUP BY 1 ORDER BY 2 DESC LIMIT 5"
}
// Output
[
{"customer_id": 42, "revenue": 125000.00},
{"customer_id": 17, "revenue": 98500.00},
{"customer_id": 8, "revenue": 87200.00},
{"customer_id": 103, "revenue": 76400.00},
{"customer_id": 55, "revenue": 71100.00}
]
Supports both standard SQL and legacy warehouse SQL syntax (NVL, IFF, DATEDIFF, etc.), which is auto-transpiled. Queries pass through the full 13-step pipeline, including policy evaluation, cost estimation, quota checks, row-level security, transpilation, DuckDB execution, AI healing on failure, column masking, and audit logging.
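As a rough illustration of what auto-transpilation means, NVL(a, b) becomes COALESCE(a, b) and IFF(cond, a, b) becomes a CASE expression. The toy rewrite below is illustrative only; HatiData's actual transpiler parses the full SQL grammar rather than applying regexes:

```python
import re

def transpile_legacy(sql: str) -> str:
    """Toy rewrite of two legacy functions into standard SQL.

    Illustrative only -- the real transpiler parses the statement.
    """
    # NVL(a, b) -> COALESCE(a, b)
    sql = re.sub(r"\bNVL\s*\(", "COALESCE(", sql, flags=re.IGNORECASE)

    # IFF(cond, a, b) -> CASE WHEN cond THEN a ELSE b END
    def iff(match: re.Match) -> str:
        cond, a, b = [p.strip() for p in match.group(1).split(",")]
        return f"CASE WHEN {cond} THEN {a} ELSE {b} END"

    return re.sub(r"\bIFF\s*\(([^()]*)\)", iff, sql, flags=re.IGNORECASE)

print(transpile_legacy("SELECT NVL(total, 0), IFF(status = 'paid', 1, 0) FROM orders"))
# SELECT COALESCE(total, 0), CASE WHEN status = 'paid' THEN 1 ELSE 0 END FROM orders
```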
list_tables
List all tables the authenticated agent has permission to access.
Parameters: None.
Returns: Array of table name strings.
Example:
// Input
{}
// Output
["customers", "orders", "products", "events", "knowledge_base"]
Results are filtered by ABAC policies -- agents only see tables they are authorized to access.
describe_table
Get column names, data types, and nullability for a specific table.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| table_name | string | Yes | Name of the table to describe |
Returns: Array of column descriptor objects.
Example:
// Input
{"table_name": "orders"}
// Output
[
{"column_name": "id", "data_type": "INTEGER", "is_nullable": "NO"},
{"column_name": "customer_id", "data_type": "INTEGER", "is_nullable": "NO"},
{"column_name": "total", "data_type": "DECIMAL(10,2)", "is_nullable": "YES"},
{"column_name": "status", "data_type": "VARCHAR", "is_nullable": "YES"},
{"column_name": "created_at", "data_type": "TIMESTAMP", "is_nullable": "YES"}
]
Queries information_schema.columns under the hood. Columns masked by policy are excluded from the result.
list_schemas
List all schemas in the database.
Parameters: None.
Returns: Array of schema name strings.
Example:
// Input
{}
// Output
["main", "staging", "analytics"]
read_query
Execute a read-only SQL query against the warehouse. Identical to query but enforces that the statement is a SELECT (no INSERT, UPDATE, DELETE, or DDL). Useful for agents that should only have read access.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| sql | string | Yes | SELECT query to execute |
Returns: Array of row objects.
Example:
// Input
{"sql": "SELECT segment, COUNT(*) as customer_count FROM customers GROUP BY segment"}
// Output
[
{"segment": "enterprise", "customer_count": 142},
{"segment": "mid_market", "customer_count": 891},
{"segment": "smb", "customer_count": 3204}
]
Non-SELECT statements are rejected with a POLICY_VIOLATION error. This tool is intended for agents configured with read-only access.
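The shape of such a read-only guard can be sketched in a few lines. This is a client-side approximation for intuition only; the server-side check is likely stricter (e.g. based on the parsed statement rather than a string prefix):

```python
# WITH ... SELECT CTEs are read-only as well, hence the second prefix.
READ_ONLY_PREFIXES = ("SELECT", "WITH")

def assert_read_only(sql: str) -> None:
    """Reject anything that is not a single read-only statement.

    Sketch only -- HatiData's actual enforcement is not string-based.
    """
    statements = [s.strip() for s in sql.strip().rstrip(";").split(";") if s.strip()]
    if len(statements) != 1:
        raise ValueError("POLICY_VIOLATION: exactly one statement allowed")
    if not statements[0].upper().startswith(READ_ONLY_PREFIXES):
        raise ValueError("POLICY_VIOLATION: only SELECT statements are permitted")

assert_read_only("SELECT 1")  # passes silently
try:
    assert_read_only("DELETE FROM orders")
except ValueError as e:
    print(e)  # POLICY_VIOLATION: only SELECT statements are permitted
```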
get_usage_stats
Get query usage statistics and credit consumption for the current agent.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| period | string | No | Time period: "today", "week", "month" (default: "month") |
Returns: Usage summary object.
Example:
// Input
{"period": "month"}
// Output
{
"agent_id": "analyst-agent",
"period": "2025-12",
"query_count": 1247,
"credits_used": 3891.5,
"credits_remaining": 96108.5,
"avg_duration_ms": 38,
"tables_accessed": ["customers", "orders", "products", "events"]
}
Returns usage metrics scoped to the authenticated agent. Organization-level usage requires audit:read scope.
run_sql_arrow
Execute a read-only SQL query and return results as base64-encoded Apache Arrow IPC. Designed for efficient data transfer to pandas, polars, and other columnar data frameworks. Requires QueryRead scope.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| sql | string | Yes | SELECT query to execute |
| max_rows | integer | No | Maximum rows to return (default: 10000) |
Returns: Object containing base64-encoded Arrow IPC stream data.
Example:
// Input
{
"sql": "SELECT customer_id, SUM(total) as revenue, COUNT(*) as order_count FROM orders GROUP BY 1 ORDER BY 2 DESC LIMIT 1000"
}
// Output
{
"format": "arrow_ipc_base64",
"data": "QVJST1cxAAAAAAoAAAAAAAD...",
"rows": 1000,
"execution_time_ms": 12
}
The data field contains a base64-encoded Apache Arrow IPC stream. Decode and load directly into pandas or polars:
import base64
import io
import pyarrow.ipc as ipc
import polars as pl
# Decode the base64-encoded Arrow IPC response
ipc_bytes = base64.b64decode(result["data"])
reader = ipc.open_stream(io.BytesIO(ipc_bytes))
table = reader.read_all()
# Convert to pandas or polars
df_pandas = table.to_pandas()
df_polars = pl.from_arrow(table)
Arrow IPC preserves full type fidelity (integers, floats, timestamps, booleans, nulls) without the string parsing overhead of JSON results. This makes run_sql_arrow significantly faster than query for large result sets consumed by data analysis code.
For the HTTP endpoint equivalent, see Arrow Queries.
Memory Tools (5)
Tools for storing, searching, and managing long-term agent memory. See Agent Memory for architecture details.
store_memory
Store a new memory entry with optional metadata and importance scoring.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| content | string | Yes | The memory content (natural language or structured data) |
| memory_type | string | No | Category: "fact", "episode", "preference", "context" (default: "fact") |
| metadata | object | No | Arbitrary key-value metadata |
| importance | float | No | Importance score from 0.0 to 1.0 (default: 0.5) |
Returns: Memory creation confirmation with ID.
Example:
// Input
{
"content": "Enterprise customers have a 95% renewal rate in Q4",
"memory_type": "fact",
"metadata": {
"source": "quarterly-review",
"quarter": "Q4-2025"
},
"importance": 0.9
}
// Output
{
"memory_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"status": "stored",
"has_embedding": false
}
The has_embedding field becomes true once the async embedding worker processes the entry (typically within seconds).
search_memory
Search memories using natural language or keyword queries with optional filters.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Natural language search query |
| top_k | integer | No | Maximum results to return (default: 10) |
| memory_type | string | No | Filter by memory type |
| min_importance | float | No | Minimum importance score filter |
| agent_id | string | No | Filter by agent (default: current agent) |
Returns: Array of memory objects ranked by similarity.
Example:
// Input
{
"query": "customer renewal rates",
"top_k": 5,
"memory_type": "fact",
"min_importance": 0.5
}
// Output
[
{
"memory_id": "a1b2c3d4-...",
"content": "Enterprise customers have a 95% renewal rate in Q4",
"memory_type": "fact",
"importance": 0.9,
"similarity_score": 0.94,
"metadata": {"source": "quarterly-review", "quarter": "Q4-2025"},
"created_at": "2025-12-15T10:30:00Z"
},
{
"memory_id": "b2c3d4e5-...",
"content": "SMB segment renewal rate dropped to 72% in Q3",
"memory_type": "fact",
"importance": 0.8,
"similarity_score": 0.87,
"metadata": {"source": "churn-analysis"},
"created_at": "2025-10-01T14:20:00Z"
}
]
When vector search is available, similarity_score is cosine similarity. In fallback mode, it is a heuristic relevance score.
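Cosine similarity measures the angle between the query embedding and each stored embedding, independent of vector magnitude. A minimal reference implementation:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(round(cosine_similarity([1.0, 0.0, 1.0], [0.5, 0.5, 1.0]), 4))  # 0.866
```

In production the vectors are high-dimensional model embeddings, but the scoring formula is the same.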
get_agent_state
Retrieve a named state value for the current agent.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| key | string | Yes | State key name |
Returns: State key-value pair with timestamp.
Example:
// Input
{"key": "current_task"}
// Output
{
"key": "current_task",
"value": "Analyzing Q4 revenue data for the board presentation",
"updated_at": "2025-12-15T14:20:00Z"
}
Agent state is stored in the _hatidata_agent_state table, scoped by agent_id.
set_agent_state
Set or update a named state value for the current agent.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| key | string | Yes | State key name |
| value | string | Yes | State value |
Returns: Confirmation.
Example:
// Input
{
"key": "current_task",
"value": "Generating the final report"
}
// Output
{"status": "ok", "key": "current_task"}
delete_memory
Delete a specific memory entry by ID. This operation is irreversible -- it removes both the metadata record and the vector embedding.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| memory_id | string | Yes | UUID of the memory to delete |
Returns: Deletion confirmation.
Example:
// Input
{"memory_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"}
// Output
{"status": "deleted", "memory_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"}
Chain-of-Thought Tools (3)
Tools for recording and replaying agent reasoning traces. See Chain-of-Thought Ledger for architecture details.
log_reasoning_step
Log a single step in an agent's reasoning chain. Steps are hash-chained (SHA-256) per session for tamper detection.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| session_id | string | Yes | Session identifier for the reasoning chain |
| step_type | string | Yes | One of: observation, thought, action, result, reflection, planning, tool_call, tool_result, decision, error, correction, summary |
| content | string | Yes | Content of the reasoning step |
| metadata | object | No | Additional context (tool names, confidence scores, etc.) |
| importance | float | No | Step importance for embedding sampling (default: 0.5) |
Returns: Step confirmation with hash chain data.
Example:
// Input
{
"session_id": "analysis-2025-12-15",
"step_type": "thought",
"content": "Revenue is up 15% but concentrated in enterprise segment. Need to check SMB trends.",
"metadata": {"confidence": 0.85},
"importance": 0.7
}
// Output
{
"trace_id": "f1e2d3c4-...",
"step_number": 3,
"hash": "a7b8c9d0e1f2...",
"prev_hash": "1234abcd5678...",
"session_id": "analysis-2025-12-15"
}
Steps with step_type of decision, error, or correction are always embedded for semantic search regardless of the sampling rate.
replay_decision
Replay the full reasoning chain for a specific session, including hash verification.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| session_id | string | Yes | Session identifier to replay |
| verify_chain | boolean | No | Verify SHA-256 hash chain integrity (default: true) |
Returns: Array of reasoning steps with chain verification status.
Example:
// Input
{"session_id": "analysis-2025-12-15", "verify_chain": true}
// Output
{
"session_id": "analysis-2025-12-15",
"chain_valid": true,
"step_count": 8,
"steps": [
{
"step_number": 0,
"step_type": "observation",
"content": "User asked about Q4 revenue trends",
"hash": "1234abcd...",
"prev_hash": null,
"timestamp": "2025-12-15T10:00:00Z"
},
{
"step_number": 1,
"step_type": "planning",
"content": "Plan: 1) Query revenue by segment 2) Compare Q3 vs Q4 3) Identify drivers",
"hash": "5678efgh...",
"prev_hash": "1234abcd...",
"timestamp": "2025-12-15T10:00:01Z"
}
]
}
If chain_valid is false, the hash chain has been tampered with. The response will indicate which step has an invalid hash.
get_session_history
List reasoning sessions for the current agent, with optional filtering.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| limit | integer | No | Maximum sessions to return (default: 20) |
| agent_id | string | No | Filter by agent (default: current agent) |
| since | string | No | ISO 8601 timestamp to filter sessions after |
Returns: Array of session summaries.
Example:
// Input
{"limit": 5}
// Output
[
{
"session_id": "analysis-2025-12-15",
"step_count": 8,
"first_step_at": "2025-12-15T10:00:00Z",
"last_step_at": "2025-12-15T10:05:30Z",
"step_types": ["observation", "planning", "tool_call", "tool_result", "thought", "decision", "summary"]
},
{
"session_id": "report-2025-12-14",
"step_count": 12,
"first_step_at": "2025-12-14T15:30:00Z",
"last_step_at": "2025-12-14T15:42:00Z",
"step_types": ["observation", "thought", "action", "result", "reflection", "summary"]
}
]
Trigger Tools (4)
Tools for managing semantic triggers -- event-driven automations that fire when content matches a concept. See Semantic Triggers for architecture details.
register_trigger
Register a new semantic trigger that fires when content exceeds a similarity threshold to a concept.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Human-readable trigger name |
| concept | string | Yes | Natural language concept to match against |
| threshold | float | No | Cosine similarity threshold (default: 0.8) |
| action | object | Yes | Action to take when triggered |
| cooldown_seconds | integer | No | Minimum seconds between firings (default: 60) |
Action types:
| Action Type | Fields | Description |
|---|---|---|
| webhook | url, headers | Send HMAC-SHA256 signed POST to URL |
| agent_notify | agent_id, message | Add to agent's offline notification inbox |
| write_event | event_type, payload | Write a structured event to the audit log |
| flag_for_review | reason, priority | Flag content for human review in the dashboard |
Returns: Trigger registration confirmation.
Example:
// Input
{
"name": "fraud-pattern-detector",
"concept": "unusual transaction pattern indicative of fraud or money laundering",
"threshold": 0.85,
"action": {
"type": "webhook",
"url": "https://alerts.example.com/fraud",
"headers": {"X-Source": "hatidata-trigger"}
},
"cooldown_seconds": 300
}
// Output
{
"trigger_id": "trig_a1b2c3d4",
"name": "fraud-pattern-detector",
"status": "active"
}
list_triggers
List all registered triggers for the current organization.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| status | string | No | Filter by status: "active", "paused", "all" (default: "active") |
Returns: Array of trigger objects.
Example:
// Input
{"status": "active"}
// Output
[
{
"trigger_id": "trig_a1b2c3d4",
"name": "fraud-pattern-detector",
"concept": "unusual transaction pattern indicative of fraud or money laundering",
"threshold": 0.85,
"action": {"type": "webhook", "url": "https://alerts.example.com/fraud"},
"cooldown_seconds": 300,
"status": "active",
"last_fired_at": "2025-12-14T08:30:00Z",
"fire_count": 7
}
]
delete_trigger
Delete a registered trigger.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| trigger_id | string | Yes | ID of the trigger to delete |
Returns: Deletion confirmation.
Example:
// Input
{"trigger_id": "trig_a1b2c3d4"}
// Output
{"status": "deleted", "trigger_id": "trig_a1b2c3d4"}
test_trigger
Test a trigger against sample content without actually firing the action.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| trigger_id | string | Yes | ID of the trigger to test |
| content | string | Yes | Sample content to evaluate |
Returns: Evaluation result with similarity score.
Example:
// Input
{
"trigger_id": "trig_a1b2c3d4",
"content": "Multiple high-value wire transfers to newly created offshore accounts within 24 hours"
}
// Output
{
"trigger_id": "trig_a1b2c3d4",
"would_fire": true,
"similarity_score": 0.92,
"threshold": 0.85,
"action_type": "webhook"
}
Branch Tools (5)
Tools for creating isolated data branches for speculative agent operations. See State Branching for architecture details.
branch_create
Create a new isolated branch from the main schema. Branches use DuckDB schema-based isolation with zero-copy views on creation.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| name | string | No | Human-readable branch name |
| description | string | No | Purpose of the branch |
| ttl_seconds | integer | No | Auto-expire after N seconds (default: 3600) |
Returns: Branch creation confirmation.
Example:
// Input
{
"name": "what-if-pricing-change",
"description": "Simulate the effect of a 10% price increase on enterprise tier",
"ttl_seconds": 7200
}
// Output
{
"branch_id": "br_a1b2c3d4",
"name": "what-if-pricing-change",
"schema": "branch_a1b2c3d4",
"status": "active",
"expires_at": "2025-12-15T12:00:00Z"
}
branch_query
Execute a SQL query within a branch's isolated schema.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| branch_id | string | Yes | Branch to query |
| sql | string | Yes | SQL query to execute |
Returns: Array of row objects (same format as query).
Example:
// Input
{
"branch_id": "br_a1b2c3d4",
"sql": "UPDATE products SET price = price * 1.10 WHERE tier = 'enterprise'; SELECT tier, AVG(price) as avg_price FROM products GROUP BY tier;"
}
// Output
[
{"tier": "enterprise", "avg_price": 1100.00},
{"tier": "growth", "avg_price": 500.00},
{"tier": "free", "avg_price": 0.00}
]
Writes within a branch are copy-on-write -- the first write to a table materializes a copy from main. Subsequent queries see the modified data without affecting the main schema.
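The copy-on-write behavior can be modeled with a toy in-memory version: reads fall through to main until the first write materializes a private copy of that table. This is a conceptual model only, not how the DuckDB schema isolation is implemented:

```python
class Branch:
    """Toy copy-on-write branch over a dict of table -> rows."""

    def __init__(self, main: dict):
        self.main = main   # shared baseline, never mutated through the branch
        self.local = {}    # tables materialized by a write

    def read(self, table: str):
        # Reads see the branch's copy if one exists, else fall through to main.
        return self.local.get(table, self.main[table])

    def write(self, table: str, rows):
        if table not in self.local:
            self.local[table] = list(self.main[table])  # materialize on first write
        self.local[table] = rows

main = {"products": [("enterprise", 1000.0)]}
br = Branch(main)
br.write("products", [("enterprise", 1100.0)])
print(br.read("products"))   # [('enterprise', 1100.0)]
print(main["products"])      # [('enterprise', 1000.0)] -- main is untouched
```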
branch_merge
Merge a branch's changes back into the main schema.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| branch_id | string | Yes | Branch to merge |
| strategy | string | No | Conflict resolution: "branch_wins", "main_wins", "abort" (default: "branch_wins") |
Returns: Merge result with conflict details.
Example:
// Input
{
"branch_id": "br_a1b2c3d4",
"strategy": "branch_wins"
}
// Output
{
"status": "merged",
"branch_id": "br_a1b2c3d4",
"tables_merged": ["products"],
"conflicts": 0
}
If conflicts are detected and the strategy is "abort", no changes are applied and the conflicts are returned for review.
branch_discard
Discard a branch and clean up its isolated schema.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| branch_id | string | Yes | Branch to discard |
Returns: Discard confirmation.
Example:
// Input
{"branch_id": "br_a1b2c3d4"}
// Output
{"status": "discarded", "branch_id": "br_a1b2c3d4"}
branch_list
List all active branches for the current agent/organization.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| status | string | No | Filter: "active", "merged", "discarded", "all" (default: "active") |
Returns: Array of branch summaries.
Example:
// Input
{"status": "active"}
// Output
[
{
"branch_id": "br_a1b2c3d4",
"name": "what-if-pricing-change",
"description": "Simulate the effect of a 10% price increase on enterprise tier",
"status": "active",
"materialized_tables": ["products"],
"created_at": "2025-12-15T10:00:00Z",
"expires_at": "2025-12-15T12:00:00Z"
}
]
Error Handling
All tools return errors in a consistent format:
{
"error": {
"code": "POLICY_VIOLATION",
"message": "Agent 'analyst' is not permitted to execute DDL statements",
"request_id": "req_a1b2c3d4"
}
}
Common error codes:
| Code | Description |
|---|---|
| QUERY_ERROR | SQL syntax error or execution failure |
| POLICY_VIOLATION | ABAC policy blocked the operation |
| QUOTA_EXCEEDED | Agent has exceeded its credit quota |
| NOT_FOUND | Requested resource (memory, branch, trigger) does not exist |
| BRANCH_EXPIRED | Branch TTL has expired |
| CHAIN_INVALID | CoT hash chain verification failed |
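Because the error envelope is consistent across tools, a client can detect and surface failures with one helper. The sketch below is a client-side convenience, not part of any HatiData SDK:

```python
class HatiDataToolError(Exception):
    """Raised when a tool response carries the documented error envelope."""

    def __init__(self, code: str, message: str, request_id: str):
        super().__init__(f"{code}: {message} (request {request_id})")
        self.code = code
        self.request_id = request_id

def raise_for_error(response):
    """Pass successful responses through; raise on the error envelope.

    Successful results may be arrays (query) or objects (store_memory),
    hence the isinstance check before looking for the "error" key.
    """
    if isinstance(response, dict) and "error" in response:
        err = response["error"]
        raise HatiDataToolError(err["code"], err["message"], err["request_id"])
    return response

try:
    raise_for_error({"error": {
        "code": "QUOTA_EXCEEDED",
        "message": "Agent has exceeded its credit quota",
        "request_id": "req_a1b2c3d4",
    }})
except HatiDataToolError as e:
    print(e.code)  # QUOTA_EXCEEDED
```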
Next Steps
- MCP Server -- Installation and configuration
- Agent Memory -- Memory architecture
- Chain-of-Thought Ledger -- Reasoning trace architecture
- Semantic Triggers -- Trigger evaluation pipeline
- State Branching -- Branch isolation and merging