Chain-of-Thought Ledger

The Chain-of-Thought (CoT) Ledger records every reasoning step an agent takes in an immutable, hash-chained format — creating a tamper-evident audit trail that can be replayed, verified, and analyzed months later.

Why Agents Need Reasoning Traces

When AI agents make decisions that affect business operations, you need answers:

What did the agent reason about? — Full trace of observations, hypotheses, tool calls, and conclusions
Can the trace be trusted? — Cryptographic hash chains prove no steps were inserted, deleted, or modified
What went wrong? — Replay a session to understand failures or unexpected decisions
Is the agent improving? — Compare reasoning patterns across sessions

How It Works

Agent → log_reasoning_step → CoT Ledger
                               ├── Append to _hatidata_cot table
                               ├── Update per-session hash chain
                               └── Conditionally dispatch for embedding

Analyst → replay_decision → CoT Ledger
                              ├── Load session trace
                              └── Verify hash chain integrity

Hash Chaining

Each step's hash is computed from its content and the previous step's hash, forming a cryptographic hash chain. Modifying any step invalidates all subsequent hashes, making tampering detectable.
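The chaining rule can be illustrated with a minimal sketch. The ledger's actual hash function and serialization are not specified here, so SHA-256, canonical JSON, and the all-zero `GENESIS_HASH` below are assumptions made for illustration only, as are the helper names `compute_step_hash` and `build_chain`.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumption: placeholder prev_hash for the first step


def compute_step_hash(prev_hash: str, step: dict) -> str:
    """Hash a step's content together with the previous step's hash."""
    # Canonical JSON (sorted keys, fixed separators) so equal steps hash equally.
    payload = json.dumps(step, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def build_chain(steps: list[dict]) -> list[str]:
    """Return the hash of every step, each one linked to its predecessor."""
    hashes = []
    prev = GENESIS_HASH
    for step in steps:
        prev = compute_step_hash(prev, step)
        hashes.append(prev)
    return hashes


steps = [
    {"step_index": 0, "step_type": "Observation", "content": "User asked for Q4 revenue"},
    {"step_index": 1, "step_type": "ToolCall", "content": "Querying orders table"},
    {"step_index": 2, "step_type": "FinalAnswer", "content": "Enterprise drove the growth"},
]
original = build_chain(steps)

# Tamper with the middle step and rebuild: its hash and every later hash change.
tampered_steps = [dict(s) for s in steps]
tampered_steps[1]["content"] = "Querying customers table"
tampered = build_chain(tampered_steps)

changed = [i for i, (a, b) in enumerate(zip(original, tampered)) if a != b]
print(changed)  # [1, 2]
```

Step 0 keeps its hash because nothing before or inside it changed. Steps 1 and 2 both diverge, which is what makes an edit in the middle of a session visible at every later step.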

Append-Only Enforcement

An append-only enforcer intercepts DML and DDL statements targeting _hatidata_cot tables and allows only inserts:

INSERT — Allowed (append new steps)
UPDATE — Blocked
DELETE — Blocked
TRUNCATE — Blocked
DROP — Blocked

Enforcement happens at the proxy level, before query execution. Even direct SQL connections cannot tamper with the ledger.
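As a rough illustration of the allow/block rule, the sketch below classifies a statement by its leading keyword. This is not the proxy's implementation: `is_allowed` and `BLOCKED_VERBS` are hypothetical names, and a production enforcer would need a real SQL parser to handle CTEs, comments, and multi-statement input, which this keyword check does not.

```python
import re

# Statement types the document lists as blocked for ledger tables.
BLOCKED_VERBS = {"UPDATE", "DELETE", "TRUNCATE", "DROP"}
COT_TABLE = re.compile(r"\b_hatidata_cot\b", re.IGNORECASE)


def is_allowed(sql: str) -> bool:
    """Return False for statements that would mutate or remove ledger rows."""
    if not COT_TABLE.search(sql):
        return True  # the statement does not mention the ledger table
    words = sql.strip().split(None, 1)
    verb = words[0].upper() if words else ""
    return verb not in BLOCKED_VERBS


print(is_allowed("INSERT INTO _hatidata_cot (content) VALUES ('step')"))
print(is_allowed("DELETE FROM _hatidata_cot WHERE step_index = 3"))
```

The check is deliberately conservative: any blocked verb in a statement that mentions the ledger table is rejected, even if the table only appears in a subquery.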

Trace Schema

Each reasoning step is stored with 18 fields:

Field	Type	Description
`trace_id`	UUID	Unique identifier for this trace entry
`org_id`	VARCHAR	Organization identifier
`agent_id`	VARCHAR	Agent that produced this step
`session_id`	VARCHAR	Groups steps into a reasoning session
`step_index`	INTEGER	Ordinal position within the session (0-based)
`step_type`	VARCHAR	One of 12 step type variants
`content`	TEXT	The reasoning content
`input_data`	JSON	Input to this step (tool arguments, observations)
`output_data`	JSON	Output from this step (tool results, decisions)
`confidence`	FLOAT	Agent's self-reported confidence (0.0 to 1.0)
`duration_ms`	BIGINT	How long this step took
`token_count`	INTEGER	Token usage for LLM-based steps
`model`	VARCHAR	Model identifier (e.g., `gpt-4o`, `claude-3.5-sonnet`)
`metadata`	JSON	Arbitrary additional context
`prev_hash`	VARCHAR	Cryptographic hash of the previous step
`current_hash`	VARCHAR	Cryptographic hash of this step
`has_embedding`	BOOLEAN	Whether an embedding exists
`created_at`	TIMESTAMP	When this step was recorded
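For client code that handles trace rows, the schema maps naturally onto a typed record. The dataclass below is a sketch rather than part of the SDK: the `TraceStep` name, the choice of which fields are optional, and the validation rules are assumptions derived from the descriptions in the table.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional
from uuid import UUID, uuid4


@dataclass(frozen=True)
class TraceStep:
    """One ledger row; field names follow the schema table."""

    org_id: str
    agent_id: str
    session_id: str
    step_index: int
    step_type: str
    content: str
    prev_hash: str
    current_hash: str
    trace_id: UUID = field(default_factory=uuid4)
    input_data: Optional[dict[str, Any]] = None
    output_data: Optional[dict[str, Any]] = None
    confidence: Optional[float] = None
    duration_ms: Optional[int] = None
    token_count: Optional[int] = None
    model: Optional[str] = None
    metadata: dict[str, Any] = field(default_factory=dict)
    has_embedding: bool = False
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self) -> None:
        # Checks taken from the field descriptions: 0-based index, confidence in [0, 1].
        if self.step_index < 0:
            raise ValueError("step_index is 0-based and cannot be negative")
        if self.confidence is not None and not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")
```

Marking the record as frozen mirrors the append-only nature of the ledger: once a step is constructed, its fields cannot be reassigned.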

Step Types

Step Type	Description	Always Embedded?
`Observation`	Agent observes data or environment state	No
`Hypothesis`	Agent forms a hypothesis	No
`ToolCall`	Agent invokes an external tool	No
`ToolResult`	Result returned from a tool call	No
`Reasoning`	Internal reasoning or analysis	No
`Decision`	Agent makes a decision	Yes
`Action`	Agent takes an action	Yes
`Error`	An error occurred during reasoning	Yes
`Correction`	Agent corrects a previous step	Yes
`Summary`	Agent summarizes findings	No
`PlanStep`	A step in a multi-step plan	No
`FinalAnswer`	The final output of the reasoning session	Yes

Architecture

Write Path

The write path manages per-session hash chains:

Tracks the latest hash per session_id for chain continuity
On each log_reasoning_step, computes the new hash, appends the row, and updates the chain head
Handles concurrent sessions safely via lock-free operations
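The three responsibilities above can be sketched with an in-memory stand-in for the table. Note the differences from the real write path: the document describes lock-free operations, while this sketch uses a single `threading.Lock` for simplicity, and `SessionChainWriter`, `GENESIS_HASH`, and the SHA-256 hash input format are all illustrative assumptions.

```python
import hashlib
import threading

GENESIS_HASH = "0" * 64  # assumption: prev_hash for the first step of a session


class SessionChainWriter:
    """Toy write path: an in-memory list stands in for the _hatidata_cot table."""

    def __init__(self) -> None:
        # session_id -> (next step_index, latest hash), i.e. the chain head
        self._heads: dict[str, tuple[int, str]] = {}
        self._lock = threading.Lock()
        self.rows: list[dict] = []

    def log_reasoning_step(self, session_id: str, step_type: str, content: str) -> dict:
        with self._lock:
            step_index, prev_hash = self._heads.get(session_id, (0, GENESIS_HASH))
            material = f"{prev_hash}|{step_index}|{step_type}|{content}"
            current_hash = hashlib.sha256(material.encode("utf-8")).hexdigest()
            row = {
                "session_id": session_id,
                "step_index": step_index,
                "step_type": step_type,
                "content": content,
                "prev_hash": prev_hash,
                "current_hash": current_hash,
            }
            self.rows.append(row)  # append the row
            self._heads[session_id] = (step_index + 1, current_hash)  # advance the head
        return row


writer = SessionChainWriter()
a0 = writer.log_reasoning_step("session-a", "Observation", "User asked for Q4 revenue")
b0 = writer.log_reasoning_step("session-b", "Observation", "User asked for churn data")
a1 = writer.log_reasoning_step("session-a", "ToolCall", "Querying orders table")
```

Because heads are keyed by `session_id`, interleaved writes from different sessions do not disturb each other's chains: `a1` links to `a0`, while `b0` starts its own chain from the genesis value.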

Read Path

The read path provides retrieval and verification:

Operation	Description
Replay session	Load all steps for a session in order
Get trace	Load a single trace entry by ID
List traces	List recent sessions for an agent
Verify chain	Verify hash chain integrity for a session
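Chain verification amounts to replaying a session in order and recomputing each hash. The sketch below shows one way this could work; `verify_chain`, `step_hash`, and `make_session` are hypothetical helpers, and SHA-256 over `prev_hash|step_index|content` is an assumed hash input, not the ledger's documented format.

```python
import hashlib
from typing import Optional

GENESIS_HASH = "0" * 64  # assumption: prev_hash for the first step of a session


def step_hash(prev_hash: str, step_index: int, content: str) -> str:
    material = f"{prev_hash}|{step_index}|{content}"
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def verify_chain(steps: list[dict]) -> Optional[int]:
    """Return the position of the first step that fails verification, or None."""
    prev = GENESIS_HASH
    ordered = sorted(steps, key=lambda s: s["step_index"])
    for position, step in enumerate(ordered):
        recomputed = step_hash(prev, step["step_index"], step["content"])
        if (step["step_index"] != position          # gap: a step was removed or inserted
                or step["prev_hash"] != prev         # link to predecessor is broken
                or step["current_hash"] != recomputed):  # content was modified
            return position
        prev = step["current_hash"]
    return None


def make_session(contents: list[str]) -> list[dict]:
    """Build a valid chain for demonstration purposes."""
    steps, prev = [], GENESIS_HASH
    for i, content in enumerate(contents):
        current = step_hash(prev, i, content)
        steps.append({"step_index": i, "content": content,
                      "prev_hash": prev, "current_hash": current})
        prev = current
    return steps


session = make_session(["observe", "query", "answer"])
intact = verify_chain(session)                      # None: chain is valid

edited = [dict(s) for s in session]
edited[1]["content"] = "query something else"
first_bad_after_edit = verify_chain(edited)         # 1: modified step

first_bad_after_delete = verify_chain([session[0], session[2]])  # 1: missing step
```

One limitation of this sketch: removing steps from the end of a session leaves a shorter but internally consistent chain, so detecting that case requires comparing against a separately stored chain head or step count.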

Embedding Sampling

Not every step needs a vector embedding. The system uses configurable sampling:

Sampling rate — Default 10% of steps are embedded
Critical step types — Decision, Action, Error, Correction, and FinalAnswer are always embedded
Purpose — Embedded steps are findable via semantic search (e.g., "find all decisions about pricing")
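The sampling rule can be expressed as a small predicate. This is a sketch under assumptions: the document does not say whether sampling is random or deterministic, so uniform random sampling is used here, and `should_embed` and `ALWAYS_EMBEDDED` are illustrative names.

```python
import random
from typing import Optional

# Critical step types that bypass sampling, as listed above.
ALWAYS_EMBEDDED = frozenset({"Decision", "Action", "Error", "Correction", "FinalAnswer"})


def should_embed(
    step_type: str,
    sample_rate: float = 0.10,
    rng: Optional[random.Random] = None,
) -> bool:
    """Decide whether a step should be dispatched for embedding."""
    if step_type in ALWAYS_EMBEDDED:
        return True
    # Non-critical steps are embedded with probability sample_rate.
    roll = rng.random() if rng is not None else random.random()
    return roll < sample_rate


rng = random.Random(42)  # seeded generator so a run is repeatable
sampled = sum(should_embed("Observation", rng=rng) for _ in range(10_000))
print(f"{sampled} of 10000 Observation steps selected for embedding")
```

Passing a seeded `random.Random` makes the decision reproducible in tests, while critical step types return `True` regardless of the rate.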

MCP Tools

`log_reasoning_step`

// Input
{
  "session_id": "analysis-session-42",
  "step_type": "Reasoning",
  "content": "The revenue data shows a clear upward trend in Q4, driven primarily by enterprise.",
  "input_data": { "query": "SELECT segment, SUM(revenue) FROM orders WHERE quarter = 'Q4' GROUP BY 1" },
  "output_data": { "enterprise": 4500000, "mid_market": 2100000, "smb": 890000 },
  "confidence": 0.92,
  "model": "gpt-4o"
}

// Output
{
  "trace_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "step_index": 3,
  "current_hash": "a3f2b8c9d4e5f6..."
}

`replay_decision`

// Input
{ "session_id": "analysis-session-42", "verify_chain": true }

// Output
{
  "session_id": "analysis-session-42",
  "agent_id": "analyst-agent",
  "step_count": 7,
  "chain_valid": true,
  "steps": [
    { "step_index": 0, "step_type": "Observation", "content": "User requested Q4 revenue analysis..." },
    { "step_index": 1, "step_type": "ToolCall", "content": "Querying orders table..." }
  ]
}

`get_session_history`

// Input
{ "agent_id": "analyst-agent", "limit": 20 }

// Output
[{
  "session_id": "analysis-session-42",
  "step_count": 7,
  "first_step_at": "2025-01-15T10:30:00Z",
  "last_step_at": "2025-01-15T10:30:15Z",
  "chain_valid": true
}]

Usage Example

from hatidata_agent import HatiDataAgent

agent = HatiDataAgent(
    host="your-org.proxy.hatidata.com",
    agent_id="analyst",
    password="hd_live_your_api_key",
)

session_id = "revenue-analysis-q4"

agent.log_reasoning_step(
    session_id=session_id,
    step_type="Observation",
    content="User asked for Q4 revenue breakdown by segment",
)

agent.log_reasoning_step(
    session_id=session_id,
    step_type="ToolCall",
    content="Querying orders table",
    input_data={"sql": "SELECT segment, SUM(revenue) FROM orders WHERE quarter='Q4' GROUP BY 1"},
)

agent.log_reasoning_step(
    session_id=session_id,
    step_type="Reasoning",
    content="Enterprise grew 23% QoQ while SMB declined 5%.",
    confidence=0.88,
)

agent.log_reasoning_step(
    session_id=session_id,
    step_type="FinalAnswer",
    content="Q4 revenue was $7.49M, up 15% QoQ. Enterprise drove the growth.",
    confidence=0.95,
)

# Later: replay and verify
trace = agent.replay_decision(session_id, verify_chain=True)
assert trace["chain_valid"] is True

Configuration

CoT Ledger behavior is configurable per deployment:

Setting	Default	Description
CoT enabled	`true`	Enable/disable the CoT ledger
Embedding sample rate	0.1 (10%)	Fraction of steps to embed (0.0 to 1.0)
Max content length	64 KB	Maximum content length per step
Retention period	90 days	How long to retain CoT records

Related Concepts

Persistent Memory — Long-term context storage
Semantic Triggers — Triggers can evaluate reasoning patterns
Branch Isolation — Track reasoning across branch operations
Audit Guarantees — Enterprise audit and retention
