Chain-of-Thought Ledger
The Chain-of-Thought (CoT) Ledger records every reasoning step an agent takes in an immutable, hash-chained format — creating a tamper-evident audit trail that can be replayed, verified, and analyzed months later.
Why Agents Need Reasoning Traces
When AI agents make decisions that affect business operations, you need answers:
- What did the agent reason about? — Full trace of observations, hypotheses, tool calls, and conclusions
- Can the trace be trusted? — Cryptographic hash chains prove no steps were inserted, deleted, or modified
- What went wrong? — Replay a session to understand failures or unexpected decisions
- Is the agent improving? — Compare reasoning patterns across sessions
How It Works
Agent → log_reasoning_step → CoT Ledger
├── Append to _hatidata_cot table
├── Update per-session hash chain
└── Conditionally dispatch for embedding
Analyst → replay_decision → CoT Ledger
├── Load session trace
└── Verify hash chain integrity
Hash Chaining
Each step's hash is computed from its content and the previous step's hash, forming a cryptographic hash chain. Modifying any step invalidates all subsequent hashes, making tampering detectable.
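The chaining rule above can be illustrated with a minimal sketch. The page does not name the hash function or the serialization format, so SHA-256, canonical JSON, and the all-zero genesis hash used here are assumptions for illustration, and the function names are hypothetical.

```python
import hashlib
import json

# Assumption: the first step of a session links to an all-zero hash.
GENESIS_HASH = "0" * 64


def step_hash(prev_hash: str, step: dict) -> str:
    """Hash a step's content together with the previous step's hash."""
    # Canonical JSON (sorted keys, fixed separators) so equal steps hash equally.
    payload = json.dumps(step, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def build_chain(steps: list[dict]) -> list[str]:
    """Return one hash per step, each linked to its predecessor."""
    hashes, prev = [], GENESIS_HASH
    for step in steps:
        prev = step_hash(prev, step)
        hashes.append(prev)
    return hashes


def verify_chain(steps: list[dict], stored_hashes: list[str]) -> bool:
    """Recompute the chain and compare it with the stored hashes."""
    return build_chain(steps) == stored_hashes


steps = [
    {"step_index": 0, "step_type": "Observation", "content": "User asked for Q4 revenue"},
    {"step_index": 1, "step_type": "ToolCall", "content": "Querying orders table"},
    {"step_index": 2, "step_type": "FinalAnswer", "content": "Enterprise drove growth"},
]
stored = build_chain(steps)

# Tamper with the middle step and recompute.
tampered = [dict(s) for s in steps]
tampered[1]["content"] = "Querying customers table"
recomputed = build_chain(tampered)
```

In this sketch the hash of step 0 is unchanged after tampering, while the hashes of step 1 and every later step differ from the stored values, which is what makes the modification detectable.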
Append-Only Enforcement
An append-only enforcer intercepts DML and DDL statements targeting _hatidata_cot tables:
- INSERT — Allowed (append new steps)
- UPDATE — Blocked
- DELETE — Blocked
- TRUNCATE — Blocked
- DROP — Blocked
Enforcement happens at the proxy level, before query execution. Even direct SQL connections cannot tamper with the ledger.
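As a rough illustration of the rule set above, the following sketch classifies a statement by its leading keyword and blocks the four listed verbs when the statement mentions a _hatidata_cot table. The function name and the keyword-matching approach are assumptions; a real proxy would more likely parse the SQL, since simple matching can be misled by comments, CTEs, or multi-statement input.

```python
import re

# Matches the protected table name, with or without a schema prefix.
PROTECTED_TABLE = re.compile(r"\b_hatidata_cot", re.IGNORECASE)

# Statement verbs the page lists as blocked for the ledger table.
BLOCKED_VERBS = {"UPDATE", "DELETE", "TRUNCATE", "DROP"}


def is_allowed(sql: str) -> bool:
    """Return False for blocked statements that target the CoT ledger."""
    if not PROTECTED_TABLE.search(sql):
        return True  # statements on other tables are not this enforcer's concern
    stripped = sql.lstrip()
    verb = stripped.split(None, 1)[0].upper() if stripped else ""
    return verb not in BLOCKED_VERBS


for statement in (
    "INSERT INTO _hatidata_cot (session_id, content) VALUES ('s1', 'step')",
    "UPDATE _hatidata_cot SET content = 'edited' WHERE session_id = 's1'",
    "DROP TABLE _hatidata_cot",
):
    print("allowed" if is_allowed(statement) else "blocked", "-", statement)
```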
Trace Schema
Each reasoning step is stored with 18 fields:
| Field | Type | Description |
|---|---|---|
| trace_id | UUID | Unique identifier for this trace entry |
| org_id | VARCHAR | Organization identifier |
| agent_id | VARCHAR | Agent that produced this step |
| session_id | VARCHAR | Groups steps into a reasoning session |
| step_index | INTEGER | Ordinal position within the session (0-based) |
| step_type | VARCHAR | One of 12 step type variants |
| content | TEXT | The reasoning content |
| input_data | JSON | Input to this step (tool arguments, observations) |
| output_data | JSON | Output from this step (tool results, decisions) |
| confidence | FLOAT | Agent's self-reported confidence (0.0 to 1.0) |
| duration_ms | BIGINT | How long this step took, in milliseconds |
| token_count | INTEGER | Token usage for LLM-based steps |
| model | VARCHAR | Model identifier (e.g., gpt-4o, claude-3.5-sonnet) |
| metadata | JSON | Arbitrary additional context |
| prev_hash | VARCHAR | Cryptographic hash of the previous step |
| current_hash | VARCHAR | Cryptographic hash of this step |
| has_embedding | BOOLEAN | Whether an embedding exists |
| created_at | TIMESTAMP | When this step was recorded |
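For client code that handles trace rows, the 18 fields can be mirrored in a typed record. The sketch below is one possible mapping, not a class shipped with the product: the name TraceStep is hypothetical, and which fields are optional is an assumption based on the fields the examples on this page omit.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional
from uuid import UUID, uuid4


@dataclass
class TraceStep:
    """Hypothetical in-memory mirror of one ledger row."""

    org_id: str
    agent_id: str
    session_id: str
    step_index: int
    step_type: str
    content: str
    prev_hash: str
    current_hash: str
    # Assumption: the remaining columns are nullable or have defaults.
    input_data: Optional[dict[str, Any]] = None
    output_data: Optional[dict[str, Any]] = None
    confidence: Optional[float] = None
    duration_ms: Optional[int] = None
    token_count: Optional[int] = None
    model: Optional[str] = None
    metadata: Optional[dict[str, Any]] = None
    has_embedding: bool = False
    trace_id: UUID = field(default_factory=uuid4)
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self) -> None:
        # Ranges taken from the field descriptions in the table above.
        if self.step_index < 0:
            raise ValueError("step_index is 0-based and cannot be negative")
        if self.confidence is not None and not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")


example = TraceStep(
    org_id="acme",
    agent_id="analyst-agent",
    session_id="analysis-session-42",
    step_index=0,
    step_type="Observation",
    content="User requested Q4 revenue analysis",
    prev_hash="0" * 64,
    current_hash="a" * 64,
    confidence=0.9,
)
```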
Step Types
| Step Type | Description | Always Embedded? |
|---|---|---|
| Observation | Agent observes data or environment state | No |
| Hypothesis | Agent forms a hypothesis | No |
| ToolCall | Agent invokes an external tool | No |
| ToolResult | Result returned from a tool call | No |
| Reasoning | Internal reasoning or analysis | No |
| Decision | Agent makes a decision | Yes |
| Action | Agent takes an action | Yes |
| Error | An error occurred during reasoning | Yes |
| Correction | Agent corrects a previous step | Yes |
| Summary | Agent summarizes findings | No |
| PlanStep | A step in a multi-step plan | No |
| FinalAnswer | The final output of the reasoning session | Yes |
Architecture
Write Path
The write path manages per-session hash chains:
- Tracks the latest hash per session_id for chain continuity
- On each log_reasoning_step, computes the new hash, appends the row, and updates the chain head
- Handles concurrent sessions safely via lock-free operations
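The three responsibilities above can be sketched as a small in-memory writer that keeps one chain head per session. This is only an illustration: it uses an ordinary mutex instead of the lock-free operations the page describes, a Python list stands in for the _hatidata_cot table, and the class name, the SHA-256 choice, and the genesis hash are all assumptions.

```python
import hashlib
import threading

# Assumption: a session's first step links to an all-zero hash.
GENESIS_HASH = "0" * 64


class SessionChainWriter:
    """Hypothetical write path: one (next_index, latest_hash) head per session."""

    def __init__(self) -> None:
        self._lock = threading.Lock()  # simple mutex, not lock-free
        self._heads: dict[str, tuple[int, str]] = {}
        self.rows: list[dict] = []  # stand-in for the ledger table

    def log_reasoning_step(self, session_id: str, content: str) -> dict:
        with self._lock:
            # 1. Look up the chain head for this session.
            step_index, prev_hash = self._heads.get(session_id, (0, GENESIS_HASH))
            # 2. Compute the new hash from the previous hash and this step.
            digest_input = f"{prev_hash}|{step_index}|{content}".encode("utf-8")
            current_hash = hashlib.sha256(digest_input).hexdigest()
            # 3. Append the row, then advance the head.
            row = {
                "session_id": session_id,
                "step_index": step_index,
                "content": content,
                "prev_hash": prev_hash,
                "current_hash": current_hash,
            }
            self.rows.append(row)
            self._heads[session_id] = (step_index + 1, current_hash)
        return row


writer = SessionChainWriter()
a0 = writer.log_reasoning_step("session-a", "Observed Q4 data")
b0 = writer.log_reasoning_step("session-b", "Observed churn data")
a1 = writer.log_reasoning_step("session-a", "Enterprise grew QoQ")
```

Interleaved sessions keep independent step indexes and chains here: a1 links to a0, while b0 starts its own chain from the genesis hash.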
Read Path
The read path provides retrieval and verification:
| Operation | Description |
|---|---|
| Replay session | Load all steps for a session in order |
| Get trace | Load a single trace entry by ID |
| List traces | List recent sessions for an agent |
| Verify chain | Verify hash chain integrity for a session |
Embedding Sampling
Not every step needs a vector embedding. The system uses configurable sampling:
- Sampling rate — Default 10% of steps are embedded
- Critical step types — Decision, Action, Error, Correction, and FinalAnswer are always embedded
- Purpose — Embedded steps are findable via semantic search (e.g., "find all decisions about pricing")
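The sampling rule above amounts to a small decision function. The sketch below assumes each non-critical step is sampled independently at random. The page does not say how the sample is drawn, and the function name is hypothetical.

```python
import random
from typing import Optional

# Step types the page lists as always embedded.
ALWAYS_EMBED = frozenset({"Decision", "Action", "Error", "Correction", "FinalAnswer"})


def should_embed(
    step_type: str,
    sample_rate: float = 0.10,
    rng: Optional[random.Random] = None,
) -> bool:
    """Decide whether a step should be dispatched for embedding."""
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    if step_type in ALWAYS_EMBED:
        return True  # critical steps bypass sampling
    # Assumption: independent random draw per step; random() is in [0.0, 1.0).
    return (rng or random).random() < sample_rate


rng = random.Random(7)  # seeded so repeated runs make the same choices
sampled = sum(should_embed("Reasoning", 0.10, rng) for _ in range(10_000))
print(f"{sampled} of 10000 Reasoning steps selected for embedding")
```

Passing a seeded generator keeps the behavior reproducible in tests, while critical step types return True regardless of the configured rate.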
MCP Tools
log_reasoning_step
// Input
{
  "session_id": "analysis-session-42",
  "step_type": "Reasoning",
  "content": "The revenue data shows a clear upward trend in Q4, driven primarily by enterprise.",
  "input_data": { "query": "SELECT segment, SUM(revenue) FROM orders WHERE quarter = 'Q4' GROUP BY 1" },
  "output_data": { "enterprise": 4500000, "mid_market": 2100000, "smb": 890000 },
  "confidence": 0.92,
  "model": "gpt-4o"
}

// Output
{
  "trace_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "step_index": 3,
  "current_hash": "a3f2b8c9d4e5f6..."
}
replay_decision
// Input
{ "session_id": "analysis-session-42", "verify_chain": true }

// Output
{
  "session_id": "analysis-session-42",
  "agent_id": "analyst-agent",
  "step_count": 7,
  "chain_valid": true,
  "steps": [
    { "step_index": 0, "step_type": "Observation", "content": "User requested Q4 revenue analysis..." },
    { "step_index": 1, "step_type": "ToolCall", "content": "Querying orders table..." }
  ]
}
get_session_history
// Input
{ "agent_id": "analyst-agent", "limit": 20 }

// Output
[{
  "session_id": "analysis-session-42",
  "step_count": 7,
  "first_step_at": "2025-01-15T10:30:00Z",
  "last_step_at": "2025-01-15T10:30:15Z",
  "chain_valid": true
}]
Usage Example
from hatidata_agent import HatiDataAgent

agent = HatiDataAgent(
    host="your-org.proxy.hatidata.com",
    agent_id="analyst",
    password="hd_live_your_api_key",
)

session_id = "revenue-analysis-q4"

# Record what the agent was asked to do
agent.log_reasoning_step(
    session_id=session_id,
    step_type="Observation",
    content="User asked for Q4 revenue breakdown by segment",
)

# Record the tool call and its arguments
agent.log_reasoning_step(
    session_id=session_id,
    step_type="ToolCall",
    content="Querying orders table",
    input_data={"sql": "SELECT segment, SUM(revenue) FROM orders WHERE quarter='Q4' GROUP BY 1"},
)

# Record the intermediate analysis
agent.log_reasoning_step(
    session_id=session_id,
    step_type="Reasoning",
    content="Enterprise grew 23% QoQ while SMB declined 5%.",
    confidence=0.88,
)

# Record the final output of the session
agent.log_reasoning_step(
    session_id=session_id,
    step_type="FinalAnswer",
    content="Q4 revenue was $7.49M, up 15% QoQ. Enterprise drove the growth.",
    confidence=0.95,
)

# Later: replay and verify
trace = agent.replay_decision(session_id, verify_chain=True)
assert trace["chain_valid"] is True
Configuration
CoT Ledger behavior is configurable per deployment:
| Setting | Default | Description |
|---|---|---|
| CoT enabled | true | Enable/disable the CoT ledger |
| Embedding sample rate | 0.1 (10%) | Fraction of steps to embed (0.0 to 1.0) |
| Max content length | 64 KB | Maximum content length per step |
| Retention period | 90 days | How long to retain CoT records |
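To make the defaults concrete, the sketch below models the four settings as a typed object with basic validation. The attribute names are hypothetical because the page gives only display names. Treating 64 KB as 65,536 UTF-8 bytes is also an assumption, and so is the way the retention cutoff is computed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class CotConfig:
    """Hypothetical container for the documented defaults."""

    enabled: bool = True
    embedding_sample_rate: float = 0.10   # fraction of steps, 0.0 to 1.0
    max_content_bytes: int = 64 * 1024    # assumption: 64 KB = 65,536 bytes
    retention_days: int = 90

    def __post_init__(self) -> None:
        if not 0.0 <= self.embedding_sample_rate <= 1.0:
            raise ValueError("embedding_sample_rate must be between 0.0 and 1.0")
        if self.max_content_bytes <= 0 or self.retention_days <= 0:
            raise ValueError("limits must be positive")

    def content_fits(self, content: str) -> bool:
        """Check a step's content against the size limit (UTF-8 bytes assumed)."""
        return len(content.encode("utf-8")) <= self.max_content_bytes

    def retention_cutoff(self, now: datetime) -> datetime:
        """Records created before this instant fall outside the retention period."""
        return now - timedelta(days=self.retention_days)


config = CotConfig()
cutoff = config.retention_cutoff(datetime(2025, 4, 15, tzinfo=timezone.utc))
print(cutoff.date())
```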
Related Concepts
- Persistent Memory — Long-term context storage
- Semantic Triggers — Triggers can evaluate reasoning patterns
- Branch Isolation — Track reasoning across branch operations
- Audit Guarantees — Enterprise audit and retention