Tasks & Attempts
HatiData V2 introduces a strict separation between intent (what an agent should do) and execution (what actually happened). This is the foundation of the Governed Runtime.
The Core Distinction
| Concept | What It Represents | Lifecycle |
|---|---|---|
| Task | The goal: "Generate architecture for project X" | Created once; the goal itself is never modified, only its status advances |
| Attempt | A single execution run of that task | Created per retry, tracks state transitions |
A Task may have multiple Attempts — each retry, fallback, or repair creates a new Attempt. This separation enables:
- Forensic debugging: compare Attempt #1 (failed) vs Attempt #2 (succeeded)
- Cost attribution: each Attempt tracks its own model decisions and token usage
- Recovery lineage: trace the chain from initial failure through repair to resolution
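The three benefits above can be illustrated with a small in-memory model. This is a sketch only: the class names, the `tokens_used` field, and the helper functions are hypothetical and are not part of HatiData's API.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """The intent: fixed once created (hypothetical model)."""
    id: str
    goal: str


@dataclass
class Attempt:
    """One execution run of a task (hypothetical model)."""
    task_id: str
    number: int
    model_class: str
    status: str = "queued"
    tokens_used: int = 0  # assumed field, for cost attribution


def total_tokens(attempts: list[Attempt], task_id: str) -> int:
    """Cost attribution: sum token usage across all attempts of one task."""
    return sum(a.tokens_used for a in attempts if a.task_id == task_id)


def lineage(attempts: list[Attempt], task_id: str) -> list[tuple[int, str, str]]:
    """Recovery lineage: (attempt number, model, status) in execution order."""
    own = sorted((a for a in attempts if a.task_id == task_id), key=lambda a: a.number)
    return [(a.number, a.model_class, a.status) for a in own]


task = Task(id="t1", goal="Generate architecture for project X")
attempts = [
    Attempt("t1", 1, "small", status="retryable_failed", tokens_used=1200),
    Attempt("t1", 2, "large", status="completed_verified", tokens_used=3400),
]
```

Because the goal lives on the frozen `Task` and every run is its own `Attempt` record, comparing a failed run with a successful one is just a matter of reading two rows side by side.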
Task Lifecycle
              +-----------+
              |  Created  |
              +-----+-----+
                    |
              +-----v-----+
              |  Queued   |
              +-----+-----+
                    |
              +-----v-----+
         +--->|  Running  |<---+
         |    +-----+-----+    |
         |          |          |
    (recovery)  +---v---+  (recovery)
         |      | Done  |      |
         |      +---+---+      |
         |          |          |
         |    +-----v-----+    |
         +----|  Failed   |----+
              +-----------+
A Task starts as Created, moves to Queued when ready for dispatch, then Running when an agent claims it. It completes as Done or Failed — and failed tasks can trigger recovery (new Attempts).
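One way to read this lifecycle is as a transition table. The sketch below is an assumption-laden illustration: the schema only confirms the lowercase `created` default, so the other lowercase status strings are guesses, and the `advance` helper is hypothetical.

```python
# Allowed task status transitions, read off the lifecycle description.
# The failed -> running edge models recovery (a new Attempt is started).
# Status strings other than "created" are assumed spellings.
TASK_TRANSITIONS = {
    "created": {"queued"},
    "queued": {"running"},
    "running": {"done", "failed"},
    "failed": {"running"},
    "done": set(),
}


def advance(current: str, new: str) -> str:
    """Return the new status, or raise if the lifecycle does not allow the move."""
    if new not in TASK_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal task transition: {current} -> {new}")
    return new


# A task that fails once, recovers, and then finishes.
status = "created"
for step in ["queued", "running", "failed", "running", "done"]:
    status = advance(status, step)
```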
Attempt State Machine
Each Attempt tracks fine-grained execution state:
| State | Description | Terminal? |
|---|---|---|
| queued | In the task queue, not yet leased | No |
| preflight_blocked | Required inputs missing | No |
| context_resolved | Dependencies loaded | No |
| model_selected | LLM routing decision made | No |
| cache_checked | Prompt cache lookup done | No |
| executing | LLM call in progress | No |
| artifact_validating | Output schema validation | No |
| evidence_collecting | Waiting for runtime evidence | No |
| verifier_pending | Verifier dispatched | No |
| completed_verified | Artifact validated + evidence confirmed | Yes |
| retryable_failed | Eligible for L1/L2 recovery | No |
| repair_required | L3 RepairAgent needed | No |
| terminal_failed | No recovery path | Yes |
HatiData enforces a partial unique index: only one Attempt per Task can be in a non-terminal state. This prevents double-processing — a task cannot have two agents working on it simultaneously.
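The effect of such an index can be reproduced with SQLite, which supports partial indexes with the same `CREATE UNIQUE INDEX ... WHERE` form that PostgreSQL accepts. This is a minimal sketch, not HatiData's actual DDL: the index name and the trimmed-down table are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_attempts ("
    "id TEXT PRIMARY KEY, task_id TEXT NOT NULL, status TEXT NOT NULL)"
)
# Partial unique index: uniqueness on task_id applies only to rows whose
# status is non-terminal. The index name is hypothetical.
conn.execute(
    "CREATE UNIQUE INDEX one_active_attempt ON task_attempts (task_id) "
    "WHERE status NOT IN ('completed_verified', 'terminal_failed')"
)


def try_insert(attempt_id: str, task_id: str, status: str) -> bool:
    """Insert an attempt; return False if the unique index rejects it."""
    try:
        conn.execute(
            "INSERT INTO task_attempts VALUES (?, ?, ?)",
            (attempt_id, task_id, status),
        )
        return True
    except sqlite3.IntegrityError:
        return False


first = try_insert("a1", "t1", "executing")   # first active attempt for t1
second = try_insert("a2", "t1", "queued")     # a1 is still non-terminal
conn.execute("UPDATE task_attempts SET status = 'terminal_failed' WHERE id = 'a1'")
third = try_insert("a3", "t1", "queued")      # a1 is terminal now
```

The database itself refuses the second active attempt, so the guarantee holds even if two workers race past any application-level check.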
Leases and Heartbeats
When an agent claims a Task, HatiData issues a lease — a time-bounded lock that prevents other agents from claiming the same work.
    Agent claims task → Lease issued (3-min TTL)
                              │
                        ┌─────┴─────┐
                        │ Heartbeat │ ← Agent sends every 60s
                        │  renews   │
                        │   lease   │
                        └─────┬─────┘
                              │
                        Lease expires → Task returned to queue
Key properties:
- TTL: 3 minutes (configurable per task class)
- Heartbeat: Agent sends every 60 seconds to renew
- Expiry: If no heartbeat arrives, the lease expires and the task returns to `queued`
- Atomic claim: `claim_next_task` uses `SELECT ... FOR UPDATE SKIP LOCKED` to prevent races
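The timing behind these properties can be sketched with an explicit clock. The `Lease` class and `claim` function below are hypothetical, and the rule that an already-expired lease cannot be renewed is an assumption, not documented behavior.

```python
from dataclasses import dataclass

LEASE_TTL_S = 180        # 3-minute TTL from the text
HEARTBEAT_EVERY_S = 60   # heartbeat interval from the text


@dataclass
class Lease:
    token: str
    expires_at: float

    def expired(self, now: float) -> bool:
        return now >= self.expires_at

    def heartbeat(self, now: float) -> bool:
        """Renew the lease if still live (assumption: expired leases stay expired)."""
        if self.expired(now):
            return False
        self.expires_at = now + LEASE_TTL_S
        return True


def claim(token: str, now: float) -> Lease:
    """Issue a fresh lease with a full TTL."""
    return Lease(token=token, expires_at=now + LEASE_TTL_S)


lease = claim("lease-1", now=0.0)
ok_60 = lease.heartbeat(now=60.0)     # renewed, expiry moves to t=240
ok_120 = lease.heartbeat(now=120.0)   # renewed, expiry moves to t=300
# The agent crashes after t=120, so no further heartbeats arrive.
alive_299 = not lease.expired(now=299.0)
dead_300 = lease.expired(now=300.0)   # the task would return to the queue here
late = lease.heartbeat(now=301.0)     # renewal refused under the assumption above
```

With a TTL three times the heartbeat interval, an agent has roughly two missed heartbeats of slack before its work is handed back.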
Recovery Levels
When an Attempt fails, the recovery coordinator determines the next action:
| Level | Action | When |
|---|---|---|
| L1 Retry | Same model, same prompt | Transient errors (timeout, rate limit) |
| L2 Escalation | Switch to stronger model | Schema validation failure |
| L3 Repair | Dispatch RepairAgent | Build/test/deploy failure |
| L4 Human | Surface to review queue | Security blocker, policy denial |
| Blocked | No autonomous next step | Missing dependency, policy denied |
Recovery state accumulates — an L1 failure that persists escalates to L2, then L3. The recovery_actions table maintains the full chain.
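The escalation rule can be sketched as a function of the failure kind and the levels already tried. Everything here is illustrative: the failure-kind strings are hypothetical labels, one try per level is an assumption, and continuing past L3 to L4 and then Blocked goes beyond what the text states.

```python
from __future__ import annotations

LEVELS = ["L1", "L2", "L3", "L4"]

# First recovery level for each failure kind, following the table.
# The failure-kind strings are hypothetical labels.
INITIAL_LEVEL = {
    "timeout": "L1",
    "rate_limit": "L1",
    "schema_validation": "L2",
    "build_failure": "L3",
    "test_failure": "L3",
    "deploy_failure": "L3",
    "security_blocker": "L4",
}


def next_level(failure_kind: str, history: list[str]) -> str:
    """Pick the next recovery level given the levels already tried for this task."""
    if failure_kind not in INITIAL_LEVEL:
        return "Blocked"  # e.g. missing dependency: no autonomous next step
    start = LEVELS.index(INITIAL_LEVEL[failure_kind])
    if history:
        # Escalate one step beyond the highest level already tried.
        highest = max(LEVELS.index(level) for level in history)
        start = max(start, highest + 1)
    return LEVELS[start] if start < len(LEVELS) else "Blocked"
```

For example, `next_level("timeout", ["L1"])` yields `"L2"`: the transient-error retry has already been spent, so the next Attempt moves up a level.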
API Endpoints
| Method | Path | Description |
|---|---|---|
| GET | /v2/runtime/tasks/:id | Get task with all attempts |
| GET | /v2/runtime/tasks/:id/attempts | List attempts for a task |
| GET | /v2/runtime/attempts/:id | Get attempt detail |
| POST | /v2/runtime/tasks/:id/claim | Claim next queued task |
| POST | /v2/runtime/attempts/:id/heartbeat | Renew lease |
| POST | /v2/runtime/attempts/:id/complete | Mark attempt completed |
| POST | /v2/runtime/attempts/:id/fail | Mark attempt failed |
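As a rough illustration of how a client might address these endpoints, the sketch below builds (but does not send) requests with the standard library. The host name is a placeholder, the `failure_kind` request body is an assumed shape, and authentication is omitted because none of these details are specified here.

```python
from __future__ import annotations

import json
import urllib.request

BASE_URL = "https://hatidata.example.com"  # hypothetical host


def _request(method: str, path: str, body: dict | None = None) -> urllib.request.Request:
    """Build a request object; nothing is sent until urlopen() is called."""
    data = json.dumps(body).encode() if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def get_task(task_id: str) -> urllib.request.Request:
    return _request("GET", f"/v2/runtime/tasks/{task_id}")


def heartbeat(attempt_id: str) -> urllib.request.Request:
    return _request("POST", f"/v2/runtime/attempts/{attempt_id}/heartbeat")


def fail_attempt(attempt_id: str, failure_kind: str) -> urllib.request.Request:
    # The body shape is an assumption; only the path comes from the table.
    return _request(
        "POST",
        f"/v2/runtime/attempts/{attempt_id}/fail",
        {"failure_kind": failure_kind},
    )
```

Passing any of these objects to `urllib.request.urlopen` would perform the call against a real deployment.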
Schema
-- Tasks: the intent
CREATE TABLE hd_runtime.tasks (
    id          UUID PRIMARY KEY,
    project_id  UUID NOT NULL,
    agent_type  TEXT NOT NULL,
    task_class  TEXT NOT NULL,
    status      TEXT NOT NULL DEFAULT 'created',
    input_hash  TEXT,
    created_at  TIMESTAMP NOT NULL DEFAULT NOW(),
    updated_at  TIMESTAMP NOT NULL DEFAULT NOW()
);

-- Attempts: the execution
CREATE TABLE hd_runtime.task_attempts (
    id            UUID PRIMARY KEY,
    task_id       UUID NOT NULL REFERENCES hd_runtime.tasks(id),
    status        TEXT NOT NULL DEFAULT 'queued',
    agent_run_id  TEXT,
    model_class   TEXT,
    lease_token   TEXT,
    lease_expires TIMESTAMP,
    failure_kind  TEXT,
    created_at    TIMESTAMP NOT NULL DEFAULT NOW(),
    completed_at  TIMESTAMP
);
Next Steps
- Lineage & Explainability — How to trace an attempt back to its model decisions
- Entity Relationship Model — Visual map of all V2 entities