Tasks & Attempts

HatiData V2 introduces a strict separation between intent (what an agent should do) and execution (what actually happened). This is the foundation of the Governed Runtime.

The Core Distinction

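| Concept | What It Represents | Lifecycle |
|---------|--------------------|-----------|
| Task | The goal: "Generate architecture for project X" | Created once; the intent is never modified, only its status advances |
| Attempt | A single execution run of that task | Created for the initial run and for each retry; tracks state transitions |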

A Task may have multiple Attempts — each retry, fallback, or repair creates a new Attempt. This separation enables:

  • Forensic debugging: compare Attempt #1 (failed) vs Attempt #2 (succeeded)
  • Cost attribution: each Attempt tracks its own model decisions and token usage
  • Recovery lineage: trace the chain from initial failure through repair to resolution
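
The separation can be sketched as two small record types. This is a minimal illustration, not HatiData's actual data model: the class names, the `tokens_used` field, and both helper functions are assumptions made for the example (only `model_class` and `status` appear in the schema later on this page).

```python
from dataclasses import dataclass, asdict
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Task:
    """The intent. Frozen so the goal cannot change after creation."""
    task_id: str
    goal: str


@dataclass
class Attempt:
    """One execution run of a Task."""
    task_id: str
    number: int
    model_class: str
    tokens_used: int  # hypothetical field, used here for cost attribution
    status: str


def total_tokens(task: Task, attempts: List[Attempt]) -> int:
    """Cost attribution: sum token usage over every Attempt of one Task."""
    return sum(a.tokens_used for a in attempts if a.task_id == task.task_id)


def diff_attempts(first: Attempt, second: Attempt) -> Dict[str, Tuple]:
    """Forensic debugging: return the fields that differ between two Attempts."""
    a, b = asdict(first), asdict(second)
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}


task = Task("t1", "Generate architecture for project X")
attempts = [
    Attempt("t1", 1, "small", 1200, "terminal_failed"),
    Attempt("t1", 2, "large", 3400, "completed_verified"),
]
print(total_tokens(task, attempts))
print(sorted(diff_attempts(attempts[0], attempts[1])))
```

Keeping the Task frozen while Attempts stay mutable mirrors the split described above: the goal is fixed, and everything that varies between runs lives on the Attempt.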

Task Lifecycle

                    +-----------+
                    |  Created  |
                    +-----+-----+
                          |
                    +-----v-----+
                    |  Queued   |
                    +-----+-----+
                          |
                    +-----v-----+
               +--->|  Running  |<---+
               |    +-----+-----+    |
               |          |          |
          (recovery)  +---v---+  (recovery)
               |      | Done  |      |
               |      +---+---+      |
               |          |          |
               |    +-----v-----+    |
               +----|  Failed   |----+
                    +-----------+

A Task starts as Created, moves to Queued when ready for dispatch, then Running when an agent claims it. It completes as Done or Failed — and failed tasks can trigger recovery (new Attempts).
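
The lifecycle described above can be expressed as a transition table. This is a sketch under assumptions: only `created` appears as a status value in the schema on this page, so the other lowercase names and the `advance` helper are hypothetical. The return of a task to the queue on lease expiry is not modeled here.

```python
from typing import Dict, Set

# Allowed Task status transitions, following the lifecycle described above.
TASK_TRANSITIONS: Dict[str, Set[str]] = {
    "created": {"queued"},
    "queued": {"running"},
    "running": {"done", "failed"},
    "failed": {"running"},  # recovery: a new Attempt puts the Task back to work
    "done": set(),          # no outgoing transitions
}


def advance(current: str, target: str) -> str:
    """Return the new status, or raise ValueError if the move is not allowed."""
    if target not in TASK_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal task transition: {current} -> {target}")
    return target


status = "created"
for step in ("queued", "running", "failed", "running", "done"):
    status = advance(status, step)
print(status)
```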

Attempt State Machine

Each Attempt tracks fine-grained execution state:

| State | Description | Terminal? |
|-------|-------------|-----------|
| queued | In the task queue, not yet leased | No |
| preflight_blocked | Required inputs missing | No |
| context_resolved | Dependencies loaded | No |
| model_selected | LLM routing decision made | No |
| cache_checked | Prompt cache lookup done | No |
| executing | LLM call in progress | No |
| artifact_validating | Output schema validation | No |
| evidence_collecting | Waiting for runtime evidence | No |
| verifier_pending | Verifier dispatched | No |
| completed_verified | Artifact validated + evidence confirmed | Yes |
| retryable_failed | Eligible for L1/L2 recovery | No |
| repair_required | L3 RepairAgent needed | No |
| terminal_failed | No recovery path | Yes |

One Active Attempt Per Task

HatiData enforces this with a partial unique index: at most one Attempt per Task can be in a non-terminal state at any time. This prevents double-processing, so two agents can never work on the same task simultaneously.
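
To illustrate how a partial unique index produces this guarantee, the sketch below uses an in-memory SQLite database as a stand-in. The index name, the simplified table, and the `try_start_attempt` helper are assumptions for the example; the actual index definition is not shown on this page.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_attempts ("
        " id INTEGER PRIMARY KEY,"
        " task_id TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'queued')"
    )
    # Only rows in a non-terminal state are indexed, so uniqueness on task_id
    # applies to active Attempts and ignores finished ones.
    conn.execute(
        "CREATE UNIQUE INDEX one_active_attempt_per_task"
        " ON task_attempts (task_id)"
        " WHERE status NOT IN ('completed_verified', 'terminal_failed')"
    )
    return conn


def try_start_attempt(conn: sqlite3.Connection, task_id: str) -> bool:
    """Insert a new queued Attempt; return False if one is already active."""
    try:
        conn.execute("INSERT INTO task_attempts (task_id) VALUES (?)", (task_id,))
        return True
    except sqlite3.IntegrityError:
        return False


conn = open_db()
first = try_start_attempt(conn, "task-1")    # no active Attempt yet
second = try_start_attempt(conn, "task-1")   # rejected by the index
conn.execute(
    "UPDATE task_attempts SET status = 'terminal_failed' WHERE task_id = 'task-1'"
)
third = try_start_attempt(conn, "task-1")    # previous Attempt is now terminal
print(first, second, third)
```

Because the check lives in the index, it holds even when two workers insert concurrently; application code only needs to handle the rejected insert.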

Leases and Heartbeats

When an agent claims a Task, HatiData issues a lease — a time-bounded lock that prevents other agents from claiming the same work.

Agent claims task → Lease issued (3-min TTL)
                          │
                    ┌─────┴─────┐
                    │ Heartbeat │ ← Agent sends every 60s
                    │  renews   │
                    │   lease   │
                    └─────┬─────┘
                          │
               Lease expires → Task returned to queue

Key properties:

  • TTL: 3 minutes (configurable per task class)
  • Heartbeat: Agent sends every 60 seconds to renew
  • Expiry: If no heartbeat arrives before the TTL elapses, the lease expires and the task returns to queued
  • Atomic claim: claim_next_task uses SELECT ... FOR UPDATE SKIP LOCKED to prevent races
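
The lease properties above can be modeled in a few lines. This is a sketch, not the HatiData implementation: the `Lease` class and function names are hypothetical, and it assumes that each heartbeat extends the lease by a full TTL from the time it is received. Time is passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass

LEASE_TTL_SECONDS = 180          # 3-minute TTL from the list above
HEARTBEAT_INTERVAL_SECONDS = 60  # agent renews every 60 seconds


@dataclass
class Lease:
    token: str
    expires_at: float

    def is_expired(self, now: float) -> bool:
        return now >= self.expires_at


def issue_lease(token: str, now: float, ttl: float = LEASE_TTL_SECONDS) -> Lease:
    """Issue a lease when an agent claims a task."""
    return Lease(token=token, expires_at=now + ttl)


def heartbeat(lease: Lease, token: str, now: float,
              ttl: float = LEASE_TTL_SECONDS) -> bool:
    """Renew the lease; refuse if the token is wrong or the lease has lapsed."""
    if token != lease.token or lease.is_expired(now):
        return False
    lease.expires_at = now + ttl
    return True


lease = issue_lease("lease-abc", now=0)
renewed = heartbeat(lease, "lease-abc", now=HEARTBEAT_INTERVAL_SECONDS)
print(renewed, lease.expires_at)
# The agent then goes silent; the lease lapses and later heartbeats are refused.
print(lease.is_expired(now=240), heartbeat(lease, "lease-abc", now=300))
```

With a TTL three times the heartbeat interval, a single delayed or dropped heartbeat does not cost the agent its lease.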

Recovery Levels

When an Attempt fails, the recovery coordinator determines the next action:

| Level | Action | When |
|-------|--------|------|
| L1 Retry | Same model, same prompt | Transient errors (timeout, rate limit) |
| L2 Escalation | Switch to stronger model | Schema validation failure |
| L3 Repair | Dispatch RepairAgent | Build/test/deploy failure |
| L4 Human | Surface to review queue | Security blocker, policy denial |
| Blocked | No autonomous next step | Missing dependency, policy denied |

Recovery state accumulates — an L1 failure that persists escalates to L2, then L3. The recovery_actions table maintains the full chain.
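
A recovery coordinator along these lines might pick a level as sketched below. The `failure_kind` strings, the mapping, and the `next_level` function are hypothetical. The text only states that persistent failures escalate from L1 to L2 to L3, so continuing to L4 after a failed L3 is an assumption of this sketch.

```python
from enum import IntEnum
from typing import Dict, List, Optional


class RecoveryLevel(IntEnum):
    L1_RETRY = 1
    L2_ESCALATION = 2
    L3_REPAIR = 3
    L4_HUMAN = 4


# Hypothetical failure_kind values mapped to their starting recovery level.
INITIAL_LEVEL: Dict[str, RecoveryLevel] = {
    "timeout": RecoveryLevel.L1_RETRY,
    "rate_limit": RecoveryLevel.L1_RETRY,
    "schema_validation": RecoveryLevel.L2_ESCALATION,
    "build_failure": RecoveryLevel.L3_REPAIR,
    "security_blocker": RecoveryLevel.L4_HUMAN,
}


def next_level(failure_kind: str,
               previous: List[RecoveryLevel]) -> Optional[RecoveryLevel]:
    """Choose the next recovery level; None means Blocked (no autonomous step)."""
    base = INITIAL_LEVEL.get(failure_kind)
    if base is None:
        return None
    if not previous:
        return base
    # A failure that persists moves one step above the highest level tried.
    # Capping at L4 is an assumption, not something the source specifies.
    escalated = min(max(previous) + 1, RecoveryLevel.L4_HUMAN)
    return RecoveryLevel(max(base, escalated))


history: List[RecoveryLevel] = []
for _ in range(3):
    level = next_level("timeout", history)
    history.append(level)
print([lvl.name for lvl in history])
```

The `history` list plays the role of the chain kept in recovery_actions: each decision depends on what has already been tried for the task.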

API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | /v2/runtime/tasks/:id | Get task with all attempts |
| GET | /v2/runtime/tasks/:id/attempts | List attempts for a task |
| GET | /v2/runtime/attempts/:id | Get attempt detail |
| POST | /v2/runtime/tasks/:id/claim | Claim next queued task |
| POST | /v2/runtime/attempts/:id/heartbeat | Renew lease |
| POST | /v2/runtime/attempts/:id/complete | Mark attempt completed |
| POST | /v2/runtime/attempts/:id/fail | Mark attempt failed |
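
As a rough illustration of how a client might address these endpoints, the sketch below builds request objects with Python's standard library without sending them. The base URL and function names are placeholders. Authentication, request bodies, and response formats are not documented on this page, so they are left out rather than guessed.

```python
import urllib.request

BASE_URL = "https://hatidata.example"  # hypothetical host, replace with your own


def runtime_request(method: str, path: str) -> urllib.request.Request:
    """Build (but do not send) a request for a runtime endpoint."""
    return urllib.request.Request(BASE_URL + path, method=method)


def get_task(task_id: str) -> urllib.request.Request:
    return runtime_request("GET", f"/v2/runtime/tasks/{task_id}")


def list_attempts(task_id: str) -> urllib.request.Request:
    return runtime_request("GET", f"/v2/runtime/tasks/{task_id}/attempts")


def claim(task_id: str) -> urllib.request.Request:
    return runtime_request("POST", f"/v2/runtime/tasks/{task_id}/claim")


def renew_lease(attempt_id: str) -> urllib.request.Request:
    return runtime_request("POST", f"/v2/runtime/attempts/{attempt_id}/heartbeat")


def complete_attempt(attempt_id: str) -> urllib.request.Request:
    return runtime_request("POST", f"/v2/runtime/attempts/{attempt_id}/complete")


def fail_attempt(attempt_id: str) -> urllib.request.Request:
    return runtime_request("POST", f"/v2/runtime/attempts/{attempt_id}/fail")


req = renew_lease("attempt-42")
print(req.get_method(), req.full_url)
```

Passing one of these objects to `urllib.request.urlopen` would send it; a real client would also need whatever credentials and payloads the service expects.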

Schema

-- Tasks: the intent
CREATE TABLE hd_runtime.tasks (
    id          UUID PRIMARY KEY,
    project_id  UUID NOT NULL,
    agent_type  TEXT NOT NULL,
    task_class  TEXT NOT NULL,
    status      TEXT NOT NULL DEFAULT 'created',
    input_hash  TEXT,
    created_at  TIMESTAMP NOT NULL DEFAULT NOW(),
    updated_at  TIMESTAMP NOT NULL DEFAULT NOW()
);

-- Attempts: the execution
CREATE TABLE hd_runtime.task_attempts (
    id            UUID PRIMARY KEY,
    task_id       UUID NOT NULL REFERENCES hd_runtime.tasks(id),
    status        TEXT NOT NULL DEFAULT 'queued',
    agent_run_id  TEXT,
    model_class   TEXT,
    lease_token   TEXT,
    lease_expires TIMESTAMP,
    failure_kind  TEXT,
    created_at    TIMESTAMP NOT NULL DEFAULT NOW(),
    completed_at  TIMESTAMP
);
