Branch Isolation
Branch isolation lets agents work in an isolated view of their data environment: they can run speculative queries or writes, then merge the changes back or discard them. It works like Git branches, but for data state.
Why Agents Need Branching
Agents frequently need to explore "what-if" scenarios without risking production data:
- Scenario planning — Model pricing changes, market shifts, or demand forecasts
- Safe experimentation — Write experimental transformations without risk
- A/B testing — Run the same query against original vs. modified data
- Multi-agent collaboration — One agent creates a branch, another reviews, a third approves the merge
How It Works
Each branch is an isolated environment that starts as zero-copy references pointing to main tables. On first write, the target table is materialized (copy-on-write).
Main (production data)
├── customers (real table)
├── orders (real table)
└── products (real table)
Branch branch_abc123
├── customers → reference → main.customers (zero-copy)
├── orders → MATERIALIZED COPY (written to)
└── products → reference → main.products (zero-copy)
Zero-Copy References
Branch creation is instant regardless of data size. All tables in the branch initially reference the main tables directly — no data is copied.
Copy-on-Write Materialization
When an agent first writes to a table in a branch:
- The reference is replaced with a full copy of the table
- The write operation is applied to the materialized copy
- All subsequent reads/writes go to the materialized copy
Only modified tables are materialized. Read-only tables remain as zero-copy references.
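The copy-on-write behavior described above can be sketched with a small in-memory model. This is an illustration only, not HatiData's implementation: the `Branch` class, its methods, and the list-of-dicts table format are all hypothetical.

```python
import copy


class Branch:
    """Toy in-memory model of a branch (hypothetical, for illustration)."""

    def __init__(self, main_tables):
        # Creation keeps only a reference to main: no table data is copied.
        self._main = main_tables
        self._materialized = {}  # table name -> private copy

    def read(self, table):
        # Materialized tables shadow main; all other reads pass through.
        if table in self._materialized:
            return self._materialized[table]
        return self._main[table]

    def write(self, table, row):
        # The first write to a table replaces its reference with a full copy.
        if table not in self._materialized:
            self._materialized[table] = copy.deepcopy(self._main[table])
        self._materialized[table].append(row)

    def materialized_tables(self):
        return sorted(self._materialized)


main = {
    "customers": [{"id": 1, "name": "Acme"}],
    "orders": [{"id": 10, "total": 100.0}],
}
branch = Branch(main)
branch.write("orders", {"id": 11, "total": 250.0})
```

After the write, `orders` is a private copy inside the branch while `customers` is still the same object as in `main`, which mirrors the diagram above.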
Architecture
Branch Lifecycle
The branching system coordinates the full lifecycle: create → query/write → merge OR discard
Operations:
| Operation | Description |
|---|---|
| Create | Set up isolated environment with zero-copy references for all main tables |
| Materialize | Convert a reference to a real table on first write (copy-on-write) |
| Query | Execute a read query within the branch |
| Write | Execute a write, materializing the target table if needed |
| Merge | Apply the branch's changes back to main using the selected merge strategy |
| Discard | Remove the branch and all its contents |
The system tracks:
- Active branches and their state (active, merging, discarded)
- Materialized tables per branch
- Reference counts for concurrent access
- TTL expiration for abandoned branches
Merge and Conflict Resolution
When merging a branch back to main, the system detects conflicts by comparing the current state of main tables against their state at branch creation time. If a table has changed in main since the branch was created, it is flagged as a conflict.
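A minimal sketch of this check, assuming each main table carries a version counter that is snapshotted when the branch is created. The version counters and the `detect_conflicts` helper are hypothetical. The sketch also assumes that only tables the branch actually materialized can conflict, since untouched tables have nothing to merge.

```python
def detect_conflicts(base_versions, current_versions, materialized):
    """Return branch-modified tables whose main version moved since branch creation."""
    return sorted(
        table
        for table in materialized
        if current_versions.get(table) != base_versions.get(table)
    )


# Versions of main tables at branch creation time (hypothetical counters).
base = {"customers": 3, "orders": 7, "products": 5}
# Versions of main tables at merge time: customers and orders have changed.
current = {"customers": 4, "orders": 8, "products": 5}
# Tables the branch wrote to.
materialized = {"orders", "products"}

conflicts = detect_conflicts(base, current, materialized)
```

Here only `orders` is flagged: `customers` changed in main but was never written in the branch, and `products` was written in the branch but is unchanged in main.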
Merge Strategies:
| Strategy | SDK alias | Behavior |
|---|---|---|
| ours | branch_wins | Branch data overwrites main for conflicting tables |
| theirs | main_wins | Main preserved; branch changes to conflicting tables discarded |
| fail_on_conflict | abort | Returns conflict list; merge aborted if any conflicts exist (default) |
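The three strategies can be illustrated with a toy merge over in-memory tables. This is a sketch under stated assumptions: `merge_branch` is a hypothetical helper, tables are plain lists, and the returned dictionary only loosely mirrors the `branch_merge` tool output.

```python
def merge_branch(main, branch_tables, conflicts, strategy="fail_on_conflict"):
    """Apply materialized branch tables to main under one of the three strategies."""
    if strategy not in ("ours", "theirs", "fail_on_conflict"):
        raise ValueError(f"unknown strategy: {strategy}")
    if conflicts and strategy == "fail_on_conflict":
        # Abort without touching main and report the conflicting tables.
        return {"status": "conflicts_detected", "conflicts": sorted(conflicts)}
    merged = 0
    for table, rows in branch_tables.items():
        if table in conflicts and strategy == "theirs":
            continue  # main wins: the branch's version of this table is dropped
        main[table] = list(rows)  # branch version replaces main's
        merged += 1
    return {"status": "merged", "tables_merged": merged}


main = {"orders": ["o1"], "products": ["p1"]}
branch_tables = {"orders": ["o1", "o2"], "products": ["p1", "p2"]}

# Default strategy: the conflict on orders aborts the merge.
aborted = merge_branch(main, branch_tables, {"orders"})

# theirs: only the non-conflicting table is merged.
main_theirs = dict(main)
result_theirs = merge_branch(main_theirs, branch_tables, {"orders"}, "theirs")

# ours: the branch overwrites main, including the conflicting table.
main_ours = dict(main)
result_ours = merge_branch(main_ours, branch_tables, {"orders"}, "ours")
```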
Garbage Collection
- Reference counting per branch — active queries increment, completion decrements
- TTL expiration (default: 1 hour) — zero-ref branches past TTL are cleaned up
- Periodic cleanup (default: every 5 minutes)
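The interaction between reference counts and TTL can be sketched as follows. `BranchRecord` and `BranchRegistry` are hypothetical names, time is passed in explicitly as plain seconds, and a real deployment would run `collect` on the periodic GC interval rather than by hand.

```python
from dataclasses import dataclass


@dataclass
class BranchRecord:
    branch_id: str
    created_at: float            # seconds on an arbitrary clock
    ttl_seconds: float = 3600.0  # 1 hour, the documented default
    ref_count: int = 0


class BranchRegistry:
    def __init__(self):
        self.branches = {}

    def add(self, record):
        self.branches[record.branch_id] = record

    def acquire(self, branch_id):
        # Called when a query starts running against the branch.
        self.branches[branch_id].ref_count += 1

    def release(self, branch_id):
        # Called when that query completes.
        self.branches[branch_id].ref_count -= 1

    def collect(self, now):
        """One GC pass: drop branches that are past TTL and have no active queries."""
        doomed = sorted(
            r.branch_id
            for r in self.branches.values()
            if r.ref_count == 0 and now >= r.created_at + r.ttl_seconds
        )
        for branch_id in doomed:
            del self.branches[branch_id]
        return doomed


registry = BranchRegistry()
registry.add(BranchRecord("branch_idle", created_at=0.0))
registry.add(BranchRecord("branch_busy", created_at=0.0))
registry.acquire("branch_busy")

first_pass = registry.collect(now=4000.0)   # busy branch survives: it has a live query
registry.release("branch_busy")
second_pass = registry.collect(now=4300.0)  # next pass, five minutes later
```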
MCP Tools
branch_create
// Input
{ "name": "q4-revenue-simulation", "description": "Simulate 10% price increase on Q4 revenue" }
// Output
{ "branch_id": "branch_a1b2c3d4", "tables_linked": 12, "created_at": "2025-01-15T10:30:00Z" }
branch_query
// Input
{ "branch_id": "branch_a1b2c3d4", "sql": "SELECT segment, SUM(revenue * 1.10) as projected FROM orders GROUP BY 1" }
// Output
{ "columns": ["segment", "projected"], "rows": [["enterprise", 4950000.00], ["mid_market", 2310000.00]] }
branch_merge
// Input
{ "branch_id": "branch_a1b2c3d4", "strategy": "ours" }
// Output (no conflicts)
{ "status": "merged", "tables_merged": 1, "tables_skipped": 11, "conflicts": [] }
// Output (conflicts with fail_on_conflict)
{ "status": "conflicts_detected", "conflicts": [{ "table": "orders", "branch_rows": 15420, "main_rows": 15380, "rows_diverged": 40 }] }
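A caller usually needs to branch on the `status` field of the merge response. The helper below is a hypothetical sketch that relies only on the fields shown in the two sample outputs above; it is not part of the SDK.

```python
def describe_merge(response):
    """Turn a branch_merge response into a one-line summary for logs or review."""
    if response["status"] == "merged":
        return (
            f"merged {response['tables_merged']} table(s), "
            f"skipped {response['tables_skipped']}"
        )
    details = ", ".join(
        f"{c['table']} ({c['rows_diverged']} rows diverged)"
        for c in response["conflicts"]
    )
    return f"merge blocked by conflicts: {details}"


ok = describe_merge(
    {"status": "merged", "tables_merged": 1, "tables_skipped": 11, "conflicts": []}
)
blocked = describe_merge(
    {
        "status": "conflicts_detected",
        "conflicts": [
            {"table": "orders", "branch_rows": 15420, "main_rows": 15380, "rows_diverged": 40}
        ],
    }
)
```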
branch_discard
// Input
{ "branch_id": "branch_a1b2c3d4" }
// Output
{ "status": "discarded", "tables_dropped": 1, "views_dropped": 11 }
branch_list
// Input
{ "include_expired": false }
// Output
[{
  "branch_id": "branch_a1b2c3d4",
  "name": "q4-revenue-simulation",
  "materialized_tables": ["orders"],
  "ref_count": 0,
  "created_at": "2025-01-15T10:30:00Z",
  "expires_at": "2025-01-15T11:30:00Z"
}]
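One possible use of this output is to spot idle branches that are about to expire, for example to merge or discard them deliberately before garbage collection does. The `expiring_soon` helper and its ten-minute window are assumptions for illustration; only the field names come from the sample output above.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(timestamp):
    # The sample payloads use a trailing "Z"; normalize it for fromisoformat.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def expiring_soon(branches, now, within=timedelta(minutes=10)):
    """Return IDs of idle branches whose expires_at falls inside the window."""
    return [
        b["branch_id"]
        for b in branches
        if b["ref_count"] == 0 and now <= parse_utc(b["expires_at"]) <= now + within
    ]


branches = [
    {
        "branch_id": "branch_a1b2c3d4",
        "name": "q4-revenue-simulation",
        "materialized_tables": ["orders"],
        "ref_count": 0,
        "created_at": "2025-01-15T10:30:00Z",
        "expires_at": "2025-01-15T11:30:00Z",
    }
]
now = datetime(2025, 1, 15, 11, 25, tzinfo=timezone.utc)
soon = expiring_soon(branches, now)
```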
Usage Example
from hatidata_agent import HatiDataAgent
agent = HatiDataAgent(
    host="your-org.proxy.hatidata.com",
    agent_id="simulation-agent",
    password="hd_live_your_api_key",
)
# Create branch
branch = agent.branch_create(
    name="pricing-experiment",
    description="Test 15% price increase on enterprise",
)
branch_id = branch["branch_id"]
# Modify data in branch (triggers copy-on-write)
agent.branch_query(branch_id, """
UPDATE orders SET total = total * 1.15
WHERE segment = 'enterprise' AND quarter = 'Q4'
""")
# Analyze results
result = agent.branch_query(branch_id, """
SELECT segment, SUM(total) as revenue, AVG(total) as avg_order
FROM orders WHERE quarter = 'Q4'
GROUP BY segment ORDER BY revenue DESC
""")
print("Projected:", result)
# Compare with unmodified main
main_result = agent.query("""
SELECT segment, SUM(total) as revenue
FROM orders WHERE quarter = 'Q4' GROUP BY segment
""")
print("Current:", main_result)
# Discard the branch (it was only needed for analysis)
agent.branch_discard(branch_id)
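For analysis-only branches like the one above, it can help to guarantee the discard even when a query raises. The context manager below is a hypothetical wrapper, not part of the SDK. It assumes only the `branch_create` and `branch_discard` calls used in the example.

```python
from contextlib import contextmanager


@contextmanager
def scratch_branch(agent, name, description=""):
    """Create a branch, yield its ID, and always discard it on exit."""
    branch = agent.branch_create(name=name, description=description)
    branch_id = branch["branch_id"]
    try:
        yield branch_id
    finally:
        # Runs on normal exit and on exceptions, so the branch never lingers.
        agent.branch_discard(branch_id)


# Usage with the agent from the example above:
# with scratch_branch(agent, "pricing-experiment") as branch_id:
#     result = agent.branch_query(branch_id, "SELECT COUNT(*) FROM orders")
```

Branches left behind would still be removed by TTL-based garbage collection, but an explicit discard frees the materialized tables right away.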
Configuration
Branching behavior is configurable per deployment:
| Setting | Default | Description |
|---|---|---|
| Branching enabled | true | Enable/disable branching |
| Branch TTL | 1 hour | Default branch time-to-live |
| Max branches per org | 50 | Max concurrent branches per organization |
| GC interval | 5 minutes | How often expired branches are cleaned up |
| Max materialized data per branch | 4 GB | Storage cap per branch |
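To make the limits concrete, the sketch below models the table as a settings object with two guard checks. The field names and helper functions are hypothetical, and the actual configuration keys are not shown in this document. Reading 4 GB as decimal gigabytes is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BranchingConfig:
    # Hypothetical field names; the defaults mirror the table above.
    enabled: bool = True
    branch_ttl_seconds: int = 3600
    max_branches_per_org: int = 50
    gc_interval_seconds: int = 300
    max_materialized_bytes: int = 4 * 1000**3  # 4 GB, read as decimal gigabytes


def can_create_branch(config, active_branches):
    """True when branching is on and the org is under its branch cap."""
    return config.enabled and active_branches < config.max_branches_per_org


def can_materialize(config, branch_bytes, table_bytes):
    """True when copying a table keeps the branch within its storage cap."""
    return branch_bytes + table_bytes <= config.max_materialized_bytes


config = BranchingConfig()
```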
Related Concepts
- Persistent Memory — Memories accessible from branches
- Chain-of-Thought Ledger — Track reasoning across branches
- Agent Identity Model — Control which agents can branch and merge