Branch Isolation

Branch isolation allows agents to create isolated copies of their data environment, run speculative queries or writes, and then merge changes back or discard them — like Git branches, but for data state.

Why Agents Need Branching

Agents frequently need to explore "what-if" scenarios without risking production data:

  • Scenario planning — Model pricing changes, market shifts, or demand forecasts
  • Safe experimentation — Write experimental transformations without risk
  • A/B testing — Run the same query against original vs. modified data
  • Multi-agent collaboration — One agent creates a branch, another reviews, a third approves the merge

How It Works

Each branch is an isolated environment. It starts as a set of zero-copy references to the tables in main. The first write to a table materializes a private copy of that table inside the branch (copy-on-write).

Main (production data)
├── customers (real table)
├── orders (real table)
└── products (real table)

Branch branch_abc123
├── customers → reference → main.customers (zero-copy)
├── orders → MATERIALIZED COPY (written to)
└── products → reference → main.products (zero-copy)

Zero-Copy References

Branch creation is instant regardless of data size. All tables in the branch initially reference the main tables directly — no data is copied.

Copy-on-Write Materialization

When an agent writes to a table in a branch:

  1. The reference is replaced with a full copy of the table
  2. The write operation is applied to the materialized copy
  3. All subsequent reads/writes go to the materialized copy

Only modified tables are materialized. Read-only tables remain as zero-copy references.
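
The steps above can be illustrated with a small in-memory model. This is only a sketch of the copy-on-write idea, not the actual storage engine: the `Branch` class, its methods, and the dict-of-lists table layout are all hypothetical.

```python
import copy


class Branch:
    """Toy model of a branch: tables are references to main until first write."""

    def __init__(self, main_tables):
        # Hypothetical layout: table name -> list of row dicts, shared with main.
        self._main = main_tables
        # Private copies, created lazily on the first write to each table.
        self._materialized = {}

    def read(self, table):
        # A materialized copy wins; otherwise read through the reference to main.
        return self._materialized.get(table, self._main[table])

    def write(self, table, update_row):
        # Step 1: on first write, replace the reference with a full copy.
        if table not in self._materialized:
            self._materialized[table] = copy.deepcopy(self._main[table])
        # Step 2: apply the write to the materialized copy only.
        for row in self._materialized[table]:
            update_row(row)

    def materialized_tables(self):
        return sorted(self._materialized)


def double_total(row):
    row["total"] = row["total"] * 2


main = {
    "customers": [{"id": 1, "segment": "enterprise"}],
    "orders": [{"id": 10, "total": 100.0}],
}
branch = Branch(main)
branch.write("orders", double_total)
```

After the write, `branch.read("orders")` returns the modified copy, `main["orders"]` is unchanged, and `branch.read("customers")` is still the very same object as `main["customers"]`.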

Architecture

Branch Lifecycle

The branching system coordinates the full lifecycle: create → query/write → merge OR discard

Operations:

Operation   | Description
Create      | Set up isolated environment with zero-copy references for all main tables
Materialize | Convert a reference to a real table on first write (copy-on-write)
Query       | Execute a read query within the branch
Write       | Execute a write, materializing the target table if needed
Discard     | Remove the branch and all its contents

The system tracks:

  • Active branches and their state (active, merging, discarded)
  • Materialized tables per branch
  • Reference counts for concurrent access
  • TTL expiration for abandoned branches

Merge and Conflict Resolution

When merging a branch back to main, the system detects conflicts by comparing the current state of main tables against their state at branch creation time. If a table has changed in main since the branch was created, it is flagged as a conflict.

Merge Strategies:

Strategy         | SDK alias   | Behavior
ours             | branch_wins | Branch data overwrites main for conflicting tables
theirs           | main_wins   | Main preserved; branch changes to conflicting tables discarded
fail_on_conflict | abort       | Returns conflict list; merge aborted if any conflicts exist (default)
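
To make the three strategies concrete, here is a minimal table-level sketch. It assumes each table carries a version counter and that the branch recorded those counters at creation time. Both are hypothetical bookkeeping devices, as is the `merge_branch` function itself. It also assumes that only tables the branch wrote to are candidates for merging.

```python
def merge_branch(main_versions, base_versions, materialized, strategy="fail_on_conflict"):
    """Decide which branch tables get written to main.

    main_versions: table -> current version on main (hypothetical counter)
    base_versions: table -> version recorded at branch creation (hypothetical)
    materialized:  set of tables the branch wrote to
    """
    if strategy not in ("ours", "theirs", "fail_on_conflict"):
        raise ValueError(f"unknown strategy: {strategy}")

    # A table conflicts if main changed since the branch was created.
    conflicts = sorted(
        t for t in materialized if main_versions[t] != base_versions[t]
    )

    if strategy == "fail_on_conflict" and conflicts:
        # Abort: nothing is merged, the caller receives the conflict list.
        return {"status": "conflicts_detected", "conflicts": conflicts, "merged": []}

    if strategy == "theirs":
        # Main wins: conflicting branch tables are dropped from the merge.
        merged = sorted(t for t in materialized if t not in conflicts)
    else:
        # "ours", or fail_on_conflict with no conflicts: every written table merges.
        merged = sorted(materialized)

    return {"status": "merged", "conflicts": conflicts, "merged": merged}
```

For example, if `orders` changed on main after branch creation and the branch wrote to both `orders` and `customers`, then `ours` merges both tables, `theirs` merges only `customers`, and `fail_on_conflict` merges nothing.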

Garbage Collection

  • Reference counting per branch — active queries increment, completion decrements
  • TTL expiration (default: 1 hour) — zero-ref branches past TTL are cleaned up
  • Periodic cleanup (default: every 5 minutes)
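
The interplay between reference counts and TTL can be sketched as follows. `BranchRegistry` and its method names are hypothetical. The sketch also assumes that the TTL is measured from branch creation, and that a periodic job calls `collect` with the current time.

```python
from dataclasses import dataclass

DEFAULT_TTL_SECONDS = 3600  # 1 hour, the documented default


@dataclass
class BranchRecord:
    branch_id: str
    created_at: float  # seconds since some epoch
    ref_count: int = 0


class BranchRegistry:
    def __init__(self, ttl_seconds=DEFAULT_TTL_SECONDS):
        self.ttl_seconds = ttl_seconds
        self.branches = {}

    def create(self, branch_id, now):
        self.branches[branch_id] = BranchRecord(branch_id, now)

    def begin_query(self, branch_id):
        # An active query pins the branch.
        self.branches[branch_id].ref_count += 1

    def end_query(self, branch_id):
        self.branches[branch_id].ref_count -= 1

    def collect(self, now):
        """Remove branches that are past TTL and have no active queries."""
        expired = [
            record.branch_id
            for record in self.branches.values()
            if record.ref_count == 0 and now - record.created_at >= self.ttl_seconds
        ]
        for branch_id in expired:
            del self.branches[branch_id]
        return expired
```

In this model, a branch that is past its TTL but still has a running query survives the cleanup pass and is collected on a later pass, once the query completes.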

MCP Tools

branch_create

// Input
{ "name": "q4-revenue-simulation", "description": "Simulate 10% price increase on Q4 revenue" }

// Output
{ "branch_id": "branch_a1b2c3d4", "tables_linked": 12, "created_at": "2025-01-15T10:30:00Z" }

branch_query

// Input
{ "branch_id": "branch_a1b2c3d4", "sql": "SELECT segment, SUM(revenue * 1.10) as projected FROM orders GROUP BY 1" }

// Output
{ "columns": ["segment", "projected"], "rows": [["enterprise", 4950000.00], ["mid_market", 2310000.00]] }

branch_merge

// Input
{ "branch_id": "branch_a1b2c3d4", "strategy": "ours" }

// Output (no conflicts)
{ "status": "merged", "tables_merged": 1, "tables_skipped": 11, "conflicts": [] }

// Output (conflicts with fail_on_conflict)
{ "status": "conflicts_detected", "conflicts": [{ "table": "orders", "branch_rows": 15420, "main_rows": 15380, "rows_diverged": 40 }] }

branch_discard

// Input
{ "branch_id": "branch_a1b2c3d4" }

// Output
{ "status": "discarded", "tables_dropped": 1, "views_dropped": 11 }

branch_list

// Input
{ "include_expired": false }

// Output
[{
  "branch_id": "branch_a1b2c3d4",
  "name": "q4-revenue-simulation",
  "materialized_tables": ["orders"],
  "ref_count": 0,
  "created_at": "2025-01-15T10:30:00Z",
  "expires_at": "2025-01-15T11:30:00Z"
}]
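
The `expires_at` field makes it easy to check how long a branch has left before it becomes eligible for cleanup. The sketch below works on one entry of the list shown above. `parse_ts` and `seconds_until_expiry` are hypothetical helpers, and the sketch assumes timestamps are always UTC with a trailing `Z`.

```python
from datetime import datetime, timezone


def parse_ts(value):
    # Replace the trailing "Z" so fromisoformat also works on Python < 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def seconds_until_expiry(branch, now):
    """Seconds left before the branch passes its TTL (negative if already past)."""
    return (parse_ts(branch["expires_at"]) - now).total_seconds()


branch = {
    "branch_id": "branch_a1b2c3d4",
    "ref_count": 0,
    "created_at": "2025-01-15T10:30:00Z",
    "expires_at": "2025-01-15T11:30:00Z",
}
now = datetime(2025, 1, 15, 11, 0, 0, tzinfo=timezone.utc)
remaining = seconds_until_expiry(branch, now)
```

With the sample entry and a current time of 11:00 UTC, `remaining` is 1800.0 seconds, half of the one-hour default TTL.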

Usage Example

from hatidata_agent import HatiDataAgent

agent = HatiDataAgent(
    host="your-org.proxy.hatidata.com",
    agent_id="simulation-agent",
    password="hd_live_your_api_key",
)

# Create branch
branch = agent.branch_create(
    name="pricing-experiment",
    description="Test 15% price increase on enterprise",
)
branch_id = branch["branch_id"]

# Modify data in branch (triggers copy-on-write)
agent.branch_query(branch_id, """
UPDATE orders SET total = total * 1.15
WHERE segment = 'enterprise' AND quarter = 'Q4'
""")

# Analyze results
result = agent.branch_query(branch_id, """
SELECT segment, SUM(total) as revenue, AVG(total) as avg_order
FROM orders WHERE quarter = 'Q4'
GROUP BY segment ORDER BY revenue DESC
""")
print("Projected:", result)

# Compare with unmodified main
main_result = agent.query("""
SELECT segment, SUM(total) as revenue
FROM orders WHERE quarter = 'Q4' GROUP BY segment
""")
print("Current:", main_result)

# Discard (was just for analysis)
agent.branch_discard(branch_id)

Configuration

Branching behavior is configurable per deployment:

Setting                          | Default   | Description
Branching enabled                | true      | Enable/disable branching
Branch TTL                       | 1 hour    | Default branch time-to-live
Max branches per org             | 50        | Max concurrent branches per organization
GC interval                      | 5 minutes | How often expired branches are cleaned up
Max materialized data per branch | 4 GB      | Storage cap per branch
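
As a rough illustration of how these limits might be enforced, the sketch below models the defaults as a dataclass. The field names and both check functions are hypothetical, since the real configuration keys are not given here. The sketch also assumes that 4 GB means 4 × 1024³ bytes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BranchingConfig:
    # Hypothetical field names; defaults mirror the table above.
    enabled: bool = True
    branch_ttl_seconds: int = 60 * 60           # 1 hour
    max_branches_per_org: int = 50
    gc_interval_seconds: int = 5 * 60           # 5 minutes
    max_materialized_bytes: int = 4 * 1024**3   # 4 GB, assumed binary


def can_create_branch(config, active_branches):
    """True if the organization may open one more branch."""
    return config.enabled and active_branches < config.max_branches_per_org


def can_materialize(config, used_bytes, table_bytes):
    """True if copying a table of table_bytes keeps the branch under its cap."""
    return used_bytes + table_bytes <= config.max_materialized_bytes
```

Because only written tables are materialized, the storage cap applies to the tables a branch modifies, not to the total size of main.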
