
Branch-Based Exploration

In this tutorial you will use HatiData's branch isolation to let agents explore data safely without affecting production state. Branches provide schema-level isolation with copy-on-write semantics: creating a branch copies no data, and a table is duplicated only when the branch first writes to it.

By the end you will have:

  • Created branches for safe agent exploration
  • Run queries and writes within a branch
  • Merged branch results back to main
  • Handled merge conflicts with different strategies

Prerequisites

  • Python 3.10+
  • HatiData proxy running locally or in the cloud
  • hatidata SDK installed

Install the SDK and set the connection variables:

pip install hatidata
export HATIDATA_API_KEY="hd_live_your_api_key"
export HATIDATA_HOST="localhost"
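Before connecting, it can help to fail fast when either variable is unset. The following is a minimal sketch that assumes only the two environment variables above and the port 5439 used in Step 2; `load_hatidata_config` is a hypothetical helper, not part of the hatidata SDK.

```python
import os


def load_hatidata_config(env=None):
    """Collect connection settings from the environment (hypothetical helper)."""
    env = os.environ if env is None else env
    missing = [k for k in ("HATIDATA_API_KEY", "HATIDATA_HOST") if not env.get(k)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "host": env["HATIDATA_HOST"],
        "port": 5439,  # port passed to the client in Step 2
        "api_key": env["HATIDATA_API_KEY"],
    }
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise in tests without touching the real environment.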

Step 1: Understand Branch Architecture

HatiData branches use schema-based isolation:

main (default schema)
├── customers
├── orders
└── products

branch_abc123 (exploration branch)
├── customers → VIEW pointing to main.customers (zero-copy)
├── orders → VIEW pointing to main.orders (zero-copy)
└── products → MATERIALIZED (copy-on-write, modified by agent)

When a branch is created, every table in main gets a zero-copy view in the branch schema. The first time the agent writes to a table within the branch, that table is materialized as a full copy. Reads from unmodified tables still reference the original data.
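The copy-on-write behavior described above can be illustrated with a toy in-memory model. This is only a sketch of the idea, not how HatiData implements branches: `ToyBranch` and its methods are hypothetical, and Python dicts and lists stand in for schemas and tables.

```python
import copy


class ToyBranch:
    """In-memory stand-in for a branch schema (illustrative only)."""

    def __init__(self, main_tables):
        self._main = main_tables   # shared reference: nothing is copied on creation
        self._materialized = {}    # tables this branch has written to

    def read(self, table):
        # Unmodified tables fall through to main, like a zero-copy view.
        return self._materialized.get(table, self._main[table])

    def write(self, table, row_index, column, value):
        # The first write materializes a private copy of the whole table.
        if table not in self._materialized:
            self._materialized[table] = copy.deepcopy(self._main[table])
        self._materialized[table][row_index][column] = value

    def modified_tables(self):
        return sorted(self._materialized)


main = {
    "customers": [{"id": 1, "name": "Acme"}],
    "products": [{"id": 10, "price": 100.0}],
}
branch = ToyBranch(main)
branch.write("products", 0, "price", 115.0)

print(branch.read("products")[0]["price"])            # the branch's private copy
print(main["products"][0]["price"])                   # main is untouched
print(branch.read("customers") is main["customers"])  # still shared with main
```

Only `products` is duplicated; `customers` remains a reference to main's data, which mirrors the view-versus-materialized split in the diagram.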


Step 2: Create a Branch

import os
from hatidata import HatiDataClient

client = HatiDataClient(
    host=os.environ["HATIDATA_HOST"],
    port=5439,
    api_key=os.environ["HATIDATA_API_KEY"],
)

# Create a branch for exploration
branch = client.branches.create(
    name="pricing-experiment",
    description="Testing new pricing tiers before applying to production",
    agent_id="pricing-agent",
    ttl_hours=24,  # Auto-cleanup after 24 hours if not merged
)

print(f"Branch created: {branch.branch_id}")
print(f"Schema: {branch.schema_name}")
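The `ttl_hours` setting implies an expiry time for the branch. As a rough sketch, assuming a branch expires exactly `ttl_hours` after it is created (the tutorial does not state the precise rule), the deadline can be computed locally. `branch_expiry` and `is_expired` are hypothetical helpers.

```python
from datetime import datetime, timedelta, timezone


def branch_expiry(created_at, ttl_hours):
    """Assumed rule: a branch expires ttl_hours after creation."""
    return created_at + timedelta(hours=ttl_hours)


def is_expired(created_at, ttl_hours, now):
    return now >= branch_expiry(created_at, ttl_hours)


created = datetime(2025, 10, 1, 9, 0, tzinfo=timezone.utc)
print(branch_expiry(created, 24).isoformat())
print(is_expired(created, 24, now=created + timedelta(hours=23)))
print(is_expired(created, 24, now=created + timedelta(hours=25)))
```

Using timezone-aware datetimes avoids ambiguity when the proxy and the agent run in different time zones.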

Step 3: Query Within a Branch

Queries within a branch see the branch's data (modified tables) plus main data (unmodified tables):

# Read from the branch -- this reads from main (zero-copy view)
products = client.branches.query(
    branch_id=branch.branch_id,
    sql="SELECT product_id, name, price, tier FROM products ORDER BY price",
)

for row in products:
    print(f" {row['name']}: ${row['price']} ({row['tier']})")

Step 4: Write Within a Branch

Writes are isolated to the branch. The first write to a table triggers copy-on-write materialization.

# Modify pricing in the branch -- this triggers copy-on-write for the products table
client.branches.write(
    branch_id=branch.branch_id,
    sql="""
        UPDATE products
        SET price = price * 1.15,
            tier = 'premium'
        WHERE category = 'enterprise'
    """,
)

# Verify the change in the branch
branch_products = client.branches.query(
    branch_id=branch.branch_id,
    sql="SELECT name, price, tier FROM products WHERE category = 'enterprise'",
)

# Verify main is unchanged
main_products = client.query(
    "SELECT name, price, tier FROM products WHERE category = 'enterprise'",
)

print("Branch prices (after 15% increase):")
for row in branch_products:
    print(f" {row['name']}: ${row['price']:.2f}")

print("\nMain prices (unchanged):")
for row in main_products:
    print(f" {row['name']}: ${row['price']:.2f}")

Step 5: Run Analysis on the Branch

Perform analytical queries on the branch to evaluate the impact of changes:

# Simulate revenue impact with new pricing
impact = client.branches.query(
    branch_id=branch.branch_id,
    sql="""
        SELECT
            p.category,
            COUNT(o.order_id) AS order_count,
            SUM(p.price * o.quantity) AS projected_revenue,
            SUM(main_p.price * o.quantity) AS current_revenue
        FROM products p
        JOIN orders o ON p.product_id = o.product_id
        JOIN main.products main_p ON p.product_id = main_p.product_id
        WHERE o.order_date >= '2025-10-01'
        GROUP BY p.category
        ORDER BY projected_revenue DESC
    """,
)

print("Revenue Impact Analysis:")
for row in impact:
    delta = row["projected_revenue"] - row["current_revenue"]
    print(f" {row['category']}: ${row['projected_revenue']:,.0f} "
          f"(+${delta:,.0f} vs current)")
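The query compares branch price times quantity against main price times quantity for the same orders. The same arithmetic can be checked by hand in plain Python; the prices and quantities below are invented for illustration and do not come from any real dataset.

```python
# Invented sample data: one enterprise product with three orders.
main_price = 200.0
branch_price = main_price * 1.15   # the 15% increase applied in Step 4
quantities = [1, 2, 5]

current_revenue = sum(main_price * q for q in quantities)
projected_revenue = sum(branch_price * q for q in quantities)
delta = projected_revenue - current_revenue

print(f"current:   ${current_revenue:,.0f}")
print(f"projected: ${projected_revenue:,.0f}")
print(f"delta:     {delta:+,.0f}")
```

Because every order uses the same multiplier, the delta is 15% of current revenue. With mixed categories the ratio varies per group, which is why the query groups by `p.category`.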

Step 6: Merge or Discard

Merge the Branch

If the experiment is successful, merge the branch changes back to main:

merge_result = client.branches.merge(
    branch_id=branch.branch_id,
    strategy="branch_wins",  # Branch changes overwrite main
)

print(f"Merge status: {merge_result.status}")
print(f"Tables merged: {merge_result.tables_merged}")
print(f"Rows affected: {merge_result.rows_affected}")

Discard the Branch

If the experiment is not worth keeping, discard it:

client.branches.discard(branch_id=branch.branch_id)
print("Branch discarded. Main data unchanged.")

Step 7: Handle Merge Conflicts

Conflicts occur when both main and the branch have modified the same rows. HatiData provides four merge strategies:

Strategy      Behavior
branch_wins   Branch changes overwrite main for conflicting rows
main_wins     Main values are kept; branch changes for conflicting rows are discarded
manual        Returns a conflict report for manual resolution
abort         Rolls back the merge if any conflicts are detected
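To make the four strategies concrete, here is a toy row-level merge over plain dicts keyed by primary key. It assumes a conflict is a key whose value differs from a common base snapshot in both main and the branch; HatiData's actual detection rules are not described in this tutorial, and `find_conflicts` and `merge_rows` are hypothetical names.

```python
STRATEGIES = {"branch_wins", "main_wins", "manual", "abort"}


def find_conflicts(base, main, branch):
    """Keys changed on both sides relative to base, to different values."""
    return sorted(
        k for k in base
        if main.get(k) != base[k]
        and branch.get(k) != base[k]
        and main.get(k) != branch.get(k)
    )


def merge_rows(base, main, branch, strategy):
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    conflicts = find_conflicts(base, main, branch)
    if conflicts and strategy == "abort":
        raise RuntimeError(f"merge aborted, conflicts on keys: {conflicts}")
    if conflicts and strategy == "manual":
        # Report instead of merging; main is returned unchanged.
        return {"status": "conflicts", "conflicts": conflicts, "rows": dict(main)}
    merged = dict(main)
    for key, value in branch.items():
        if value == base.get(key):
            continue  # the branch did not change this row
        if key in conflicts and strategy == "main_wins":
            continue  # keep main's value for conflicting rows
        merged[key] = value  # non-conflicting change, or branch_wins
    return {"status": "merged", "conflicts": conflicts, "rows": merged}


base = {1: 100, 2: 200, 3: 300}
main = {1: 100, 2: 210, 3: 300}    # main changed row 2
branch = {1: 115, 2: 230, 3: 300}  # branch changed rows 1 and 2

for strategy in ("branch_wins", "main_wins", "manual"):
    print(strategy, merge_rows(base, main, branch, strategy))
```

In this sketch, non-conflicting branch changes (row 1) are applied under both `branch_wins` and `main_wins`; only the conflicting row 2 is treated differently. The toy model ignores row deletions.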

Detecting Conflicts

# Attempt merge with conflict detection
merge_result = client.branches.merge(
    branch_id=branch.branch_id,
    strategy="manual",
)

if merge_result.has_conflicts:
    print(f"Conflicts detected in {len(merge_result.conflicts)} tables:")
    for conflict in merge_result.conflicts:
        print(f"\n Table: {conflict.table}")
        print(f" Conflicting rows: {conflict.row_count}")
        for row in conflict.rows[:5]:
            print(f" PK={row['pk']}: main={row['main_value']} vs branch={row['branch_value']}")

Resolving Conflicts

# After reviewing, resolve with a specific strategy
if merge_result.has_conflicts:
    resolution = client.branches.resolve(
        branch_id=branch.branch_id,
        merge_id=merge_result.merge_id,
        resolutions={
            "products": "branch_wins",  # Keep branch pricing changes
            "customers": "main_wins",   # Keep main customer data
        },
    )
    print(f"Conflicts resolved. Final status: {resolution.status}")
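Before submitting a per-table mapping, it may be worth checking that every conflicting table has a resolution. The sketch below is a hypothetical local check, not an SDK feature. It assumes only `branch_wins` and `main_wins` are valid per-table choices, because those are the two shown in the example; whether the SDK accepts others is not stated here.

```python
ALLOWED = {"branch_wins", "main_wins"}  # assumed per-table choices


def validate_resolutions(conflict_tables, resolutions):
    """Return a list of problems; an empty list means the mapping looks complete."""
    problems = []
    for table in conflict_tables:
        if table not in resolutions:
            problems.append(f"no resolution for table '{table}'")
    for table, strategy in resolutions.items():
        if table not in conflict_tables:
            problems.append(f"'{table}' has no reported conflict")
        if strategy not in ALLOWED:
            problems.append(f"unknown strategy '{strategy}' for table '{table}'")
    return problems


print(validate_resolutions(
    ["products", "customers"],
    {"products": "branch_wins", "orders": "main_wins"},
))
```

The list of conflicting tables could be built from `conflict.table` for each entry in `merge_result.conflicts`, as printed in the detection step.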

Step 8: List and Monitor Branches

# List all active branches
branches = client.branches.list()
for b in branches:
    print(f" {b.name} ({b.branch_id}): created {b.created_at}, "
          f"tables modified: {b.modified_table_count}")

# Check branch details
detail = client.branches.get(branch_id=branch.branch_id)
print(f"Branch: {detail.name}")
print(f"Agent: {detail.agent_id}")
print(f"Modified tables: {detail.modified_tables}")
print(f"TTL expires: {detail.expires_at}")

SQL Monitoring

-- Active branches with their sizes
SELECT
    branch_id,
    name,
    agent_id,
    created_at,
    modified_table_count,
    total_size_bytes
FROM _hatidata_branches
WHERE status = 'active'
ORDER BY created_at DESC;
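To try the shape of this query without a running proxy, the sketch below runs it against an in-memory SQLite table. The table layout is inferred from the columns the query selects. The sample rows and the `merged` status value are invented, and SQLite is only a stand-in for HatiData's actual system table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allow access to columns by name

# Assumed layout, based only on the columns used in the monitoring query.
conn.execute("""
    CREATE TABLE _hatidata_branches (
        branch_id TEXT PRIMARY KEY,
        name TEXT,
        agent_id TEXT,
        created_at TEXT,
        status TEXT,
        modified_table_count INTEGER,
        total_size_bytes INTEGER
    )
""")
conn.executemany(
    "INSERT INTO _hatidata_branches VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("b1", "pricing-experiment", "pricing-agent", "2025-10-02T09:00:00", "active", 1, 4096),
        ("b2", "old-test", "pricing-agent", "2025-09-01T09:00:00", "merged", 2, 8192),
        ("b3", "catalog-cleanup", "catalog-agent", "2025-10-03T09:00:00", "active", 0, 0),
    ],
)

rows = conn.execute("""
    SELECT branch_id, name, agent_id, created_at,
           modified_table_count, total_size_bytes
    FROM _hatidata_branches
    WHERE status = 'active'
    ORDER BY created_at DESC
""").fetchall()

for row in rows:
    print(row["name"], row["modified_table_count"], row["total_size_bytes"])
```

A branch with `modified_table_count` of zero has materialized nothing, so under the copy-on-write model it should occupy little or no storage of its own.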

What You Built

Capability                     HatiData Feature
Safe exploration               branches.create() with schema isolation
Zero-copy branch creation      Schema views (no data duplication on create)
Copy-on-write writes           Automatic materialization on first write
Revenue impact analysis        Cross-schema queries (main.table references)
Merge with conflict handling   4 merge strategies
Auto-cleanup                   TTL-based garbage collection
