Branch-Based Exploration

In this tutorial you will use HatiData's branch isolation to let agents explore data safely without affecting production state. Branches provide schema-level isolation with copy-on-write semantics: creating a branch is zero-copy, and data is duplicated only when the branch writes to a table.

By the end you will have:

Created branches for safe agent exploration
Run queries and writes within a branch
Merged branch results back to main
Handled merge conflicts with different strategies

Prerequisites

Python 3.10+
HatiData proxy running locally or in the cloud
hatidata SDK installed

pip install hatidata

export HATIDATA_API_KEY="hd_live_your_api_key"
export HATIDATA_HOST="localhost"
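
Before creating the client, it can help to fail fast when either variable is missing. The sketch below is one way to do that; `load_settings` is a hypothetical helper, not part of the hatidata SDK, and the port value is simply the one this tutorial passes to `HatiDataClient` in Step 2.

```python
import os
from typing import Mapping


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect connection settings, failing early on missing variables."""
    required = ("HATIDATA_API_KEY", "HATIDATA_HOST")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "host": env["HATIDATA_HOST"],
        "port": 5439,  # assumption: same port the tutorial uses in Step 2
        "api_key": env["HATIDATA_API_KEY"],
    }


# Demonstration with an explicit mapping instead of the real environment.
settings = load_settings(
    {"HATIDATA_API_KEY": "hd_live_your_api_key", "HATIDATA_HOST": "localhost"}
)
print(settings["host"], settings["port"])
```

The returned dictionary has the same keys as the keyword arguments used in Step 2, so it could be unpacked into the client constructor.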

Step 1: Understand Branch Architecture

HatiData branches use schema-based isolation:

main (default schema)
├── customers
├── orders
└── products

branch_abc123 (exploration branch)
├── customers  →  VIEW pointing to main.customers (zero-copy)
├── orders     →  VIEW pointing to main.orders (zero-copy)
└── products   →  MATERIALIZED (copy-on-write, modified by agent)

When a branch is created, every table in main gets a zero-copy view in the branch schema. The first time the agent writes to a table within the branch, that table is materialized as a full copy. Reads from unmodified tables still reference the original data.
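
The behavior described above can be illustrated with a toy in-memory model. This is only a sketch of the copy-on-write idea, not HatiData's implementation; `ToyBranch` and its methods are hypothetical names, and plain Python lists stand in for tables.

```python
import copy


class ToyBranch:
    """Toy model of copy-on-write: tables are shared with main until written."""

    def __init__(self, main: dict):
        self._main = main        # shared reference, nothing is copied on creation
        self._materialized = {}  # tables copied into the branch on first write

    def read(self, table: str) -> list:
        # Modified tables come from the branch copy; the rest resolve to main.
        return self._materialized.get(table, self._main[table])

    def write(self, table: str, update) -> None:
        # The first write copies the whole table; later writes reuse that copy.
        if table not in self._materialized:
            self._materialized[table] = copy.deepcopy(self._main[table])
        for row in self._materialized[table]:
            update(row)

    @property
    def modified_tables(self) -> list:
        return sorted(self._materialized)


main = {
    "customers": [{"id": 1, "name": "Acme"}],
    "products": [{"id": 10, "price": 100.0}],
}
branch = ToyBranch(main)
print(branch.modified_tables)  # nothing is materialized at creation

branch.write("products", lambda row: row.update(price=row["price"] * 2))
print(branch.read("products")[0]["price"])            # the branch sees its own copy
print(main["products"][0]["price"])                   # main is untouched
print(branch.read("customers") is main["customers"])  # still shared with main
```

Only `products` ends up in the branch's private storage; `customers` continues to resolve to the same object that main holds, which is the in-memory analogue of a view.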

Step 2: Create a Branch

import os
from hatidata import HatiDataClient

client = HatiDataClient(
    host=os.environ["HATIDATA_HOST"],
    port=5439,
    api_key=os.environ["HATIDATA_API_KEY"],
)

# Create a branch for exploration
branch = client.branches.create(
    name="pricing-experiment",
    description="Testing new pricing tiers before applying to production",
    agent_id="pricing-agent",
    ttl_hours=24,  # Auto-cleanup after 24 hours if not merged
)

print(f"Branch created: {branch.branch_id}")
print(f"Schema: {branch.schema_name}")

Step 3: Query Within a Branch

Queries within a branch see the branch's data (modified tables) plus main data (unmodified tables):

# Read from the branch -- products is unmodified so far, so this resolves to main through the zero-copy view
products = client.branches.query(
    branch_id=branch.branch_id,
    sql="SELECT product_id, name, price, tier FROM products ORDER BY price",
)

for row in products:
    print(f"  {row['name']}: ${row['price']} ({row['tier']})")

Step 4: Write Within a Branch

Writes are isolated to the branch. The first write to a table triggers copy-on-write materialization.

# Modify pricing in the branch -- this triggers copy-on-write for the products table
client.branches.write(
    branch_id=branch.branch_id,
    sql="""
        UPDATE products
        SET price = price * 1.15,
            tier = 'premium'
        WHERE category = 'enterprise'
    """,
)

# Verify the change in the branch
branch_products = client.branches.query(
    branch_id=branch.branch_id,
    sql="SELECT name, price, tier FROM products WHERE category = 'enterprise'",
)

# Verify main is unchanged
main_products = client.query(
    "SELECT name, price, tier FROM products WHERE category = 'enterprise'",
)

print("Branch prices (after 15% increase):")
for row in branch_products:
    print(f"  {row['name']}: ${row['price']:.2f}")

print("\nMain prices (unchanged):")
for row in main_products:
    print(f"  {row['name']}: ${row['price']:.2f}")
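
Instead of comparing the two printouts by eye, the results can be diffed in code. The following is a minimal sketch that assumes query results are lists of dict-like rows, as the `row['name']` access above suggests; `changed_rows` is a hypothetical helper and the sample rows are illustrative values, not real query output.

```python
def changed_rows(main_rows, branch_rows, key="name", fields=("price", "tier")):
    """Return (key, field, main_value, branch_value) for every differing field."""
    main_by_key = {row[key]: row for row in main_rows}
    changes = []
    for row in branch_rows:
        original = main_by_key.get(row[key])
        if original is None:
            continue  # row exists only in the branch; not compared here
        for field in fields:
            if original[field] != row[field]:
                changes.append((row[key], field, original[field], row[field]))
    return changes


# Sample rows standing in for main_products and branch_products.
main_sample = [{"name": "Enterprise Suite", "price": 1000.0, "tier": "standard"}]
branch_sample = [{"name": "Enterprise Suite", "price": 1150.0, "tier": "premium"}]

for name, field, before, after in changed_rows(main_sample, branch_sample):
    print(f"{name}: {field} {before!r} -> {after!r}")
```

An empty result would mean the branch and main agree on the compared fields, which is a quick way to confirm that a write has or has not taken effect.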

Step 5: Run Analysis on the Branch

Perform analytical queries on the branch to evaluate the impact of changes:

# Simulate revenue impact with new pricing
impact = client.branches.query(
    branch_id=branch.branch_id,
    sql="""
        SELECT
            p.category,
            COUNT(o.order_id) AS order_count,
            SUM(p.price * o.quantity) AS projected_revenue,
            SUM(main_p.price * o.quantity) AS current_revenue
        FROM products p
        JOIN orders o ON p.product_id = o.product_id
        JOIN main.products main_p ON p.product_id = main_p.product_id
        WHERE o.order_date >= '2025-10-01'
        GROUP BY p.category
        ORDER BY projected_revenue DESC
    """,
)

print("Revenue Impact Analysis:")
for row in impact:
    delta = row["projected_revenue"] - row["current_revenue"]
    sign = "+" if delta >= 0 else "-"  # avoid printing "+$-..." for a decrease
    print(f"  {row['category']}: ${row['projected_revenue']:,.0f} "
          f"({sign}${abs(delta):,.0f} vs current)")
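
To see what the two `SUM` expressions compute, here is the same arithmetic worked by hand on made-up numbers: one enterprise product and two orders. The price and quantities are illustrative assumptions; only the 15% increase comes from Step 4.

```python
from decimal import Decimal

# Illustrative numbers only: one enterprise product and two orders.
main_price = Decimal("200.00")
branch_price = main_price * Decimal("1.15")  # the 15% increase from Step 4
quantities = [3, 5]

# Mirrors SUM(main_p.price * o.quantity) and SUM(p.price * o.quantity).
current_revenue = sum(main_price * q for q in quantities)
projected_revenue = sum(branch_price * q for q in quantities)
delta = projected_revenue - current_revenue

print(f"current={current_revenue} projected={projected_revenue} delta={delta}")
```

Because every affected price is scaled by the same factor, the delta for a category made up entirely of repriced products is 15% of its current revenue.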

Step 6: Merge or Discard

Merge the Branch

If the experiment is successful, merge the branch changes back to main:

merge_result = client.branches.merge(
    branch_id=branch.branch_id,
    strategy="branch_wins",  # Branch changes overwrite main
)

print(f"Merge status: {merge_result.status}")
print(f"Tables merged: {merge_result.tables_merged}")
print(f"Rows affected: {merge_result.rows_affected}")

Discard the Branch

If the experiment is not worth keeping, discard it:

client.branches.discard(branch_id=branch.branch_id)
print("Branch discarded. Main data unchanged.")
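
When an agent's analysis can fail partway through, it may be convenient to discard the branch automatically. The sketch below wraps the `branches.create` and `branches.discard` calls shown in this tutorial in a context manager; `exploration_branch` is a hypothetical helper, and the fake client exists only so the example runs without a HatiData proxy.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def exploration_branch(client, **create_kwargs):
    """Create a branch and discard it if the body raises.

    Assumes client.branches.create and client.branches.discard behave as
    shown in this tutorial. On success the branch is left in place so the
    caller can decide whether to merge it.
    """
    branch = client.branches.create(**create_kwargs)
    try:
        yield branch
    except Exception:
        client.branches.discard(branch_id=branch.branch_id)
        raise


class _FakeBranches:
    """Stand-in for client.branches, recording which branches were discarded."""

    def __init__(self):
        self.discarded = []

    def create(self, **kwargs):
        return SimpleNamespace(branch_id="branch_abc123", **kwargs)

    def discard(self, branch_id):
        self.discarded.append(branch_id)


fake_client = SimpleNamespace(branches=_FakeBranches())

try:
    with exploration_branch(fake_client, name="pricing-experiment") as branch:
        raise ValueError("analysis failed")
except ValueError:
    pass

print(fake_client.branches.discarded)
```

The helper deliberately does not merge on success, since whether to merge is a judgment the tutorial leaves to the caller.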

Step 7: Handle Merge Conflicts

Conflicts occur when both main and the branch have modified the same rows. HatiData provides four merge strategies:

Strategy	Behavior
`branch_wins`	Branch changes overwrite main for conflicting rows
`main_wins`	Main values are kept; branch changes for conflicting rows are discarded
`manual`	Returns a conflict report for manual resolution
`abort`	Rolls back the merge if any conflicts are detected
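
The four strategies in the table above can be compared with a toy three-way merge over `{pk: value}` dictionaries. This is a sketch of the semantics as the table describes them, not HatiData's merge engine. `toy_merge` is a hypothetical function, it ignores deleted rows, and it assumes that `manual` and `abort` both leave main untouched when a conflict is found.

```python
STRATEGIES = ("branch_wins", "main_wins", "manual", "abort")


def toy_merge(base, main, branch, strategy):
    """Three-way merge of {pk: value} tables under one of the four strategies.

    A row conflicts when main and branch both changed it relative to base
    and ended up with different values.
    """
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    merged, conflicts = dict(main), []
    for pk, branch_value in branch.items():
        base_value, main_value = base.get(pk), main.get(pk)
        if branch_value == base_value:
            continue  # the branch did not touch this row
        if main_value == base_value or main_value == branch_value:
            merged[pk] = branch_value  # only the branch changed it, or both agree
            continue
        conflicts.append(
            {"pk": pk, "main_value": main_value, "branch_value": branch_value}
        )
        if strategy == "branch_wins":
            merged[pk] = branch_value
        # "main_wins" keeps main's value; "manual" and "abort" are handled below
    if conflicts and strategy in ("manual", "abort"):
        return dict(main), conflicts  # main left untouched, conflicts reported
    return merged, conflicts


base = {1: 100, 2: 200, 3: 300}
main = {1: 100, 2: 210, 3: 300}    # main changed row 2
branch = {1: 115, 2: 230, 3: 300}  # branch changed rows 1 and 2

for strategy in STRATEGIES:
    merged, conflicts = toy_merge(base, main, branch, strategy)
    print(strategy, merged, [c["pk"] for c in conflicts])
```

Row 1 is changed only in the branch, so it merges cleanly under `branch_wins` and `main_wins`; row 2 is the only conflict, and the strategies differ only in what happens to it.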

Detecting Conflicts

# Attempt merge with conflict detection
merge_result = client.branches.merge(
    branch_id=branch.branch_id,
    strategy="manual",
)

if merge_result.has_conflicts:
    print(f"Conflicts detected in {len(merge_result.conflicts)} tables:")
    for conflict in merge_result.conflicts:
        print(f"\n  Table: {conflict.table}")
        print(f"  Conflicting rows: {conflict.row_count}")
        for row in conflict.rows[:5]:
            print(f"    PK={row['pk']}: main={row['main_value']} vs branch={row['branch_value']}")

Resolving Conflicts

# After reviewing, resolve with a specific strategy
if merge_result.has_conflicts:
    resolution = client.branches.resolve(
        branch_id=branch.branch_id,
        merge_id=merge_result.merge_id,
        resolutions={
            "products": "branch_wins",  # Keep branch pricing changes
            "customers": "main_wins",   # Keep main customer data
        },
    )
    print(f"Conflicts resolved. Final status: {resolution.status}")

Step 8: List and Monitor Branches

# List all active branches
branches = client.branches.list()
for b in branches:
    print(f"  {b.name} ({b.branch_id}): created {b.created_at}, "
          f"tables modified: {b.modified_table_count}")

# Check branch details
detail = client.branches.get(branch_id=branch.branch_id)
print(f"Branch: {detail.name}")
print(f"Agent: {detail.agent_id}")
print(f"Modified tables: {detail.modified_tables}")
print(f"TTL expires: {detail.expires_at}")
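
Because branches are cleaned up when their TTL expires, it can be useful to flag those that are about to disappear. The sketch below assumes branch detail objects expose `name` and a timezone-aware `expires_at` datetime; the actual type returned by the SDK is not confirmed here, and `expiring_soon` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace


def expiring_soon(branches, within_hours=2, now=None):
    """Return names of branches whose TTL ends within `within_hours`."""
    now = now or datetime.now(timezone.utc)
    cutoff = now + timedelta(hours=within_hours)
    return [b.name for b in branches if now <= b.expires_at <= cutoff]


# Sample objects standing in for branch details; the timestamps are made up.
now = datetime(2025, 10, 1, 12, 0, tzinfo=timezone.utc)
sample = [
    SimpleNamespace(name="pricing-experiment", expires_at=now + timedelta(hours=1)),
    SimpleNamespace(name="long-running", expires_at=now + timedelta(hours=20)),
]
print(expiring_soon(sample, within_hours=2, now=now))
```

Passing `now` explicitly keeps the function deterministic in tests; in normal use it can be omitted.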

SQL Monitoring

-- Active branches with their sizes
SELECT
    branch_id,
    name,
    agent_id,
    created_at,
    modified_table_count,
    total_size_bytes
FROM _hatidata_branches
WHERE status = 'active'
ORDER BY created_at DESC;

What You Built

Capability	HatiData Feature
Safe exploration	`branches.create()` with schema isolation
Zero-copy branch creation	Schema views (no data duplication on create)
Copy-on-write writes	Automatic materialization on first write
Revenue impact analysis	Cross-schema queries (`main.table` references)
Merge with conflict handling	Four merge strategies (`branch_wins`, `main_wins`, `manual`, `abort`)
Auto-cleanup	TTL-based garbage collection

Related Concepts

Branch Isolation -- Full architecture reference
Branch Recipes -- Advanced branching patterns
Research Agent Tutorial -- Branching for research agents
MCP Tools Reference -- branch_create, branch_merge tools
Concurrency Model -- How branches handle concurrent access
