
Migrate from Pinecone

Pinecone stores vectors and returns approximate nearest neighbors. HatiData stores vectors and SQL data together — you can filter by metadata, join with business tables, apply governance policies, and branch agent state, all in a single query. This guide covers what changes, how to migrate your vectors, and how to rewrite queries.

Vector-Only vs Hybrid Architecture

| Capability | Pinecone | HatiData |
| --- | --- | --- |
| Vector storage | Yes | Yes (built-in vector engine) |
| ANN similarity search | Yes | Yes (cosine, dot, L2) |
| SQL queries on vector data | No | Yes — semantic_match() in SQL |
| Metadata filtering | Limited key-value | Full SQL predicates |
| Join with business tables | Not supported | Native SQL join |
| Long-term agent memory | Not supported | SQL + vector hybrid |
| Chain-of-thought ledger | Not supported | Cryptographically hash-chained |
| Semantic triggers | Not supported | Built-in trigger evaluation |
| Branch isolation | Not supported | Per-agent schema branches |
| Governance + audit | Not supported | Row-level policies, audit trail |
| Per-agent billing | Not supported | Native |
| Wire protocol | REST API only | Postgres (any SQL client) |

Memory Migration

Export your Pinecone vectors and import them as HatiData memories with full metadata preservation.

Step 1: Export from Pinecone

import json

import pinecone

pc = pinecone.Pinecone(api_key="your-pinecone-key")
index = pc.Index("my-index")

# Fetch all vectors in batches
exported = []
for ids_batch in fetch_all_ids(index):  # your pagination logic
    result = index.fetch(ids=ids_batch)
    for vec_id, vec_data in result.vectors.items():
        exported.append({
            "id": vec_id,
            "values": vec_data.values,
            "metadata": vec_data.metadata,
        })

with open("pinecone-export.jsonl", "w") as f:
    for item in exported:
        f.write(json.dumps(item) + "\n")
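The fetch_all_ids pagination is left to you: on serverless indexes, Pinecone's index.list() yields pages of vector IDs, which you can re-chunk to stay within fetch batch limits. A minimal sketch of the chunking half, assuming a batch size of 100 (check your index's actual fetch limit):

```python
from itertools import islice

def chunked(ids, size=100):
    """Yield successive batches of at most `size` IDs from any iterable."""
    it = iter(ids)
    while batch := list(islice(it, size)):
        yield batch

# Re-chunk a flat list of vector IDs into fetch-sized batches
batches = list(chunked([f"vec-{i}" for i in range(250)], size=100))
```

Because chunked() accepts any iterable, the same helper works whether your IDs come from index.list(), a metadata store, or a file.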

Step 2: Import as HatiData Memories

hati memory import \
  --source pinecone-export.jsonl \
  --agent-id my-agent \
  --org-id my-org \
  --format pinecone-jsonl

The importer maps Pinecone metadata keys to HatiData memory fields, re-indexes the vectors in the built-in vector engine, and mirrors all metadata into the data layer so it is queryable with full SQL.
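To illustrate the shape of that mapping, here is a sketch of how one exported record might translate to a memory row. The output field names and the "text" metadata convention are assumptions for illustration, not the importer's actual schema:

```python
def pinecone_record_to_memory(record, agent_id):
    """Sketch of an export-record -> memory-row mapping.

    Output field names are illustrative assumptions, not HatiData's
    actual internal schema.
    """
    metadata = dict(record.get("metadata") or {})
    # Many Pinecone setups keep the source text under a metadata key
    # such as "text"; fall back to empty content if it is absent.
    content = metadata.pop("text", "")
    return {
        "memory_id": record["id"],
        "agent_id": agent_id,
        "content": content,
        "embedding": record["values"],
        "metadata": metadata,  # remaining keys stay queryable metadata
    }

row = pinecone_record_to_memory(
    {"id": "v1", "values": [0.1, 0.2], "metadata": {"text": "hi", "source": "chat"}},
    agent_id="my-agent",
)
```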

Step 3: Verify Import

-- Confirm memory count
SELECT COUNT(*) FROM _hatidata_memories
WHERE agent_id = 'my-agent';

-- Spot-check a migrated memory
SELECT memory_id, content, metadata, created_at
FROM _hatidata_memories
WHERE agent_id = 'my-agent'
LIMIT 5;
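A quick cross-check is to compare the SQL COUNT(*) against the number of records in your export file. A minimal sketch (the sample file here stands in for your real pinecone-export.jsonl):

```python
import json

def count_jsonl_records(path):
    """Count non-empty lines in a JSONL export file."""
    with open(path) as f:
        return sum(1 for line in f if line.strip())

# Write a tiny sample export, then count it; with your real file,
# this count should match the COUNT(*) returned by the query above.
sample = [
    {"id": "a", "values": [0.1], "metadata": {}},
    {"id": "b", "values": [0.2], "metadata": {}},
]
with open("sample-export.jsonl", "w") as f:
    for item in sample:
        f.write(json.dumps(item) + "\n")

n = count_jsonl_records("sample-export.jsonl")
```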

Query Migration

Pinecone Query API vs HatiData SQL

# Before: Pinecone REST query
result = index.query(
    vector=query_embedding,
    top_k=10,
    filter={"source": {"$eq": "customer-support"}},
    include_metadata=True,
)

# After: HatiData SQL via any Postgres client
import psycopg2

conn = psycopg2.connect("postgresql://myuser:mypass@localhost:5439/mydb")
cur = conn.cursor()

cur.execute("""
    SELECT memory_id, content, metadata,
           semantic_match(content, %s) AS similarity
    FROM _hatidata_memories
    WHERE agent_id = 'my-agent'
      AND metadata->>'source' = 'customer-support'
    ORDER BY similarity DESC
    LIMIT 10
""", (query_embedding,))

Hybrid Search: SQL + Vector Together

HatiData's hybrid search combines vector ANN pre-filtering with exact SQL joins — you get vector recall with SQL precision.

-- Find memories similar to a query, filtered by recency and joined with events
SELECT
    m.memory_id,
    m.content,
    semantic_match(m.content, :query_embedding) AS similarity,
    e.event_type,
    e.occurred_at
FROM _hatidata_memories m
JOIN agent_events e USING (session_id)
WHERE m.agent_id = 'my-agent'
  AND m.created_at > NOW() - INTERVAL '7 days'
ORDER BY similarity DESC
LIMIT 20;

This query is not possible in a vector-only database.
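When binding the embedding from Python, the driver needs it serialized in a form the server accepts. A common convention among Postgres-compatible vector engines is a bracketed, comma-separated text literal (modeled here on the pgvector format; whether HatiData accepts exactly this form is an assumption to verify against your deployment):

```python
def to_vector_literal(embedding):
    """Serialize a list of floats as a pgvector-style text literal,
    e.g. [0.1,0.25,-1.0]. The target format is an assumption modeled
    on pgvector; confirm against your server's accepted input.
    """
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"

lit = to_vector_literal([0.1, 0.25, -1.0])
```

You would then pass lit as the query parameter (the %s or :query_embedding placeholder) in the queries shown above.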

Using the Python SDK

from hatidata import HatiDataClient

client = HatiDataClient(
    connection="postgresql://myuser:mypass@localhost:5439/mydb",
    agent_id="my-agent",
)

# Store a memory (replaces index.upsert)
client.memory.store(
    content="User prefers concise responses",
    metadata={"source": "preference", "confidence": 0.9},
)

# Search memories (replaces index.query)
results = client.memory.search(
    query="response style preferences",
    top_k=5,
    filters={"source": "preference"},
)

What You Gain

Moving from a vector-only database to HatiData gives agents a complete cognitive infrastructure:

  • SQL queries on vector data — filter, join, aggregate alongside embeddings
  • Hybrid search — vector ANN pre-filter + exact cosine verification for high-recall, high-precision results
  • Governance — row-level policies, audit trails, and per-agent access control on every memory read
  • Chain-of-thought ledger — immutable, cryptographically hash-chained reasoning traces stored alongside memories
  • Semantic triggers — fire webhooks or agent notifications when stored content crosses a similarity threshold
  • Branch isolation — create a copy-on-write branch of agent state for safe experimentation
  • Single wire protocol — Postgres-compatible, works with psycopg2, asyncpg, SQLAlchemy, dbt, and any BI tool
