# Migrate from Pinecone
Pinecone stores vectors and returns approximate nearest neighbors. HatiData stores vectors and SQL data together — you can filter by metadata, join with business tables, apply governance policies, and branch agent state, all in a single query. This guide covers what changes, how to migrate your vectors, and how to rewrite queries.
## Vector-Only vs Hybrid Architecture
| Capability | Pinecone | HatiData |
|---|---|---|
| Vector storage | Yes | Yes (built-in vector engine) |
| ANN similarity search | Yes | Yes (cosine, dot, L2) |
| SQL queries on vector data | No | Yes — semantic_match() in SQL |
| Metadata filtering | Limited key-value | Full SQL predicates |
| Join with business tables | Not supported | Native SQL join |
| Long-term agent memory | Not supported | SQL + vector hybrid |
| Chain-of-thought ledger | Not supported | Cryptographically hash-chained |
| Semantic triggers | Not supported | Built-in trigger evaluation |
| Branch isolation | Not supported | Per-agent schema branches |
| Governance + audit | Not supported | Row-level policies, audit trail |
| Per-agent billing | Not supported | Native |
| Wire protocol | REST API only | Postgres (any SQL client) |
## Memory Migration
Export your Pinecone vectors and import them as HatiData memories with full metadata preservation.
### Step 1: Export from Pinecone
```python
import json

import pinecone

pc = pinecone.Pinecone(api_key="your-pinecone-key")
index = pc.Index("my-index")

# Fetch all vectors in batches
exported = []
for ids_batch in fetch_all_ids(index):  # your pagination logic
    result = index.fetch(ids=ids_batch)
    for vec_id, vec_data in result.vectors.items():
        exported.append({
            "id": vec_id,
            "values": vec_data.values,
            "metadata": vec_data.metadata,
        })

with open("pinecone-export.jsonl", "w") as f:
    for item in exported:
        f.write(json.dumps(item) + "\n")
```
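The `fetch_all_ids` helper is left to your own pagination logic. One possible sketch, assuming a recent pinecone client (v3+) that exposes the `Index.list()` ID-pagination generator; older clients need a different strategy:

```python
def fetch_all_ids(index, namespace=""):
    """Yield batches of vector IDs from a Pinecone index.

    A sketch only: relies on Index.list(), the ID-pagination generator
    in recent pinecone clients. Adjust namespace handling to your setup.
    """
    for ids_page in index.list(namespace=namespace):
        yield ids_page
```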
### Step 2: Import as HatiData Memories
```bash
hati memory import \
  --source pinecone-export.jsonl \
  --agent-id my-agent \
  --org-id my-org \
  --format pinecone-jsonl
```
The importer maps Pinecone metadata keys to HatiData memory fields, re-indexes the vectors in the built-in vector engine, and mirrors all metadata into the SQL data layer so it can be queried with full SQL predicates.
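Before running the import, it can be worth sanity-checking the export file. A minimal check, assuming the record shape produced by the export script above (`id`, `values`, `metadata`):

```python
import json

def validate_export(path):
    """Return (record_count, dims_seen) for a pinecone-export.jsonl file."""
    count, dims = 0, set()
    with open(path) as f:
        for line in f:
            rec = json.loads(line)
            assert "id" in rec and "values" in rec, f"malformed record: {rec}"
            dims.add(len(rec["values"]))
            count += 1
    return count, dims

# count, dims = validate_export("pinecone-export.jsonl")
# assert len(dims) == 1, "mixed embedding dimensions in export"
```

A mixed-dimension export usually means vectors from different embedding models ended up in one index; sort that out before importing.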
### Step 3: Verify Import
```sql
-- Confirm memory count
SELECT COUNT(*) FROM _hatidata_memories
WHERE agent_id = 'my-agent';

-- Spot-check a migrated memory
SELECT memory_id, content, metadata, created_at
FROM _hatidata_memories
WHERE agent_id = 'my-agent'
LIMIT 5;
```
## Query Migration

### Pinecone Query API vs HatiData SQL
```python
# Before: Pinecone REST query
result = index.query(
    vector=query_embedding,
    top_k=10,
    filter={"source": {"$eq": "customer-support"}},
    include_metadata=True,
)

# After: HatiData SQL via any Postgres client
import psycopg2

conn = psycopg2.connect("postgresql://myuser:mypass@localhost:5439/mydb")
cur = conn.cursor()
cur.execute("""
    SELECT memory_id, content, metadata,
           semantic_match(content, %s) AS similarity
    FROM _hatidata_memories
    WHERE agent_id = 'my-agent'
      AND metadata->>'source' = 'customer-support'
    ORDER BY similarity DESC
    LIMIT 10
""", (query_embedding,))
```
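Pinecone's filter operators map mechanically onto SQL predicates over the JSONB `metadata` column. A rough translator for the common operators, sketched as a helper (illustrative only; in real queries, prefer bound parameters over string interpolation, and handle `$and`/`$or` nesting):

```python
# Sketch: translating common Pinecone filter operators into SQL predicates
# over the JSONB metadata column. Illustrative only: real code should use
# bound parameters, not string interpolation.
def filter_to_sql(filter_dict):
    clauses = []
    for key, cond in filter_dict.items():
        for op, value in cond.items():
            if op == "$eq":
                clauses.append(f"metadata->>'{key}' = '{value}'")
            elif op == "$ne":
                clauses.append(f"metadata->>'{key}' <> '{value}'")
            elif op == "$in":
                quoted = ", ".join(f"'{v}'" for v in value)
                clauses.append(f"metadata->>'{key}' IN ({quoted})")
            else:
                raise ValueError(f"unsupported operator: {op}")
    return " AND ".join(clauses)

print(filter_to_sql({"source": {"$eq": "customer-support"}}))
# metadata->>'source' = 'customer-support'
```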
### Hybrid Search: SQL + Vector Together
HatiData's hybrid search combines vector ANN pre-filtering with exact SQL joins — you get vector recall with SQL precision.
```sql
-- Find memories similar to a query, filtered by recency and joined with events
SELECT
    m.memory_id,
    m.content,
    semantic_match(m.content, :query_embedding) AS similarity,
    e.event_type,
    e.occurred_at
FROM _hatidata_memories m
JOIN agent_events e USING (session_id)
WHERE m.agent_id = 'my-agent'
  AND m.created_at > NOW() - INTERVAL '7 days'
ORDER BY similarity DESC
LIMIT 20;
```
This query is not possible in a vector-only database.
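For contrast, approximating that join against a vector-only store takes two round trips and a client-side merge. A sketch (`events_by_session` is illustrative; the event data would live in a separate operational database):

```python
# Vector-only workaround: over-fetch from the index, then join with event
# data client-side. Two systems, two round trips, no exact SQL filtering.
def vector_only_equivalent(index, query_embedding, events_by_session, limit=20):
    result = index.query(
        vector=query_embedding,
        top_k=100,  # over-fetch, since filtering happens after the fact
        include_metadata=True,
    )
    joined = []
    for match in result.matches:
        session_id = match.metadata.get("session_id")
        for event in events_by_session.get(session_id, []):
            joined.append((match.id, match.score, event))
    return joined[:limit]
```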
### Using the Python SDK
```python
from hatidata import HatiDataClient

client = HatiDataClient(
    connection="postgresql://myuser:mypass@localhost:5439/mydb",
    agent_id="my-agent",
)

# Store a memory (replaces index.upsert)
client.memory.store(
    content="User prefers concise responses",
    metadata={"source": "preference", "confidence": 0.9},
)

# Search memories (replaces index.query)
results = client.memory.search(
    query="response style preferences",
    top_k=5,
    filters={"source": "preference"},
)
```
## What You Gain
Moving from a vector-only database to HatiData gives agents a complete cognitive infrastructure:
- SQL queries on vector data — filter, join, aggregate alongside embeddings
- Hybrid search — vector ANN pre-filter + exact cosine verification for high-recall, high-precision results
- Governance — row-level policies, audit trails, and per-agent access control on every memory read
- Chain-of-thought ledger — immutable, cryptographically hash-chained reasoning traces stored alongside memories
- Semantic triggers — fire webhooks or agent notifications when stored content crosses a similarity threshold
- Branch isolation — create a copy-on-write branch of agent state for safe experimentation
- Single wire protocol — Postgres-compatible, works with psycopg2, asyncpg, SQLAlchemy, dbt, and any BI tool
## Related Concepts
- Persistent Memory — Architecture and API
- Hybrid SQL — ANN + SQL combined
- Semantic Triggers — Cosine-based event firing
- Branch Isolation — Per-agent schema branching
- Chain-of-Thought Ledger — Immutable reasoning traces