Ollama Integration (Air-Gapped)
HatiData runs entirely locally with Ollama for fully air-gapped deployments where no data or queries leave your network. This combination gives agents access to a SQL data warehouse and persistent memory backed by a local LLM -- ideal for regulated industries, classified environments, and privacy-sensitive workloads.
Installation
Install Ollama
# macOS
brew install ollama
# Linux
curl -fsSL https://ollama.ai/install.sh | sh
Pull a model:
ollama pull llama3.1
Install HatiData
# Install the CLI
cargo install hatidata-cli
# Initialize a local instance (runs DuckDB locally, no cloud connection)
HATIDATA_DEV_MODE=true hati init
Install the Python SDK
pip install hatidata-agent
Configure
With both Ollama and HatiData running locally, configure the agent to connect to local endpoints:
from hatidata_agent import HatiDataAgent
hati = HatiDataAgent(
    host="localhost",
    port=5439,
    agent_id="ollama-agent",
    framework="ollama",
    database="hatidata",
)
No API key is needed in local dev mode. The proxy accepts unauthenticated connections when HATIDATA_DEV_MODE=true.
| Parameter | Default | Description |
|---|---|---|
| host | "localhost" | HatiData proxy hostname |
| port | 5439 | Proxy port |
| agent_id | "agent" | Unique identifier for audit trails |
| framework | "custom" | Set to "ollama" for proper tracking |
| database | "hatidata" | Target database |
Basic Usage
Use Ollama's Python client alongside HatiData for a fully local agent:
import ollama
from hatidata_agent import HatiDataAgent
# Local HatiData connection
hati = HatiDataAgent(
    host="localhost",
    port=5439,
    agent_id="local-analyst",
    framework="ollama",
)
# Define tool functions
def query_warehouse(sql: str) -> str:
    """Execute a SQL query against the local HatiData warehouse."""
    try:
        rows = hati.query(sql)
        return str(rows) if rows else "No results."
    except Exception as e:
        return f"Error: {e}"

def list_tables() -> str:
    """List all available tables."""
    rows = hati.query(
        "SELECT table_name FROM information_schema.tables WHERE table_schema = 'main'"
    )
    return ", ".join(r["table_name"] for r in rows)

def describe_table(table_name: str) -> str:
    """Get column names and types for a table."""
    rows = hati.query(
        f"SELECT column_name, data_type FROM information_schema.columns "
        f"WHERE table_name = '{table_name}'"
    )
    return "\n".join(f"{r['column_name']} {r['data_type']}" for r in rows)
# Define tools for Ollama's tool calling
tools = [
    {
        "type": "function",
        "function": {
            "name": "query_warehouse",
            "description": "Execute a SQL query against the data warehouse",
            "parameters": {
                "type": "object",
                "properties": {
                    "sql": {"type": "string", "description": "The SQL query to execute"}
                },
                "required": ["sql"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "list_tables",
            "description": "List all available tables in the data warehouse",
            "parameters": {"type": "object", "properties": {}},
        },
    },
    {
        "type": "function",
        "function": {
            "name": "describe_table",
            "description": "Get column names and data types for a specific table",
            "parameters": {
                "type": "object",
                "properties": {
                    "table_name": {"type": "string", "description": "Name of the table to describe"}
                },
                "required": ["table_name"],
            },
        },
    },
]
# Tool dispatch
tool_map = {
    "query_warehouse": query_warehouse,
    "list_tables": list_tables,
    "describe_table": describe_table,
}
# Agent loop
messages = [
    {
        "role": "system",
        "content": "You are a data analyst. Use the available tools to explore the warehouse schema and answer questions with SQL.",
    },
    {
        "role": "user",
        "content": "What are the top 5 products by revenue?",
    },
]
# First LLM call -- model decides which tools to use
response = ollama.chat(model="llama3.1", messages=messages, tools=tools)
# Process tool calls until the model stops requesting them
while response.message.tool_calls:
    # Append the assistant message (with its tool calls) once per turn
    messages.append(response.message)
    for tool_call in response.message.tool_calls:
        fn_name = tool_call.function.name
        fn_args = tool_call.function.arguments
        result = tool_map[fn_name](**fn_args)
        messages.append({
            "role": "tool",
            "content": result,
        })
    # Next LLM call with tool results
    response = ollama.chat(model="llama3.1", messages=messages, tools=tools)
print(response.message.content)
Air-Gapped Architecture
In an air-gapped deployment, nothing leaves the local machine or private network:
┌─────────────────────────────────────────────┐
│ Local Machine │
│ │
│ Ollama (llama3.1) │
│ ↕ │
│ Python Agent Script │
│ ↕ │
│ HatiData Proxy (:5439) │
│ ↕ │
│ DuckDB (local files) │
└─────────────────────────────────────────────┘
No outbound network connections
- LLM inference: Ollama runs the model locally on CPU or GPU
- SQL execution: HatiData proxy executes queries against a local DuckDB instance
- Data storage: All data remains on local disk
- Audit logging: Queries are logged locally for compliance
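To verify the stack before starting an agent, you can run a quick preflight check that confirms both services answer on localhost. This is a minimal sketch, assuming Ollama's standard local API port 11434 and the HatiData proxy on port 5439 (the defaults used throughout this page):
import socket

def check_local(name: str, port: int) -> None:
    """Confirm a service is listening on localhost before the agent starts."""
    try:
        with socket.create_connection(("localhost", port), timeout=2):
            print(f"{name} reachable on localhost:{port}")
    except OSError as e:
        print(f"{name} NOT reachable on localhost:{port} -- {e}")

check_local("Ollama", 11434)         # Ollama's default API port
check_local("HatiData proxy", 5439)  # proxy port from the configuration above
If either check fails, nothing is sent anywhere else; both endpoints are expected to exist only on the local machine or private network.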
Loading Local Data
Import data files into your local HatiData instance:
# Import CSV
hati query "CREATE TABLE customers AS SELECT * FROM read_csv_auto('customers.csv')"
# Import Parquet
hati query "CREATE TABLE transactions AS SELECT * FROM read_parquet('transactions.parquet')"
Or load data from the Python SDK:
hati.query("""
CREATE TABLE customers AS
SELECT * FROM read_csv_auto('/path/to/customers.csv')
""")
hati.query("""
CREATE TABLE transactions AS
SELECT * FROM read_parquet('/path/to/transactions.parquet')
""")
Memory in Air-Gapped Mode
Agent memory works identically in local mode. Memories are stored in DuckDB tables with optional local embeddings:
# Store a memory
hati.store_memory(
    content="Q4 revenue was $2.1M, up 15% from Q3",
    memory_type="fact",
    importance=0.9,
)
# Search memories (metadata-based in local mode without an embedding provider)
results = hati.search_memory(query="revenue trends", top_k=5)
For vector search in air-gapped mode, configure a local embedding provider (e.g., a sentence-transformers model running alongside Ollama). Without an embedding provider, memory search falls back to metadata-based matching, which is less precise but still functional. See Agent Memory for configuration details.
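As a rough sketch of the local-embedding option, the snippet below computes vectors with a sentence-transformers model that runs entirely on the local machine (in a true air-gapped setup the model files must already be on disk). The embedding keyword arguments on store_memory and search_memory are an assumption here, not a confirmed part of the SDK; see Agent Memory for the exact configuration HatiData expects.
from sentence_transformers import SentenceTransformer

# Local embedding model; point this at a local path in a fully air-gapped environment
embedder = SentenceTransformer("all-MiniLM-L6-v2")

def embed(text: str) -> list[float]:
    """Compute an embedding vector without leaving the machine."""
    return embedder.encode(text).tolist()

# Hypothetical wiring: pass the locally computed vector alongside the memory call
hati.store_memory(
    content="Q4 revenue was $2.1M, up 15% from Q3",
    memory_type="fact",
    importance=0.9,
    embedding=embed("Q4 revenue was $2.1M, up 15% from Q3"),  # assumed parameter
)
results = hati.search_memory(
    query="revenue trends",
    top_k=5,
    embedding=embed("revenue trends"),  # assumed parameter
)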
Using with LangChain + Ollama
Combine LangChain's Ollama integration with HatiData for a richer agent experience. The ReAct prompt is defined inline so the agent never needs to reach the LangChain Hub:
from langchain_ollama import ChatOllama
from langchain.agents import AgentExecutor, create_react_agent
from langchain_core.prompts import PromptTemplate
from langchain_hatidata import HatiDataToolkit
# Local LLM
llm = ChatOllama(model="llama3.1", temperature=0)
# Local HatiData tools
toolkit = HatiDataToolkit(
    host="localhost",
    port=5439,
    agent_id="langchain-ollama",
)
tools = toolkit.get_tools()
# ReAct prompt defined inline -- no LangChain Hub pull, so nothing leaves the network
prompt = PromptTemplate.from_template(
    "Answer the question using the available tools.\n\nTools:\n{tools}\n\n"
    "Use this format:\n"
    "Question: the input question\n"
    "Thought: reason about what to do next\n"
    "Action: one of [{tool_names}]\n"
    "Action Input: the input to the action\n"
    "Observation: the action result\n"
    "... (Thought/Action/Action Input/Observation may repeat)\n"
    "Thought: I now know the final answer\n"
    "Final Answer: the answer to the question\n\n"
    "Question: {input}\nThought: {agent_scratchpad}"
)
agent = create_react_agent(llm=llm, tools=tools, prompt=prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
result = executor.invoke({"input": "Summarize the sales data"})
Next Steps
- Local Mode Setup -- Full local installation guide
- Agent Memory -- How persistent memory works
- Agent Integrations -- All supported frameworks
- SQL Compatibility -- Supported SQL syntax