Ollama Integration (Air-Gapped)

HatiData pairs with Ollama to run entirely locally, enabling fully air-gapped deployments in which no data or queries ever leave your network. The combination gives agents a SQL data warehouse and persistent memory backed by a local LLM, making it well suited to regulated industries, classified environments, and privacy-sensitive workloads.

Installation

Install Ollama

# macOS
brew install ollama

# Linux
curl -fsSL https://ollama.ai/install.sh | sh

Pull a model:

ollama pull llama3.1
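
Before wiring up the agent, confirm the server is running and the model is available (the Ollama API listens on port 11434 by default):

# List locally available models via the Ollama API
curl http://localhost:11434/api/tags

# Or run a one-off prompt against the model
ollama run llama3.1 "Say hello"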

Install HatiData

# Install the CLI
cargo install hatidata-cli

# Initialize a local instance (runs DuckDB locally, no cloud connection)
HATIDATA_DEV_MODE=true hati init

Install the Python SDK

pip install hatidata-agent
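
To confirm the installation, check the package metadata:

pip show hatidata-agent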

Configure

With both Ollama and HatiData running locally, configure the agent to connect to local endpoints:

from hatidata_agent import HatiDataAgent

hati = HatiDataAgent(
    host="localhost",
    port=5439,
    agent_id="ollama-agent",
    framework="ollama",
    database="hatidata",
)

No API key is needed in local dev mode. The proxy accepts unauthenticated connections when HATIDATA_DEV_MODE=true.
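
A quick way to verify the connection is a trivial query through the proxy; query is the same SDK method used in the examples below:

# Smoke test: should return one row, no API key required
print(hati.query("SELECT 1 AS ok"))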

Parameter   Default       Description
---------   -------       -----------
host        "localhost"   HatiData proxy hostname
port        5439          Proxy port
agent_id    "agent"       Unique identifier for audit trails
framework   "custom"      Set to "ollama" for proper tracking
database    "hatidata"    Target database

Basic Usage

Use Ollama's Python client alongside HatiData for a fully local agent:

import ollama
from hatidata_agent import HatiDataAgent

# Local HatiData connection
hati = HatiDataAgent(
    host="localhost",
    port=5439,
    agent_id="local-analyst",
    framework="ollama",
)

# Define tool functions
def query_warehouse(sql: str) -> str:
    """Execute a SQL query against the local HatiData warehouse."""
    try:
        rows = hati.query(sql)
        return str(rows) if rows else "No results."
    except Exception as e:
        return f"Error: {e}"

def list_tables() -> str:
    """List all available tables."""
    rows = hati.query(
        "SELECT table_name FROM information_schema.tables WHERE table_schema = 'main'"
    )
    return ", ".join(r["table_name"] for r in rows)

def describe_table(table_name: str) -> str:
    """Get column names and types for a table."""
    # Note: table_name comes from the model. Interpolating it into SQL is
    # acceptable for a local sandbox but should be validated in production.
    rows = hati.query(
        f"SELECT column_name, data_type FROM information_schema.columns "
        f"WHERE table_name = '{table_name}'"
    )
    return "\n".join(f"{r['column_name']} {r['data_type']}" for r in rows)

# Define tools for Ollama's tool calling
tools = [
    {
        "type": "function",
        "function": {
            "name": "query_warehouse",
            "description": "Execute a SQL query against the data warehouse",
            "parameters": {
                "type": "object",
                "properties": {
                    "sql": {"type": "string", "description": "The SQL query to execute"}
                },
                "required": ["sql"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "list_tables",
            "description": "List all available tables in the data warehouse",
            "parameters": {"type": "object", "properties": {}},
        },
    },
    {
        "type": "function",
        "function": {
            "name": "describe_table",
            "description": "Get column names and data types for a specific table",
            "parameters": {
                "type": "object",
                "properties": {
                    "table_name": {"type": "string", "description": "Name of the table to describe"}
                },
                "required": ["table_name"],
            },
        },
    },
]

# Tool dispatch
tool_map = {
    "query_warehouse": query_warehouse,
    "list_tables": list_tables,
    "describe_table": describe_table,
}
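
# Optional sanity check: every tool declared in the schema has a dispatch entry
assert {t["function"]["name"] for t in tools} == set(tool_map)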

# Agent loop
messages = [
    {
        "role": "system",
        "content": "You are a data analyst. Use the available tools to explore the warehouse schema and answer questions with SQL.",
    },
    {
        "role": "user",
        "content": "What are the top 5 products by revenue?",
    },
]

# First LLM call -- model decides which tools to use
response = ollama.chat(model="llama3.1", messages=messages, tools=tools)

# Process tool calls until the model produces a final answer
while response.message.tool_calls:
    # Record the assistant turn that requested the tool calls (once per response)
    messages.append(response.message)

    for tool_call in response.message.tool_calls:
        fn_name = tool_call.function.name
        fn_args = tool_call.function.arguments
        result = tool_map[fn_name](**fn_args)

        messages.append({
            "role": "tool",
            "content": result,
        })

    # Next LLM call with tool results
    response = ollama.chat(model="llama3.1", messages=messages, tools=tools)

print(response.message.content)
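
In a typical run the model calls list_tables and describe_table to discover the schema, then query_warehouse with a SELECT; the loop exits as soon as a response arrives with no tool calls.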

Air-Gapped Architecture

In an air-gapped deployment, nothing leaves the local machine or private network:

┌─────────────────────────────────────────────┐
│                Local Machine                │
│                                             │
│             Ollama (llama3.1)               │
│                      ↕                      │
│            Python Agent Script              │
│                      ↕                      │
│           HatiData Proxy (:5439)            │
│                      ↕                      │
│            DuckDB (local files)             │
└─────────────────────────────────────────────┘
        No outbound network connections
  • LLM inference: Ollama runs the model locally on CPU or GPU
  • SQL execution: HatiData proxy executes queries against a local DuckDB instance
  • Data storage: All data remains on local disk
  • Audit logging: Queries are logged locally for compliance
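
To keep the deployment strictly local, bind Ollama to loopback and spot-check which ports are listening. A minimal sketch, assuming lsof is available:

# Bind Ollama to loopback only so it is not reachable from the LAN
OLLAMA_HOST=127.0.0.1 ollama serve

# Both services should be listening on local interfaces only
lsof -nP -iTCP -sTCP:LISTEN | grep -E 'ollama|5439'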

Loading Local Data

Import data files into your local HatiData instance:

# Import CSV
hati query "CREATE TABLE customers AS SELECT * FROM read_csv_auto('customers.csv')"

# Import Parquet
hati query "CREATE TABLE transactions AS SELECT * FROM read_parquet('transactions.parquet')"

Or load data from the Python SDK:

hati.query("""
CREATE TABLE customers AS
SELECT * FROM read_csv_auto('/path/to/customers.csv')
""")

hati.query("""
CREATE TABLE transactions AS
SELECT * FROM read_parquet('/path/to/transactions.parquet')
""")

Memory in Air-Gapped Mode

Agent memory works identically in local mode. Memories are stored in DuckDB tables with optional local embeddings:

# Store a memory
hati.store_memory(
    content="Q4 revenue was $2.1M, up 15% from Q3",
    memory_type="fact",
    importance=0.9,
)

# Search memories (metadata-based in local mode without an embedding provider)
results = hati.search_memory(query="revenue trends", top_k=5)
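
Recalled memories can be fed straight into the local model's context. A minimal sketch, assuming search_memory returns dict-like rows with a content field:

# Build a context block from recalled memories and pass it to the local LLM
memories = hati.search_memory(query="revenue trends", top_k=5)
context = "\n".join(m["content"] for m in memories)  # assumed row shape

response = ollama.chat(
    model="llama3.1",
    messages=[
        {"role": "system", "content": f"Relevant facts from memory:\n{context}"},
        {"role": "user", "content": "How is revenue trending quarter over quarter?"},
    ],
)
print(response.message.content)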

Local Embeddings

For vector search in air-gapped mode, configure a local embedding provider (e.g., a sentence-transformers model running alongside Ollama). Without an embedding provider, memory search falls back to metadata-based matching, which is less precise but still functional. See Agent Memory for configuration details.

Using with LangChain + Ollama

Combine LangChain's Ollama integration with HatiData for a richer agent experience:

from langchain_ollama import ChatOllama
from langchain.agents import AgentExecutor, create_react_agent
from langchain_core.prompts import PromptTemplate
from langchain_hatidata import HatiDataToolkit

# Local LLM
llm = ChatOllama(model="llama3.1", temperature=0)

# Local HatiData tools
toolkit = HatiDataToolkit(
    host="localhost",
    port=5439,
    agent_id="langchain-ollama",
)
tools = toolkit.get_tools()

# Standard ReAct prompt, defined inline (hub.pull would require network access)
prompt = PromptTemplate.from_template(
    "Answer the following questions as best you can. "
    "You have access to the following tools:\n\n{tools}\n\n"
    "Use the following format:\n\n"
    "Question: the input question you must answer\n"
    "Thought: think about what to do\n"
    "Action: the action to take, one of [{tool_names}]\n"
    "Action Input: the input to the action\n"
    "Observation: the result of the action\n"
    "... (Thought/Action/Action Input/Observation can repeat)\n"
    "Thought: I now know the final answer\n"
    "Final Answer: the final answer\n\n"
    "Question: {input}\nThought:{agent_scratchpad}"
)

agent = create_react_agent(llm=llm, tools=tools, prompt=prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

result = executor.invoke({"input": "Summarize the sales data"})
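
Because both ChatOllama and the HatiData toolkit point at localhost, every step of the ReAct loop (reasoning, tool selection, and SQL execution) stays on the machine.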
