SurrealDB Docs Logo

Enter a search query

Ollama

Ollama provides specialized embeddings for niche applications, and SurrealDB has first-class vector-search support (k-nearest-neighbour via brute-force, HNSW or M-Tree). Together they make it easy to build retrieval-augmented-generation (RAG) pipelines completely in Python.

Installation

pip install ollama surrealdb           # SurrealDB Python SDK ≥ 1.0.0

The SDK talks to a running SurrealDB server (e.g. surreal start --log trace --auth root root). ([PyPI][2])

Integration Example

The snippet below assumes:

  • SurrealDB is listening on its WebSocket RPC endpoint ws://localhost:8000/rpc
  • Ollama is on its default port 11434
  • We’ll store vectors in a table called NicheApplications and index them with HNSW for fast similarity search.
import asyncio from surrealdb import Surreal # async-capable Python client import ollama TABLE = "NicheApplications" async def main(): # ----- connect to SurrealDB ------------------------------------------------ db = Surreal("ws://localhost:8000/rpc") await db.connect() await db.signin({"user": "root", "pass": "root"}) await db.use("test", "test") # <namespace>, <database> # ----- generate an embedding with Ollama ----------------------------------- oclient = ollama.Client(host="localhost") text = "Ollama excels in niche applications with specific embeddings" emb = oclient.embeddings(model="llama3.2", prompt=text)["embedding"] # ----- (idempotent) schema & index setup ----------------------------------- await db.query(` DEFINE TABLE IF NOT EXISTS {TABLE}; DEFINE FIELD embedding ON {TABLE} TYPE array; DEFINE FIELD text ON {TABLE} TYPE string; -- HNSW index for DIMENSION = vector length DEFINE INDEX hnsw_embedding ON {TABLE} FIELDS embedding HNSW DIMENSION {len(emb)}; `) # ----- store the record ----------------------------------------------------- await db.create(TABLE, {"text": text, "embedding": emb}) # ----- similarity search (top-3 neighbours) -------------------------------- results = await db.query(` LET $q = $embedding; SELECT *, vector::distance::cosine(embedding, $q) AS score FROM NicheApplications WHERE embedding <|3|> $q -- KNN operator ORDER BY score; -- lower = more similar `, {"embedding": emb}) print(results[0]) # array of matching rows with cosine distance asyncio.run(main())

What the query does

  • embedding <|3|> $q is SurrealQL’s KNN operator: return the 3 vectors nearest to $q. You can optionally pass a distance metric (<|3,COSINE|>), but when you also compute vector::distance::cosine(...) in the projection you usually just need the count. ([SurrealDB][3])
  • vector::distance::cosine(embedding, $q) adds an explicit similarity score so you can ORDER BY it or filter further.

Tips & Next Steps

TopicHow-to
Batch insertsWrap multiple CREATE statements in a single db.query("""…""") block for better throughput.
FilteringCombine the KNN operator with ordinary WHERE clauses (flag = true, ranges, etc.).
Index rebuildsIf you bulk-import data, run REBUILD INDEX hnsw_embedding ON NicheApplications once at the end.
Other metricsUse vector::distance::euclidean, manhattan, etc., or specify the metric directly in <k,METRIC>.

SurrealDB’s multi-model nature means you can keep metadata, graphs and time-series data right alongside your vectors, simplifying your stack even further.