Modern open-source RAG pipelines need two things: high-quality embeddings and fast vector search. Mistral AI's embedding model (mistral-embed) returns 1,024-dimensional float vectors that rival OpenAI and Cohere, and SurrealDB provides an in-memory HNSW index queried through the <|K,EF|> operator in SurrealQL. Below you'll wire them together, from install → ingestion → search → production-ready script.
pip install mistralai surrealdb
Set two environment variables (or hard-code them if you must):
export SDB_URL="http://localhost:8000/rpc"   # ← SurrealDB RPC endpoint
export MISTRAL_API_KEY="sk-…"                # ← your Mistral key
from mistralai.client import MistralClient
from surrealdb import Surreal
import os, asyncio

# ----- 1.1 · Config -----------------------------------------------------------------
SDB_URL  = os.getenv("SDB_URL", "http://localhost:8000/rpc")
SDB_USER = os.getenv("SDB_USER", "root")
SDB_PASS = os.getenv("SDB_PASS", "secret")
NS, DB   = "demo", "demo"
TABLE    = "mistral_docs"
MODEL    = "mistral-embed"

# ----- 1.2 · Clients ----------------------------------------------------------------
sdb   = Surreal(SDB_URL)
mistr = MistralClient(api_key=os.environ["MISTRAL_API_KEY"])

async def init_db():
    await sdb.signin({"user": SDB_USER, "pass": SDB_PASS})
    await sdb.use(NS, DB)

    # one quick embedding → get the true vector dimension
    dim = len(mistr.embeddings(model=MODEL, input=["ping"]).data[0].embedding)

    # table names cannot be bound as parameters in DEFINE statements,
    # so interpolate the table name and dimension into the query string
    schema = f"""
    DEFINE TABLE {TABLE} SCHEMALESS PERMISSIONS NONE;
    DEFINE FIELD text ON {TABLE} TYPE string;
    DEFINE FIELD embedding ON {TABLE} TYPE array;
    DEFINE INDEX hnsw_idx ON {TABLE} FIELDS embedding HNSW DIMENSION {dim} DIST COSINE;
    """
    await sdb.query(schema)

asyncio.run(init_db())
Future-proofing: if Mistral introduces a small or large Embed model with a different dimension, the code auto-adapts.
DOCS = [
    "SurrealDB offers an in-memory HNSW vector index for low-latency search.",
    "Mistral-Embed produces 1 024-dimensional embeddings.",
    "You can build a completely open-source RAG stack with these two tools.",
]

async def insert_docs(docs, batch=64):
    rows = []
    for i in range(0, len(docs), batch):
        chunk = docs[i : i + batch]
        vecs = mistr.embeddings(model=MODEL, input=chunk).data
        rows += [
            {
                "id": f"{TABLE}:{i+j}",
                "text": chunk[j],
                "embedding": vec.embedding,
            }
            for j, vec in enumerate(vecs)
        ]
    await sdb.query(f"INSERT INTO {TABLE} $data", {"data": rows})

asyncio.run(insert_docs(DOCS))
Why bulk-insert? One SurrealQL call → one network round-trip — much faster than inserting row-by-row.
async def search(query: str, k: int = 3, ef: int = 64):
    q_vec = mistr.embeddings(model=MODEL, input=[query]).data[0].embedding

    surql = f"""
    SELECT id, text, vector::distance::knn() AS score
    FROM {TABLE}
    WHERE embedding <|{k},{ef}|> $vec
    ORDER BY score;
    """
    res = await sdb.query(surql, {"vec": q_vec})
    return res[0].result

hits = asyncio.run(search("Which database supports native vector search?"))
for h in hits:
    print(f"⭐ {h['text']} (score={h['score']:.4f})")
<|K,EF|> activates the HNSW K-nearest-neighbour operator (here K=3, efSearch=64). vector::distance::knn() exposes the cosine distance already computed inside the index, so no post-processing is needed.
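Both numbers are tunable. As a quick illustration (using the same mistral_docs table and $vec parameter as above; the exact values here are just examples), raising K and efSearch looks like this in raw SurrealQL:

SELECT id, text, vector::distance::knn() AS score
FROM mistral_docs
WHERE embedding <|5,128|> $vec
ORDER BY score;

A larger efSearch widens the candidate list the HNSW graph explores during the search, which typically improves recall at the cost of a slower query.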
# mistral_surreal_demo.py
from __future__ import annotations
import os, asyncio
from mistralai.client import MistralClient
from surrealdb import Surreal

SDB_URL  = os.getenv("SDB_URL", "http://localhost:8000/rpc")
SDB_USER = os.getenv("SDB_USER", "root")
SDB_PASS = os.getenv("SDB_PASS", "secret")
NS, DB, TABLE = "demo", "demo", "mistral_docs"
MODEL = "mistral-embed"
KEY   = os.environ["MISTRAL_API_KEY"]  # export first!

sdb, mistr = Surreal(SDB_URL), MistralClient(api_key=KEY)

DOCS = [
    "SurrealDB's vector index is built on HNSW.",
    "Mistral-Embed vectors offer strong semantic quality.",
    "Together they form a fast, open-source search stack.",
]

async def main():
    await sdb.signin({"user": SDB_USER, "pass": SDB_PASS})
    await sdb.use(NS, DB)

    dim = len(mistr.embeddings(model=MODEL, input=["x"]).data[0].embedding)
    await sdb.query(f"""
        DEFINE TABLE {TABLE} SCHEMALESS PERMISSIONS NONE;
        DEFINE FIELD text ON {TABLE} TYPE string;
        DEFINE FIELD embedding ON {TABLE} TYPE array;
        DEFINE INDEX hnsw_idx ON {TABLE} FIELDS embedding HNSW DIMENSION {dim} DIST COSINE;
    """)

    # ingest if empty
    existing = await sdb.query(f"SELECT count() FROM {TABLE} GROUP ALL;")
    if not existing[0].result:
        vecs = mistr.embeddings(model=MODEL, input=DOCS).data
        rows = [
            {"id": f"{TABLE}:{i}", "text": DOCS[i], "embedding": v.embedding}
            for i, v in enumerate(vecs)
        ]
        await sdb.query(f"INSERT INTO {TABLE} $data", {"data": rows})

    # search
    q_vec = mistr.embeddings(
        model=MODEL, input=["open-source vector database"]
    ).data[0].embedding
    res = await sdb.query(f"""
        SELECT text, vector::distance::knn() AS score
        FROM {TABLE}
        WHERE embedding <|3,64|> $vec
        ORDER BY score;
    """, {"vec": q_vec})
    print(res[0].result)

if __name__ == "__main__":
    asyncio.run(main())
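Assuming a local SurrealDB instance listening on port 8000 with the root credentials used in the config above (the flags below are just one way to start it), a typical run looks like this, with the server in one terminal and the script in another:

# terminal 1 – start an in-memory SurrealDB instance
surreal start --user root --pass secret --bind 0.0.0.0:8000 memory

# terminal 2 – export the key and run the demo
export MISTRAL_API_KEY="sk-…"
python mistral_surreal_demo.py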
SurrealDB currently stores vectors as float32 / float64 arrays and does not ship built-in binary or int8 quantisation, so keep memory requirements in mind when storing very large collections of vectors.
You now have a clean, fully-async SurrealDB setup that stores Mistral-Embed vectors, supports fast HNSW search, and can be dropped into any RAG or semantic-search workflow.
The same stack can be built in Rust. Create a new Cargo project with cargo new project_name and go into the project folder, then add the following dependencies inside Cargo.toml:
anyhow = "1.0.98"
mistralai-client = "0.14.0"
serde = "1.0.219"
surrealdb = { version = "2.3", features = ["kv-mem"] }
tokio = "1.45.0"
You can add the same dependencies on the command line through a single command:
cargo add anyhow mistralai-client serde tokio surrealdb --features surrealdb/kv-mem
Connect to a database using “memory” for an embedded instance:
use anyhow::Error;
use surrealdb::engine::any::connect;

#[tokio::main]
async fn main() -> Result<(), Error> {
    let db = connect("memory").await?;
    Ok(())
}
Or another address if accessing a Cloud or local instance, such as:
// Cloud address
let db = connect("wss://myinstance-06a4h41t12rtj7lsg45m3prm1k.aws-use1.surreal.cloud").await?;

// Local address
let db = connect("ws://localhost:8000").await?;
Then select a namespace and database name.
db.use_ns("ns").use_db("db").await?;
Create a table called document to store documents and embeddings. The HNSW index is one way to maintain performance if the dataset becomes quite large; otherwise, it can be left out. An MTREE index can also be used instead (see the variant shown after the next snippet).
DEFINE TABLE document;
DEFINE FIELD text ON document TYPE string;
DEFINE FIELD embedding ON document TYPE array<float>;
DEFINE INDEX hnsw_embed ON document FIELDS embedding HNSW DIMENSION 1024 DIST COSINE;
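If you would rather use the MTREE index mentioned above, the definition is analogous; this is a sketch assuming the same dimension and cosine distance:

DEFINE INDEX mtree_embed ON document FIELDS embedding MTREE DIMENSION 1024 DIST COSINE;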
The size of the vector (1024 here) represents the number of dimensions in the embedding. This matches Mistral AI's mistral-embed model, which produces vectors of length 1024.
These statements can all be put inside a single .query() call in the Rust SDK, followed by a line to check for any errors.
let mut res = db
    .query(
        "DEFINE TABLE document;
         DEFINE FIELD text ON document TYPE string;
         DEFINE FIELD embedding ON document TYPE array<float>;
         DEFINE INDEX hnsw_embed ON document FIELDS embedding HNSW DIMENSION 1024 DIST COSINE;",
    )
    .await?;
for (index, error) in res.take_errors() {
    println!("Error in query {index}: {error}");
}
At this point, you will need a key to interact with Mistral AI's platform. They offer a free tier for experimentation; once signed up, you can create a key and use it in the code below.
The best way to set the key is as an environment variable, which we will read into a static called KEY. The client will look for one called MISTRAL_API_KEY, though you can change this when setting up the Mistral AI Rust client if you like.
// Looks for MISTRAL_API_KEY
let client = Client::new(Some(KEY.to_string()), None, None, None)?;

// Looks for OTHER_ENV_VAR
let client = Client::new(Some(KEY.to_string()), Some("OTHER_ENV_VAR".to_string()), None, None)?;
Using a LazyLock will let us read it via the std::env::var() function the first time it is accessed. You can of course simply put the value into a const for simplicity when first testing, but always remember to never hard-code API keys in production code.
static KEY: LazyLock<String> = LazyLock::new(|| std::env::var("MISTRAL_API_KEY").unwrap());
And then run the code like this:
MISTRAL_API_KEY=whateverthekeyis cargo run
Or like this if you are using PowerShell on Windows:
$env:MISTRAL_API_KEY = "whateverthekeyis"
cargo run
We will also create a const MODEL to hold the Mistral AI model used, which in this case is EmbedModel::MistralEmbed.
const MODEL: EmbedModel = EmbedModel::MistralEmbed;
Inside main(), create a client from the mistralai-client crate.
let client = Client::new(Some(KEY.to_string()), None, None, None)?;
The client can be used to generate a Mistral AI embedding using the mistral-embed model. Since SurrealDB uses the tokio runtime, the async .embeddings_async() method will be used.
let input = vec!["Joram is the main character in the Darksword Trilogy.".to_string()];
let result = client.embeddings_async(MODEL, input, None).await?;
println!("{:?}", result);
The output in your console should include an embedding 1024 floats in length.
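If you want a quick sanity check that the returned vector matches the DIMENSION 1024 declared in the index, a line like this (a sketch placed right after the call above) works:

// Sanity check: the embedding length must match the index DIMENSION
assert_eq!(result.data[0].embedding.len(), 1024);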
The embeddings returned from Mistral AI can now be stored in the database. The response returned from the mistralai-client crate looks like this, with a Vec of EmbeddingResponseDataItem structs that each hold a Vec<f32>.
pub struct EmbeddingResponse {
    pub id: String,
    pub object: String,
    pub model: EmbedModel,
    pub data: Vec<EmbeddingResponseDataItem>,
    pub usage: ResponseUsage,
}

pub struct EmbeddingResponseDataItem {
    pub index: u32,
    pub embedding: Vec<f32>,
    pub object: String,
}
Using .remove(0) will allow us to get the raw embeddings here. In a more complex response you might opt for a match on .get(0) to handle any possible errors; a sketch of that variant follows the next line.
let embeds = result.data.remove(0).embedding;
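For reference, a more defensive version of that line using .get(0), so an empty data Vec becomes an error instead of a panic (a sketch, assuming the surrounding function returns an anyhow Result as in this tutorial):

let embeds = match result.data.get(0) {
    // Clone the embedding out of the response instead of removing it
    Some(item) => item.embedding.clone(),
    None => return Err(anyhow::anyhow!("Mistral AI returned no embeddings")),
};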
There are a number of ways to work with or avoid structs when using the Rust SDK. Here we will create two: one to represent the input to a .create() statement, which will implement Serialize, and another that implements Deserialize to show the result.
#[derive(Serialize)]
struct DocumentInput {
    text: String,
    embedding: Vec<f32>,
}

#[derive(Debug, Deserialize)]
struct Document {
    id: RecordId,
    embedding: Vec<f32>,
    text: String,
}
This can be tested by printing out the created document as a Document struct.
let input = "Octopuses solve puzzles and escape enclosures, showing advanced intelligence.";
let mut result = client
    .embeddings_async(MODEL, vec![input.to_string()], None)
    .await?;
let embeds = result.data.remove(0).embedding;
let in_db = db
    .create::<Option<Document>>("document")
    .content(DocumentInput {
        text: input.into(),
        embedding: embeds.to_vec(),
    })
    .await?;
println!("{in_db:?}");
We will now move the logic to create the embeddings into a function of its own. Since the embeddings_async() method takes a single Vec<String>, we'll first clone it to keep the original Vec<String>, then zip it together with the embeddings returned so that they can be put into the database along with the original input.
async fn create_embeds(
    input: Vec<String>,
    db: &Surreal<Any>,
    client: &Client,
) -> Result<(), Error> {
    let cloned = input.clone();
    let embeds = client.embeddings_async(MODEL, input, None).await?;
    let zipped = cloned
        .into_iter()
        .zip(embeds.data.into_iter().map(|item| item.embedding));

    for (text, embeds) in zipped {
        let _in_db = db
            .create::<Option<Document>>("document")
            .content(DocumentInput {
                text,
                embedding: embeds,
            })
            .await?;
    }
    Ok(())
}
Then we’ll create four facts for each of four topics: sea creatures, Korean and Japanese cities, historical figures, and planets of the Solar System.
let embeds = [
    "Octopuses solve puzzles and escape enclosures, showing advanced intelligence.",
    "Sharks exhibit learning behavior, but their intelligence is instinct-driven.",
    "Sea cucumbers lack a brain and show minimal cognitive response.",
    "Clams have simple nervous systems with no known intelligent behavior.",
    //
    "Seoul is South Korea’s capital and a global tech hub.",
    "Sejong is South Korea’s planned administrative capital.",
    "Busan is a major South Korean port located in the far southeast.",
    "Tokyo is Japan’s capital, known for innovation and dense population.",
    //
    "Wilhelm II was Germany’s last Kaiser before World War I.",
    "Cyrus the Great founded the Persian Empire with tolerant rule.",
    "Napoleon Bonaparte was a French emperor and brilliant military strategist.",
    "Aristotle was a Greek philosopher who shaped Western intellectual thought.",
    //
    "Venus’s atmosphere ranges from scorching surface to Earth-like upper clouds.",
    "Mars has a thin, cold atmosphere with seasonal dust storms.",
    "Ceres has a tenuous exosphere with sporadic water vapor traces.",
    "Saturn’s atmosphere spans cold outer layers to a deep metallic hydrogen interior",
]
.into_iter()
.map(|s| s.to_string())
.collect::<Vec<String>>();

create_embeds(embeds, &db, &client).await?;
Finally, let's perform semantic search over the embeddings in our database. We'll start with this query, which uses the brute-force KNN operator to return the four closest matches to an embedding.
SELECT text, vector::distance::knn() AS distance FROM document WHERE embedding <|4,COSINE|> $embeds ORDER BY distance;
To use the HNSW index instead, change the second argument of the KNN operator from a distance function to a number, as in <|4,40|>. The 40 here represents the size of the dynamic candidate list used during the search.
SELECT text, vector::distance::knn() AS distance FROM document WHERE embedding <|4,40|> $embeds ORDER BY distance;
You can customise the brute-force form with other distance functions such as Euclidean, Hamming, and so on.
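For example, a query ranked by Euclidean distance instead would look like this (a sketch; the table and field names are the same as above):

SELECT text, vector::distance::knn() AS distance FROM document WHERE embedding <|4,EUCLIDEAN|> $embeds ORDER BY distance;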
We will then put this into a separate function called ask_question(), which first prints out its input and then uses the embedding retrieved from Mistral AI to query the database against existing documents.
async fn ask_question(input: &str, db: &Surreal<Any>, client: &Client) -> Result<(), Error> {
    println!("{input}");
    let embeds = client
        .embeddings_async(MODEL, vec![input.to_string()], None)
        .await?
        .data
        .remove(0)
        .embedding;

    let mut response = db
        .query("SELECT text, vector::distance::knn() AS distance FROM document WHERE embedding <|4,COSINE|> $embeds ORDER BY distance;")
        .bind(("embeds", embeds))
        .await?;
    let as_val: Value = response.take(0)?;
    println!("{as_val}\n");
    Ok(())
}
This function can now be called inside main() to confirm that the results match our expectations.
ask_question("Which Korean city is just across the sea from Japan?", &db, &client).await?;
ask_question("Who was Germany's last Kaiser?", &db, &client).await?;
ask_question("Which sea animal is most intelligent?", &db, &client).await?;
ask_question("Which planet's atmosphere has a part with the same temperature as Earth?", &db, &client).await?;
Which Korean city is just across the sea from Japan?
[
  { distance: 0.19170371029549582f, text: 'Busan is a major South Korean port located in the far southeast.' },
  { distance: 0.2399314515762122f, text: 'Tokyo is Japan’s capital, known for innovation and dense population.' },
  { distance: 0.2443623703771407f, text: 'Sejong is South Korea’s planned administrative capital.' },
  { distance: 0.24488082839731895f, text: 'Seoul is South Korea’s capital and a global tech hub.' }
]

Who was Germany's last Kaiser?
[
  { distance: 0.11228576780228805f, text: 'Wilhelm II was Germany’s last Kaiser before World War I.' },
  { distance: 0.2957177300085634f, text: 'Napoleon Bonaparte was a French emperor and brilliant military strategist.' },
  { distance: 0.34394473621670896f, text: 'Cyrus the Great founded the Persian Empire with tolerant rule.' },
  { distance: 0.34911517400935843f, text: 'Sejong is South Korea’s planned administrative capital.' }
]

Which sea animal is most intelligent?
[
  { distance: 0.2342596053829904f, text: 'Octopuses solve puzzles and escape enclosures, showing advanced intelligence.' },
  { distance: 0.24131327939924785f, text: 'Sharks exhibit learning behavior, but their intelligence is instinct-driven.' },
  { distance: 0.2426242772516931f, text: 'Clams have simple nervous systems with no known intelligent behavior.' },
  { distance: 0.24474598154128135f, text: 'Sea cucumbers lack a brain and show minimal cognitive response.' }
]

Which planet's atmosphere has a part with the same temperature as Earth?
[
  { distance: 0.20653440713083582f, text: 'Venus’s atmosphere ranges from scorching surface to Earth-like upper clouds.' },
  { distance: 0.23354208810464594f, text: 'Mars has a thin, cold atmosphere with seasonal dust storms.' },
  { distance: 0.24560810032473468f, text: 'Saturn’s atmosphere spans cold outer layers to a deep metallic hydrogen interior' },
  { distance: 0.2761595357544341f, text: 'Ceres has a tenuous exosphere with sporadic water vapor traces.' }
]
Here is the entire code:
use std::sync::LazyLock;

use anyhow::Error;
use mistralai_client::v1::{client::Client, constants::EmbedModel};
use serde::{Deserialize, Serialize};
use surrealdb::{
    RecordId, Surreal, Value,
    engine::any::{Any, connect},
};

static KEY: LazyLock<String> = LazyLock::new(|| std::env::var("MISTRAL_API_KEY").unwrap());

const MODEL: EmbedModel = EmbedModel::MistralEmbed;

#[derive(Serialize)]
struct DocumentInput {
    text: String,
    embedding: Vec<f32>,
}

#[derive(Debug, Deserialize)]
struct Document {
    id: RecordId,
    embedding: Vec<f32>,
    text: String,
}

async fn create_embeds(
    input: Vec<String>,
    db: &Surreal<Any>,
    client: &Client,
) -> Result<(), Error> {
    let cloned = input.clone();
    let embeds = client.embeddings_async(MODEL, input, None).await?;
    let zipped = cloned
        .into_iter()
        .zip(embeds.data.into_iter().map(|item| item.embedding));

    for (text, embeds) in zipped {
        let _in_db = db
            .create::<Option<Document>>("document")
            .content(DocumentInput {
                text,
                embedding: embeds,
            })
            .await?;
    }
    Ok(())
}

async fn ask_question(input: &str, db: &Surreal<Any>, client: &Client) -> Result<(), Error> {
    println!("{input}");
    let embeds = client
        .embeddings_async(MODEL, vec![input.to_string()], None)
        .await?
        .data
        .remove(0)
        .embedding;

    let mut response = db
        .query("SELECT text, vector::distance::knn() AS distance FROM document WHERE embedding <|4,COSINE|> $embeds ORDER BY distance;")
        .bind(("embeds", embeds))
        .await?;
    let as_val: Value = response.take(0)?;
    println!("{as_val}\n");
    Ok(())
}

#[tokio::main]
async fn main() -> Result<(), Error> {
    let db = connect("memory").await?;
    db.use_ns("ns").use_db("db").await?;

    let mut res = db
        .query(
            "DEFINE TABLE document;
             DEFINE FIELD text ON document TYPE string;
             DEFINE FIELD embedding ON document TYPE array<float>;
             DEFINE INDEX hnsw_embed ON document FIELDS embedding HNSW DIMENSION 1024 DIST COSINE;",
        )
        .await?;
    for (index, error) in res.take_errors() {
        println!("Error in query {index}: {error}");
    }

    let client = Client::new(Some(KEY.to_string()), None, None, None)?;

    let embeds = [
        "Octopuses solve puzzles and escape enclosures, showing advanced intelligence.",
        "Sharks exhibit learning behavior, but their intelligence is instinct-driven.",
        "Sea cucumbers lack a brain and show minimal cognitive response.",
        "Clams have simple nervous systems with no known intelligent behavior.",
        //
        "Seoul is South Korea’s capital and a global tech hub.",
        "Sejong is South Korea’s planned administrative capital.",
        "Busan is a major South Korean port located in the far southeast.",
        "Tokyo is Japan’s capital, known for innovation and dense population.",
        //
        "Wilhelm II was Germany’s last Kaiser before World War I.",
        "Cyrus the Great founded the Persian Empire with tolerant rule.",
        "Napoleon Bonaparte was a French emperor and brilliant military strategist.",
        "Aristotle was a Greek philosopher who shaped Western intellectual thought.",
        //
        "Venus’s atmosphere ranges from scorching surface to Earth-like upper clouds.",
        "Mars has a thin, cold atmosphere with seasonal dust storms.",
        "Ceres has a tenuous exosphere with sporadic water vapor traces.",
        "Saturn’s atmosphere spans cold outer layers to a deep metallic hydrogen interior",
    ]
    .into_iter()
    .map(|s| s.to_string())
    .collect::<Vec<String>>();

    create_embeds(embeds, &db, &client).await?;

    ask_question("Which Korean city is just across the sea from Japan?", &db, &client).await?;
    ask_question("Who was Germany's last Kaiser?", &db, &client).await?;
    ask_question("Which sea animal is most intelligent?", &db, &client).await?;
    ask_question("Which planet's atmosphere has a part with the same temperature as Earth?", &db, &client).await?;

    Ok(())
}