Modern open-source RAG pipelines need two things: high-quality embeddings and fast vector search. Mistral AI's embedding model (mistral-embed) returns 1,024-dimensional float vectors that rival OpenAI and Cohere, and SurrealDB provides an in-memory HNSW index queried through the <|K,EF|> operator in SurrealQL. Below you'll wire them together, from install → ingestion → search → production-ready script.
pip install mistralai surrealdb
Set two environment variables (or hard-code them if you must):
export SDB_URL="http://localhost:8000/rpc"   # ← SurrealDB RPC endpoint
export MISTRAL_API_KEY="sk-…"                # ← your Mistral key
from mistralai.client import MistralClient
from surrealdb import Surreal
import os, asyncio

# ----- 1.1 · Config -----------------------------------------------------------------
SDB_URL  = os.getenv("SDB_URL", "http://localhost:8000/rpc")
SDB_USER = os.getenv("SDB_USER", "root")
SDB_PASS = os.getenv("SDB_PASS", "secret")
NS, DB   = "demo", "demo"
TABLE    = "mistral_docs"
MODEL    = "mistral-embed"

# ----- 1.2 · Clients ----------------------------------------------------------------
sdb   = Surreal(SDB_URL)
mistr = MistralClient(api_key=os.environ["MISTRAL_API_KEY"])

async def init_db():
    await sdb.signin({"user": SDB_USER, "pass": SDB_PASS})
    await sdb.use(NS, DB)

    # one quick embedding → get the true vector dimension
    dim = len(mistr.embeddings(model=MODEL, input=["ping"]).data[0].embedding)

    # table names cannot be bound as parameters in DEFINE statements,
    # so interpolate the table name and dimension into the query string
    schema = f"""
    DEFINE TABLE {TABLE} SCHEMALESS PERMISSIONS NONE;
    DEFINE FIELD text ON {TABLE} TYPE string;
    DEFINE FIELD embedding ON {TABLE} TYPE array;
    DEFINE INDEX hnsw_idx ON {TABLE} FIELDS embedding HNSW DIMENSION {dim} DIST COSINE;
    """
    await sdb.query(schema)

asyncio.run(init_db())
Future-proofing: if Mistral introduces a small or large Embed model with a different dimension, the code auto-adapts.
DOCS = [
    "SurrealDB offers an in-memory HNSW vector index for low-latency search.",
    "Mistral-Embed produces 1 024-dimensional embeddings.",
    "You can build a completely open-source RAG stack with these two tools.",
]

async def insert_docs(docs, batch=64):
    rows = []
    for i in range(0, len(docs), batch):
        chunk = docs[i : i + batch]
        vecs = mistr.embeddings(model=MODEL, input=chunk).data
        rows += [
            {
                "id": f"{TABLE}:{i+j}",
                "text": chunk[j],
                "embedding": vec.embedding,
            }
            for j, vec in enumerate(vecs)
        ]
    await sdb.query(f"INSERT INTO {TABLE} $data", {"data": rows})

asyncio.run(insert_docs(DOCS))
Why bulk-insert? One SurrealQL call → one network round-trip — much faster than inserting row-by-row.
async def search(query: str, k: int = 3, ef: int = 64):
    q_vec = mistr.embeddings(model=MODEL, input=[query]).data[0].embedding

    surql = f"""
    SELECT id, text, vector::distance::knn() AS score
    FROM {TABLE}
    WHERE embedding <|{k},{ef}|> $vec
    ORDER BY score;
    """
    res = await sdb.query(surql, {"vec": q_vec})
    return res[0].result

hits = asyncio.run(search("Which database supports native vector search?"))
for h in hits:
    print(f"⭐ {h['text']} (score={h['score']:.4f})")
<|K,EF|> activates the HNSW K-nearest-neighbour operator (here K=3, efSearch=64). vector::distance::knn() exposes the cosine distance already computed inside the index, so no post-processing is needed.
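Both numbers are tunable. As a quick illustration (using the same mistral_docs table and $vec parameter as above; the exact values here are just examples), raising K and efSearch looks like this in raw SurrealQL:

SELECT id, text, vector::distance::knn() AS score
FROM mistral_docs
WHERE embedding <|5,128|> $vec
ORDER BY score;

A larger efSearch widens the candidate list the HNSW graph explores during the search, which typically improves recall at the cost of a slower query.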
# mistral_surreal_demo.py
from __future__ import annotations
import os, asyncio
from mistralai.client import MistralClient
from surrealdb import Surreal

SDB_URL  = os.getenv("SDB_URL", "http://localhost:8000/rpc")
SDB_USER = os.getenv("SDB_USER", "root")
SDB_PASS = os.getenv("SDB_PASS", "secret")
NS, DB, TABLE = "demo", "demo", "mistral_docs"
MODEL = "mistral-embed"
KEY   = os.environ["MISTRAL_API_KEY"]  # export first!

sdb, mistr = Surreal(SDB_URL), MistralClient(api_key=KEY)

DOCS = [
    "SurrealDB's vector index is built on HNSW.",
    "Mistral-Embed vectors offer strong semantic quality.",
    "Together they form a fast, open-source search stack.",
]

async def main():
    await sdb.signin({"user": SDB_USER, "pass": SDB_PASS})
    await sdb.use(NS, DB)

    dim = len(mistr.embeddings(model=MODEL, input=["x"]).data[0].embedding)
    await sdb.query(f"""
        DEFINE TABLE {TABLE} SCHEMALESS PERMISSIONS NONE;
        DEFINE FIELD text ON {TABLE} TYPE string;
        DEFINE FIELD embedding ON {TABLE} TYPE array;
        DEFINE INDEX hnsw_idx ON {TABLE} FIELDS embedding HNSW DIMENSION {dim} DIST COSINE;
    """)

    # ingest if empty
    existing = await sdb.query(f"SELECT count() FROM {TABLE} GROUP ALL;")
    if not existing[0].result:
        vecs = mistr.embeddings(model=MODEL, input=DOCS).data
        rows = [
            {"id": f"{TABLE}:{i}", "text": DOCS[i], "embedding": v.embedding}
            for i, v in enumerate(vecs)
        ]
        await sdb.query(f"INSERT INTO {TABLE} $data", {"data": rows})

    # search
    q_vec = mistr.embeddings(
        model=MODEL, input=["open-source vector database"]
    ).data[0].embedding
    res = await sdb.query(f"""
        SELECT text, vector::distance::knn() AS score
        FROM {TABLE}
        WHERE embedding <|3,64|> $vec
        ORDER BY score;
    """, {"vec": q_vec})
    print(res[0].result)

if __name__ == "__main__":
    asyncio.run(main())
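Assuming a local SurrealDB instance listening on port 8000 with the root credentials used in the config above (the flags below are just one way to start it), a typical run looks like this, with the server in one terminal and the script in another:

# terminal 1 – start an in-memory SurrealDB instance
surreal start --user root --pass secret --bind 0.0.0.0:8000 memory

# terminal 2 – export the key and run the demo
export MISTRAL_API_KEY="sk-…"
python mistral_surreal_demo.py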
SurrealDB currently stores vectors as float32 / float64 arrays and does not ship built-in binary or int8 quantisation, so keep memory requirements in mind when storing very large collections of vectors.
You now have a clean, fully-async SurrealDB setup that stores Mistral-Embed vectors, supports fast HNSW search, and can be dropped into any RAG or semantic-search workflow.
The same stack can be built in Rust. Create a new Cargo project with cargo new project_name and go into the project folder, then add the following dependencies inside Cargo.toml:
anyhow = "1.0.98"
mistralai-client = "0.14.0"
serde = "1.0.219"
surrealdb = { version = "2.3", features = ["kv-mem"] }
tokio = "1.45.0"
You can add the same dependencies on the command line through a single command:
cargo add anyhow mistralai-client serde tokio surrealdb --features surrealdb/kv-mem
Connect to a database using “memory” for an embedded instance:
use anyhow::Error;
use surrealdb::engine::any::connect;

#[tokio::main]
async fn main() -> Result<(), Error> {
    let db = connect("memory").await?;
    Ok(())
}
Or another address if accessing a Cloud or local instance, such as:
// Cloud address
let db = connect("wss://myinstance-06a4h41t12rtj7lsg45m3prm1k.aws-use1.surreal.cloud").await?;

// Local address
let db = connect("ws://localhost:8000").await?;
Then select a namespace and database name.
db.use_ns("ns").use_db("db").await?;
Create a table called document to store documents and embeddings. The HNSW index is one way to maintain performance if the dataset becomes quite large; otherwise, it can be left out. An MTREE index can also be used instead (see the variant shown after the next snippet).
DEFINE TABLE document;
DEFINE FIELD text ON document TYPE string;
DEFINE FIELD embedding ON document TYPE array<float>;
DEFINE INDEX hnsw_embed ON document FIELDS embedding HNSW DIMENSION 1024 DIST COSINE;
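If you would rather use the MTREE index mentioned above, the definition is analogous; this is a sketch assuming the same dimension and cosine distance:

DEFINE INDEX mtree_embed ON document FIELDS embedding MTREE DIMENSION 1024 DIST COSINE;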
The size of the vector (1024 here) represents the number of dimensions in the embedding. This matches Mistral AI's mistral-embed model, which produces vectors of length 1024.
These statements can all be put inside a single .query() call in the Rust SDK, followed by a line to check for any errors.
let mut res = db
    .query(
        "DEFINE TABLE document;
         DEFINE FIELD text ON document TYPE string;
         DEFINE FIELD embedding ON document TYPE array<float>;
         DEFINE INDEX hnsw_embed ON document FIELDS embedding HNSW DIMENSION 1024 DIST COSINE;",
    )
    .await?;
for (index, error) in res.take_errors() {
    println!("Error in query {index}: {error}");
}
At this point, you will need a key to interact with Mistral AI's platform. They offer a free tier for experimentation; once signed up, you can create a key and use it in the code below.
The best way to set the key is as an environment variable, which we will read into a static called KEY. The client will look for one called MISTRAL_API_KEY, though you can change this when setting up the Mistral AI Rust client if you like.
// Looks for MISTRAL_API_KEY
let client = Client::new(Some(KEY.to_string()), None, None, None)?;

// Looks for OTHER_ENV_VAR
let client = Client::new(Some(KEY.to_string()), Some("OTHER_ENV_VAR".to_string()), None, None)?;
Using a LazyLock will let us read it via the std::env::var() function the first time it is accessed. You can of course simply put the value into a const for simplicity when first testing, but always remember to never hard-code API keys in production code.
static KEY: LazyLock<String> = LazyLock::new(|| std::env::var("MISTRAL_API_KEY").unwrap());
And then run the code like this:
MISTRAL_API_KEY=whateverthekeyis cargo run
Or like this if you are using PowerShell on Windows:
$env:MISTRAL_API_KEY = "whateverthekeyis"
cargo run
We will also create a const MODEL to hold the Mistral AI model used, which in this case is EmbedModel::MistralEmbed.
const MODEL: EmbedModel = EmbedModel::MistralEmbed;
Inside main(), create a client from the mistralai-client crate.
let client = Client::new(Some(KEY.to_string()), None, None, None)?;
The client can be used to generate a Mistral AI embedding using the mistral-embed model. Since SurrealDB uses the tokio runtime, the async .embeddings_async() method will be used.
let input = vec!["Joram is the main character in the Darksword Trilogy.".to_string()];
let result = client.embeddings_async(MODEL, input, None).await?;
println!("{:?}", result);
The output in your console should include an embedding 1024 floats in length.
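If you want a quick sanity check that the returned vector matches the DIMENSION 1024 declared in the index, a line like this (a sketch placed right after the call above) works:

// Sanity check: the embedding length must match the index DIMENSION
assert_eq!(result.data[0].embedding.len(), 1024);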
The embeddings returned from Mistral AI can now be stored in the database. The response returned from the mistralai-client crate looks like this, with a Vec of EmbeddingResponseDataItem structs that each hold a Vec<f32>.
pub struct EmbeddingResponse {
    pub id: String,
    pub object: String,
    pub model: EmbedModel,
    pub data: Vec<EmbeddingResponseDataItem>,
    pub usage: ResponseUsage,
}

pub struct EmbeddingResponseDataItem {
    pub index: u32,
    pub embedding: Vec<f32>,
    pub object: String,
}
Using .remove(0) will allow us to get the raw embeddings here. In a more complex response you might opt for a match on .get(0) to handle any possible errors; a sketch of that variant follows the next line.
let embeds = result.data.remove(0).embedding;
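For reference, a more defensive version of that line using .get(0), so an empty data Vec becomes an error instead of a panic (a sketch, assuming the surrounding function returns an anyhow Result as in this tutorial):

let embeds = match result.data.get(0) {
    // Clone the embedding out of the response instead of removing it
    Some(item) => item.embedding.clone(),
    None => return Err(anyhow::anyhow!("Mistral AI returned no embeddings")),
};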
There are a number of ways to work with or avoid structs when using the Rust SDK. Here we will create two: one to represent the input to a .create() statement, which will implement Serialize, and another that implements Deserialize to show the result.
#[derive(Serialize)]
struct DocumentInput {
    text: String,
    embedding: Vec<f32>,
}

#[derive(Debug, Deserialize)]
struct Document {
    id: RecordId,
    embedding: Vec<f32>,
    text: String,
}
This can be tested by printing out the created document as a Document struct.
let input = "Octopuses solve puzzles and escape enclosures, showing advanced intelligence.";
let mut result = client
    .embeddings_async(MODEL, vec![input.to_string()], None)
    .await?;
let embeds = result.data.remove(0).embedding;
let in_db = db
    .create::<Option<Document>>("document")
    .content(DocumentInput {
        text: input.into(),
        embedding: embeds.to_vec(),
    })
    .await?;
println!("{in_db:?}");
We will now move the logic to create the embeddings into a function of its own. Since the embeddings_async() method takes a single Vec<String>, we'll first clone it to keep the original Vec<String>, then zip it together with the embeddings returned so that they can be put into the database along with the original input.
async fn create_embeds(
    input: Vec<String>,
    db: &Surreal<Any>,
    client: &Client,
) -> Result<(), Error> {
    let cloned = input.clone();
    let embeds = client.embeddings_async(MODEL, input, None).await?;
    let zipped = cloned
        .into_iter()
        .zip(embeds.data.into_iter().map(|item| item.embedding));

    for (text, embeds) in zipped {
        let _in_db = db
            .create::<Option<Document>>("document")
            .content(DocumentInput {
                text,
                embedding: embeds,
            })
            .await?;
    }
    Ok(())
}
Then we’ll create four facts for each of four topics: sea creatures, Korean and Japanese cities, historical figures, and planets of the Solar System.
let embeds = [
    "Octopuses solve puzzles and escape enclosures, showing advanced intelligence.",
    "Sharks exhibit learning behavior, but their intelligence is instinct-driven.",
    "Sea cucumbers lack a brain and show minimal cognitive response.",
    "Clams have simple nervous systems with no known intelligent behavior.",
    //
    "Seoul is South Korea’s capital and a global tech hub.",
    "Sejong is South Korea’s planned administrative capital.",
    "Busan is a major South Korean port located in the far southeast.",
    "Tokyo is Japan’s capital, known for innovation and dense population.",
    //
    "Wilhelm II was Germany’s last Kaiser before World War I.",
    "Cyrus the Great founded the Persian Empire with tolerant rule.",
    "Napoleon Bonaparte was a French emperor and brilliant military strategist.",
    "Aristotle was a Greek philosopher who shaped Western intellectual thought.",
    //
    "Venus’s atmosphere ranges from scorching surface to Earth-like upper clouds.",
    "Mars has a thin, cold atmosphere with seasonal dust storms.",
    "Ceres has a tenuous exosphere with sporadic water vapor traces.",
    "Saturn’s atmosphere spans cold outer layers to a deep metallic hydrogen interior",
]
.into_iter()
.map(|s| s.to_string())
.collect::<Vec<String>>();

create_embeds(embeds, &db, &client).await?;
Finally, let's perform semantic search over the embeddings in our database. We'll start with this query, which uses the brute-force KNN operator to return the four closest matches to an embedding.
SELECT text, vector::distance::knn() AS distance FROM document WHERE embedding <|4,COSINE|> $embeds ORDER BY distance;
To use the HNSW index instead, change the second argument of the KNN operator from a distance function to a number, as in <|4,40|>. The 40 here represents the size of the dynamic candidate list used during the search.
SELECT text, vector::distance::knn() AS distance FROM document WHERE embedding <|4,40|> $embeds ORDER BY distance;
You can customise the brute-force form with other distance functions such as Euclidean, Hamming, and so on.
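For example, a query ranked by Euclidean distance instead would look like this (a sketch; the table and field names are the same as above):

SELECT text, vector::distance::knn() AS distance FROM document WHERE embedding <|4,EUCLIDEAN|> $embeds ORDER BY distance;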
We will then put this into a separate function called ask_question(), which first prints out its input and then uses the embedding retrieved from Mistral AI to query the database against existing documents.
async fn ask_question(input: &str, db: &Surreal<Any>, client: &Client) -> Result<(), Error> {
    println!("{input}");
    let embeds = client
        .embeddings_async(MODEL, vec![input.to_string()], None)
        .await?
        .data
        .remove(0)
        .embedding;

    let mut response = db
        .query("SELECT text, vector::distance::knn() AS distance FROM document WHERE embedding <|4,COSINE|> $embeds ORDER BY distance;")
        .bind(("embeds", embeds))
        .await?;
    let as_val: Value = response.take(0)?;
    println!("{as_val}\n");
    Ok(())
}
This function can now be called inside main() to confirm that the results match our expectations.
ask_question("Which Korean city is just across the sea from Japan?", &db, &client).await?;
ask_question("Who was Germany's last Kaiser?", &db, &client).await?;
ask_question("Which sea animal is most intelligent?", &db, &client).await?;
ask_question("Which planet's atmosphere has a part with the same temperature as Earth?", &db, &client).await?;
Which Korean city is just across the sea from Japan?
[
  { distance: 0.19170371029549582f, text: 'Busan is a major South Korean port located in the far southeast.' },
  { distance: 0.2399314515762122f, text: 'Tokyo is Japan’s capital, known for innovation and dense population.' },
  { distance: 0.2443623703771407f, text: 'Sejong is South Korea’s planned administrative capital.' },
  { distance: 0.24488082839731895f, text: 'Seoul is South Korea’s capital and a global tech hub.' }
]

Who was Germany's last Kaiser?
[
  { distance: 0.11228576780228805f, text: 'Wilhelm II was Germany’s last Kaiser before World War I.' },
  { distance: 0.2957177300085634f, text: 'Napoleon Bonaparte was a French emperor and brilliant military strategist.' },
  { distance: 0.34394473621670896f, text: 'Cyrus the Great founded the Persian Empire with tolerant rule.' },
  { distance: 0.34911517400935843f, text: 'Sejong is South Korea’s planned administrative capital.' }
]

Which sea animal is most intelligent?
[
  { distance: 0.2342596053829904f, text: 'Octopuses solve puzzles and escape enclosures, showing advanced intelligence.' },
  { distance: 0.24131327939924785f, text: 'Sharks exhibit learning behavior, but their intelligence is instinct-driven.' },
  { distance: 0.2426242772516931f, text: 'Clams have simple nervous systems with no known intelligent behavior.' },
  { distance: 0.24474598154128135f, text: 'Sea cucumbers lack a brain and show minimal cognitive response.' }
]

Which planet's atmosphere has a part with the same temperature as Earth?
[
  { distance: 0.20653440713083582f, text: 'Venus’s atmosphere ranges from scorching surface to Earth-like upper clouds.' },
  { distance: 0.23354208810464594f, text: 'Mars has a thin, cold atmosphere with seasonal dust storms.' },
  { distance: 0.24560810032473468f, text: 'Saturn’s atmosphere spans cold outer layers to a deep metallic hydrogen interior' },
  { distance: 0.2761595357544341f, text: 'Ceres has a tenuous exosphere with sporadic water vapor traces.' }
]
Here is the entire code:
use std::sync::LazyLock;

use anyhow::Error;
use mistralai_client::v1::{client::Client, constants::EmbedModel};
use serde::{Deserialize, Serialize};
use surrealdb::{
    RecordId, Surreal, Value,
    engine::any::{Any, connect},
};

static KEY: LazyLock<String> = LazyLock::new(|| std::env::var("MISTRAL_API_KEY").unwrap());

const MODEL: EmbedModel = EmbedModel::MistralEmbed;

#[derive(Serialize)]
struct DocumentInput {
    text: String,
    embedding: Vec<f32>,
}

#[derive(Debug, Deserialize)]
struct Document {
    id: RecordId,
    embedding: Vec<f32>,
    text: String,
}

async fn create_embeds(
    input: Vec<String>,
    db: &Surreal<Any>,
    client: &Client,
) -> Result<(), Error> {
    let cloned = input.clone();
    let embeds = client.embeddings_async(MODEL, input, None).await?;
    let zipped = cloned
        .into_iter()
        .zip(embeds.data.into_iter().map(|item| item.embedding));

    for (text, embeds) in zipped {
        let _in_db = db
            .create::<Option<Document>>("document")
            .content(DocumentInput {
                text,
                embedding: embeds,
            })
            .await?;
    }
    Ok(())
}

async fn ask_question(input: &str, db: &Surreal<Any>, client: &Client) -> Result<(), Error> {
    println!("{input}");
    let embeds = client
        .embeddings_async(MODEL, vec![input.to_string()], None)
        .await?
        .data
        .remove(0)
        .embedding;

    let mut response = db
        .query("SELECT text, vector::distance::knn() AS distance FROM document WHERE embedding <|4,COSINE|> $embeds ORDER BY distance;")
        .bind(("embeds", embeds))
        .await?;
    let as_val: Value = response.take(0)?;
    println!("{as_val}\n");
    Ok(())
}

#[tokio::main]
async fn main() -> Result<(), Error> {
    let db = connect("memory").await?;
    db.use_ns("ns").use_db("db").await?;

    let mut res = db
        .query(
            "DEFINE TABLE document;
             DEFINE FIELD text ON document TYPE string;
             DEFINE FIELD embedding ON document TYPE array<float>;
             DEFINE INDEX hnsw_embed ON document FIELDS embedding HNSW DIMENSION 1024 DIST COSINE;",
        )
        .await?;
    for (index, error) in res.take_errors() {
        println!("Error in query {index}: {error}");
    }

    let client = Client::new(Some(KEY.to_string()), None, None, None)?;

    let embeds = [
        "Octopuses solve puzzles and escape enclosures, showing advanced intelligence.",
        "Sharks exhibit learning behavior, but their intelligence is instinct-driven.",
        "Sea cucumbers lack a brain and show minimal cognitive response.",
        "Clams have simple nervous systems with no known intelligent behavior.",
        //
        "Seoul is South Korea’s capital and a global tech hub.",
        "Sejong is South Korea’s planned administrative capital.",
        "Busan is a major South Korean port located in the far southeast.",
        "Tokyo is Japan’s capital, known for innovation and dense population.",
        //
        "Wilhelm II was Germany’s last Kaiser before World War I.",
        "Cyrus the Great founded the Persian Empire with tolerant rule.",
        "Napoleon Bonaparte was a French emperor and brilliant military strategist.",
        "Aristotle was a Greek philosopher who shaped Western intellectual thought.",
        //
        "Venus’s atmosphere ranges from scorching surface to Earth-like upper clouds.",
        "Mars has a thin, cold atmosphere with seasonal dust storms.",
        "Ceres has a tenuous exosphere with sporadic water vapor traces.",
        "Saturn’s atmosphere spans cold outer layers to a deep metallic hydrogen interior",
    ]
    .into_iter()
    .map(|s| s.to_string())
    .collect::<Vec<String>>();

    create_embeds(embeds, &db, &client).await?;

    ask_question("Which Korean city is just across the sea from Japan?", &db, &client).await?;
    ask_question("Who was Germany's last Kaiser?", &db, &client).await?;
    ask_question("Which sea animal is most intelligent?", &db, &client).await?;
    ask_question("Which planet's atmosphere has a part with the same temperature as Earth?", &db, &client).await?;

    Ok(())
}