Vector indexes | SurrealDB Docs

Choose brute force, HNSW, or DISKANN vector indexes, tune DIMENSION and distance metrics, and use the query cheat sheet with vector::distance::knn().

In SurrealDB, you can always fall back on brute force to search through your vector embeddings.

Brute force search compares a query vector against every vector in the dataset to find the closest matches. Because nothing is precomputed, this approach needs no index.
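
The exhaustive comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not SurrealDB's implementation; the function names and the sample records are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def brute_force_knn(query, dataset, k, distance=euclidean):
    """Return the k (distance, record id) pairs closest to `query`.

    Every vector in `dataset` is compared, so the result is exact
    and the cost grows linearly with the number of records.
    """
    scored = [(distance(query, vec), rec_id) for rec_id, vec in dataset.items()]
    return sorted(scored)[:k]

# Hypothetical 4-dimensional records, keyed by record id.
pts = {
    "pts:1": [1, 2, 3, 4],
    "pts:2": [4, 5, 6, 7],
    "pts:3": [2, 3, 4, 5],
    "pts:4": [10, 10, 10, 10],
}

nearest = brute_force_knn([2, 3, 4, 5], pts, k=2)
print(nearest)
```

Because the scan touches every record, doubling the dataset roughly doubles the work per query. That linear cost is what the approximate indexes trade accuracy to avoid.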

The brute force approach for finding the nearest neighbour is generally preferred in the following use cases:

- Small datasets / limited query vectors: For applications with small datasets, the overhead of building and maintaining an index might outweigh its benefits. In such cases, the brute force approach is optimal.
- Guaranteed accuracy: Since the brute force method compares the query vector against every vector in the dataset, it is guaranteed to find the exact nearest vectors under the chosen distance metric (such as Euclidean or Manhattan).
- Benchmarking models: The brute force approach can serve as a reference when benchmarking approximate alternatives such as HNSW or DISKANN.

While brute force can give you exact results, it's computationally expensive for large datasets.

In most cases, you do not need an exact match. You can trade a little accuracy for much faster searches over high-dimensional data by finding the approximate nearest neighbours of a query vector.

This is where vector indexes come in.

HNSW and DISKANN

SurrealDB offers two approximate graph indexes for k-nearest-neighbour search:

Index	Best for	Storage
HNSW (`DEFINE INDEX … HNSW`)	Low-latency ANN when the graph fits comfortably in memory with headroom for the bounded vector cache	In-memory hot graph + persistence
DISKANN (`DEFINE INDEX … DISKANN`) (SurrealDB 3.1+)	Very large corpora where RAM cannot hold the full graph — optimises for disk-resident graphs and quantisation-friendly types	On-disk graph with caching (not on WASM targets)

Both are proximity graph-style indexes. Queries use the same <|K, …|> KNN operator shapes; the optimiser picks the index when distances and types line up.

Vector search cheat sheet

HNSW — efficient in-memory approximation for high dimensions or large in-RAM datasets.
DISKANN — disk-oriented approximation for embeddings that exceed practical memory for a pure HNSW graph.
Brute force — when you do not define an index, when you want exact nearest neighbours, or when you pass an explicit distance function to the query that does not route to your index.

HNSW index

Parameter	Default	Options	Description
DIMENSION			Size of the vector
DIST	EUCLIDEAN	EUCLIDEAN, COSINE, MANHATTAN	Distance function
TYPE	F64	F64, F32, I64, I32, I16	Vector type
EFC	150		EF construction
M	12		Max connections per element
M0	24		Max connections in the lowest layer
LM	0.40242960438184466f		Multiplier for level generation. This value is calculated automatically and is considered optimal.

Examples:

-- User statement:
DEFINE INDEX hnsw_idx ON pts FIELDS point HNSW DIMENSION 4;
-- Defaults to:
DEFINE INDEX hnsw_idx ON pts FIELDS point HNSW DIMENSION 4 DIST EUCLIDEAN TYPE F64 EFC 150 M 12 M0 24 LM 0.40242960438184466f;
-- Setting LM manually is strongly discouraged, as it is
-- computed from the other parameters. Only users well
-- versed in the field should set it by hand.

For more details, see the DEFINE INDEX statement documentation.
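
The default LM shown above coincides numerically with 1 / ln(M) for M = 12, the level multiplier commonly recommended for HNSW. The Python sketch below reproduces that number. Whether SurrealDB derives the default in exactly this way is an assumption, and `level_multiplier` is a hypothetical helper name.

```python
import math

def level_multiplier(m):
    """Return 1 / ln(M), a common choice of HNSW level multiplier."""
    if m <= 1:
        raise ValueError("M must be greater than 1")
    return 1.0 / math.log(m)

# With the default M = 12, this matches the default LM to many decimal places.
lm_default = level_multiplier(12)
print(lm_default)
```

Under this formula, changing M changes the matching multiplier, which is one reason to leave LM unset rather than hard-coding it.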

DISKANN index

Available since: v3.1.0

Parameter	Default	Options	Description
DIMENSION			Vector dimension
DIST	EUCLIDEAN	EUCLIDEAN, COSINE, INNER_PRODUCT, COSINE_NORMALIZED	Distance (narrower set than HNSW)
TYPE	F32	F32, F16, I8, U8	Element encoding (`COSINE_NORMALIZED` requires `F32` or `F16`)
DEGREE	64	> 0	Target maximum graph degree
L_BUILD	100	> 0	Construction search-list size
ALPHA	1.2		DiskANN pruning parameter
HASHED_VECTOR	off		Optional hash-stabilised vector keys

DEFINE INDEX diskann_idx ON pts FIELDS point DISKANN DIMENSION 4 DIST COSINE TYPE F32;

See DEFINE INDEX → DISKANN for platform notes (including no WASM support).
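
Whether a corpus suits an in-memory HNSW graph or a disk-oriented DISKANN graph depends partly on how much space the raw vectors occupy. As a rough aid, the Python sketch below multiplies the record count, DIMENSION, and the byte width of each TYPE listed in the tables above, assuming the standard width for each numeric type. It gives a lower bound only: graph links, caches, and storage overhead are not modelled. The helper name and the corpus size are hypothetical.

```python
# Bytes per element, assuming the standard width of each TYPE option.
ELEMENT_BYTES = {
    "F64": 8, "F32": 4, "F16": 2,
    "I64": 8, "I32": 4, "I16": 2,
    "I8": 1, "U8": 1,
}

def raw_vector_bytes(count, dimension, vector_type):
    """Bytes needed to hold `count` vectors, ignoring all index overhead."""
    return count * dimension * ELEMENT_BYTES[vector_type]

# Hypothetical corpus: one million 768-dimensional embeddings.
for vector_type in ("F64", "F32", "F16", "U8"):
    gib = raw_vector_bytes(1_000_000, 768, vector_type) / 2**30
    print(f"{vector_type}: {gib:.2f} GiB")
```

Halving the element width halves this figure, so TYPE affects memory planning as well as precision.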

Querying

DEFINE INDEX hnsw_idx ON pts FIELDS point HNSW DIMENSION 4;

LET $vector = [2,3,4,5];  -- must have 4 elements to match DIMENSION 4
SELECT
    id,
    vector::distance::knn() as dist  -- distance from $vector
                                     -- knn reuses the value computed during
                                     -- the query, in this case the euclidean
                                     -- distance
FROM pts
WHERE point
    <|2|>  -- return the 2 nearest neighbours, using the distance function
           -- defined in the index: euclidean
    $vector;

With a DISKANN index defined on point, the same <|K, EF|> approximate form applies; the second number bounds the dynamic candidate list for search (see the KNN operator).

Functions
`vector::distance::knn()`	reuses the value computed during the query
`vector::distance::chebyshev(point, $vec)`
`vector::distance::euclidean(point, $vec)`
`vector::distance::hamming(point, $vec)`
`vector::distance::manhattan(point, $vec)`
`vector::distance::minkowski(point, $vec, 3)`	third param is 𝑝
`vector::similarity::cosine(point, $vec)`
`vector::similarity::jaccard(point, $vec)`
`vector::similarity::pearson(point, $vec)`

WHERE statement

Query	HNSW index	DISKANN index
`<|2|>`	uses distance function defined in index	same when the index distance matches
`<|2, EUCLIDEAN|>`	brute force method	brute force method
`<|2, COSINE|>`	brute force method	brute force method
`<|2, MANHATTAN|>`	brute force method	brute force method
`<|2, MINKOWSKI, 3|>`	brute force method (third param is 𝑝)	brute force method
`<|2, CHEBYSHEV|>`	brute force method	brute force method
`<|2, HAMMING|>`	brute force method	brute force method
`<|2, 10|>`	second param is effort*	same approximate form: second value bounds the candidate list

* effort: for HNSW and DISKANN, the second number in <|K, N|> tells the engine how far to search along the graph. Both algorithms are approximate and may miss some vectors.
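
Because an approximate search may miss some vectors, one common way to judge an effort setting is to compare its results with the exact brute force answer for the same query. The Python sketch below computes that overlap, usually called recall. The function name and the record ids are hypothetical; in practice the two lists would come from an indexed query and a brute force query.

```python
def recall_at_k(approximate_ids, exact_ids):
    """Fraction of the exact nearest neighbours that the approximate search found."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    return len(exact & set(approximate_ids)) / len(exact)

# Hypothetical results for one query with K = 3.
exact_ids = ["pts:3", "pts:1", "pts:2"]    # from a brute force query
approx_ids = ["pts:3", "pts:2", "pts:4"]   # from an indexed query

recall = recall_at_k(approx_ids, exact_ids)
print(f"recall@3 = {recall:.2f}")
```

A recall of 1.0 means the index returned exactly the brute force result. Raising the effort value generally trades latency for higher recall.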

Notes

Verify index utilisation in queries using the EXPLAIN FULL clause. For example: SELECT id FROM pts WHERE point <|10|> [2,3,4,5] EXPLAIN FULL;
𝑝 values: (more about 𝑝 in Minkowski distance)
- 2⁰ = 1 → manhattan/diamond ◇
- 2¹ = 2 → euclidean/circle ○
- 2² = 4 → squircle ▢
- 2^∞ = ∞ → square □
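
The 𝑝 values listed above can be checked numerically. The Python sketch below is a plain implementation of the Minkowski distance, not SurrealDB's code. It shows that 𝑝 = 1 gives the Manhattan distance, 𝑝 = 2 gives the Euclidean distance, and large 𝑝 approaches the Chebyshev distance (the 𝑝 = ∞ case).

```python
def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

a, b = [0, 0], [3, 4]

print(minkowski(a, b, 1), manhattan(a, b))   # p = 1 matches Manhattan
print(minkowski(a, b, 2))                    # p = 2 is the Euclidean distance
print(minkowski(a, b, 4))                    # p = 4 lies between Euclidean and Chebyshev
print(minkowski(a, b, 64), chebyshev(a, b))  # large p approaches Chebyshev
```

As 𝑝 grows, the largest coordinate difference dominates the sum, which is why the limit equals the Chebyshev distance.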