These functions are used in conjunction with the @@ operator (the ‘matches’ operator) to either collect the relevance score or highlight the searched keywords within the content.
Function | Description |
---|---|
search::analyze() | Returns the output of a defined search analyzer |
search::highlight() | Highlights the matching keywords |
search::linear() | Performs weighted linear search (available since v3.0.0-alpha.8) |
search::offsets() | Returns the position of the matching keywords |
search::rrf() | Performs RRF (reciprocal rank fusion) search (available since v3.0.0-alpha.8) |
search::score() | Returns the relevance score |
Note: Before SurrealDB version 3.0.0-alpha.8, the FULLTEXT ANALYZER clause used the syntax SEARCH ANALYZER.
The examples below assume the following queries:
CREATE book:1 SET title = "Rust Web Programming";
DEFINE ANALYZER book_analyzer TOKENIZERS blank, class, camel, punct FILTERS snowball(english);
DEFINE INDEX book_title ON book FIELDS title FULLTEXT ANALYZER book_analyzer BM25;
search::analyze
The search::analyze function returns the output of a defined search analyzer on an input string.
API DEFINITION
search::analyze(analyzer, string) -> array<string>
First, define the analyzer using the DEFINE ANALYZER statement:
Define book analyzer
DEFINE ANALYZER book_analyzer TOKENIZERS blank, class, camel, punct FILTERS snowball(english);
Next, pass the analyzer to the search::analyze function. The following example shows this function, and its output, when used in a RETURN statement:
RETURN search::analyze("book_analyzer", "A hands-on guide to developing, packaging, and deploying fully functional Rust web applications");
Output
[ 'a', 'hand', '-', 'on', 'guid', 'to', 'develop', ',', 'packag', ',', 'and', 'deploy', 'fulli', 'function', 'rust', 'web', 'applic' ]
search::highlight
The search::highlight function highlights the matching keywords for the predicate reference number.
API DEFINITION
search::highlight(string, string, number, [boolean]) -> string | string[]
The following example shows this function, and its output, when used in a SELECT statement:
SELECT id, search::highlight('<b>', '</b>', 1) AS title FROM book WHERE title @1@ 'rust web';
Output
[ { id: book:1, title: [ '<b>Rust</b> <b>Web</b> Programming' ] } ]
The optional Boolean parameter can be set to true to explicitly request that the whole found term be highlighted, or to false to highlight only the sequence of characters being searched for. Partial highlighting must be used with an edgengram or ngram filter. The default value is true.
search::linear
API DEFINITION
search::linear(lists: array, weights: array, limit: int, norm: 'minmax' | 'zscore') -> array<object>
Notes on the arguments and output of this function:
- lists - An array of result arrays. Each inner array must be pre-sorted most-relevant-first (BM25 score descending, distance ascending already inverted, etc.).
- weights - An array of numeric weights, one per result list (must have the same length as lists).
- limit - Maximum number of documents to return (must be ≥ 1).
- norm - Normalization method: 'minmax' for MinMax normalization or 'zscore' for Z-score normalization.
The score of each document is taken from the following fields:
- distance field - converted using 1.0 / (1.0 + distance) (lower distance = higher score)
- ft_score field - used directly (full-text search scores)
- score field - used directly (generic scores)
- 1.0 / (1.0 + rank) if no score field is found
The scores of each list are then normalized according to norm:
- minmax: (score - min) / (max - min)
- zscore: (score - mean) / std_dev
The output is sorted by linear_score descending and truncated to limit.
-- Sample data --
CREATE test:1 SET text = "Graph databases are great.", embedding = [0.10, 0.20, 0.30];
CREATE test:2 SET text = "Relational databases store tables.", embedding = [0.05, 0.10, 0.00];
CREATE test:3 SET text = "This document mentions graphs and networks.", embedding = [0.20, 0.10, 0.25];
-- Analyzer used by the full-text index
DEFINE ANALYZER simple TOKENIZERS class, punct FILTERS lowercase, ascii;
-- Full-text index
DEFINE INDEX idx_text ON TABLE test FIELDS text FULLTEXT ANALYZER simple BM25;
-- Vector index (HNSW) on a 3-dimensional embedding, using cosine distance
DEFINE INDEX idx_embedding ON TABLE test FIELDS embedding HNSW DIMENSION 3 DIST COSINE;
-- Query vector (whatever your embedding model produced for "graph databases")
LET $qvec = [0.12, 0.18, 0.27];
-- Vector search: top 2 nearest neighbours
LET $vs = SELECT id FROM test WHERE embedding <|2,100|> $qvec;
-- Full-text search: top 2 lexical matches
LET $ft = SELECT id, search::score(1) as score FROM test WHERE text @1@ 'graph' ORDER BY score DESC LIMIT 2;
-- Fuse with Linear / minmax
search::linear([$vs, $ft], [2, 1], 2, 'minmax');
-- Fuse with Linear / zscore
search::linear([$vs, $ft], [2, 1], 2, 'zscore');
Output of the final search::linear() queries:
-------- Query 1 --------
[
    { distance: 0.0034969844824588314f, ft_score: 0.5366538763046265f, id: test:1, linear_score: 2 },
    { distance: 0.056393806565797844f, id: test:3, linear_score: 0 }
]
-------- Query 2 --------
[
    { distance: 0.0034969844824588314f, ft_score: 0.5366538763046265f, id: test:1, linear_score: 1.9999999999999956f },
    { distance: 0.056393806565797844f, id: test:3, linear_score: -2.0000000000000044f }
]
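The arithmetic behind the output above can be approximated outside the database. The following Python sketch is not SurrealDB's implementation: the function names are hypothetical, the 0-based fallback rank and the handling of lists with no spread (every document contributes 0) are assumptions, and the sample inputs simply reuse the distance and ft_score values shown in the output.

```python
from statistics import mean, pstdev


def extract_score(doc, rank):
    """Pick a raw score using the field rules listed in the notes."""
    if "distance" in doc:
        return 1.0 / (1.0 + doc["distance"])  # lower distance = higher score
    if "ft_score" in doc:
        return doc["ft_score"]
    if "score" in doc:
        return doc["score"]
    return 1.0 / (1.0 + rank)  # fallback; 0-based rank is an assumption


def normalize(scores, norm):
    """Normalize one result list with MinMax or Z-score."""
    if norm == "minmax":
        lo, hi = min(scores), max(scores)
        # Assumption: a list with no spread contributes 0 everywhere.
        return [0.0 if hi == lo else (s - lo) / (hi - lo) for s in scores]
    if norm == "zscore":
        mu, sd = mean(scores), pstdev(scores)
        # Assumption: population standard deviation, 0 when there is no spread.
        return [0.0 if sd == 0 else (s - mu) / sd for s in scores]
    raise ValueError("norm must be 'minmax' or 'zscore'")


def linear_fuse(lists, weights, limit, norm):
    """Weighted sum of normalized scores, sorted descending, cut to limit."""
    if len(lists) != len(weights):
        raise ValueError("weights must have the same length as lists")
    if limit < 1:
        raise ValueError("limit must be >= 1")
    fused = {}
    for results, weight in zip(lists, weights):
        if not results:
            continue
        raw = [extract_score(doc, rank) for rank, doc in enumerate(results)]
        for doc, value in zip(results, normalize(raw, norm)):
            entry = fused.setdefault(doc["id"], {"id": doc["id"], "linear_score": 0.0})
            # Merge the other fields of the original object into the result.
            entry.update({k: v for k, v in doc.items() if k != "id"})
            entry["linear_score"] += weight * value
    ranked = sorted(fused.values(), key=lambda d: d["linear_score"], reverse=True)
    return ranked[:limit]


# Inputs mirroring the two result lists, using values from the output above.
vs = [
    {"id": "test:1", "distance": 0.0034969844824588314},
    {"id": "test:3", "distance": 0.056393806565797844},
]
ft = [{"id": "test:1", "ft_score": 0.5366538763046265}]

for norm in ("minmax", "zscore"):
    for row in linear_fuse([vs, ft], [2, 1], 2, norm):
        print(norm, row["id"], row["linear_score"])
```

Under these assumptions the minmax run gives test:1 a score of 2 and test:3 a score of 0, and the zscore run gives approximately 2 and -2, in line with the output above. Only test:1 appears in the full-text list, so that list has no spread and adds nothing to either total.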
search::offsets
The search::offsets function returns the position of the matching keywords for the predicate reference number.
API DEFINITION
search::offsets(number, [boolean]) -> object
The following example shows this function, and its output, when used in a SELECT statement:
SELECT id, title, search::offsets(1) AS title_offsets FROM book WHERE title @1@ 'rust web';
Output
[ { id: book:1, title: [ 'Rust Web Programming' ], title_offsets: { 0: [ { e: 4, s: 0 }, { e: 8, s: 5 } ] } } ]
The output returns the start s and end e positions of each matched term found within the original field.
The full-text index is capable of indexing both single strings and arrays of strings. In this example, the key 0 indicates that the offsets refer to the first string within the title field, which contains an array of strings.
The optional Boolean parameter can be set to true to explicitly request offsets that cover the whole found term, or to false to cover only the sequence of characters being searched for. Partial offsets must be used with an edgengram or ngram filter. The default value is true.
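To illustrate how the s and e positions relate to the original text, the Python sketch below wraps each reported span in a prefix and suffix. It is an illustration only: apply_highlight is a hypothetical helper, and it assumes that offsets are character indices with an exclusive end, which is consistent with { s: 0, e: 4 } covering 'Rust' in the example output.

```python
def apply_highlight(text, spans, prefix, suffix):
    """Wrap each [s, e) span of text in prefix and suffix."""
    pieces, cursor = [], 0
    for span in sorted(spans, key=lambda o: o["s"]):
        pieces.append(text[cursor:span["s"]])  # untouched text before the match
        pieces.append(prefix + text[span["s"]:span["e"]] + suffix)
        cursor = span["e"]
    pieces.append(text[cursor:])  # untouched text after the last match
    return "".join(pieces)


# Values copied from the search::offsets example output.
title = ["Rust Web Programming"]
title_offsets = {0: [{"e": 4, "s": 0}, {"e": 8, "s": 5}]}

highlighted = [
    apply_highlight(title[index], spans, "<b>", "</b>")
    for index, spans in title_offsets.items()
]
print(highlighted)  # ['<b>Rust</b> <b>Web</b> Programming']
```

The result matches what search::highlight('<b>', '</b>', 1) returns for the same query, so search::offsets is the better fit when the client needs to apply its own markup.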
search::rrf
API DEFINITION
search::rrf(lists: array, limit: int, k: option<int>) -> array<object>
Notes on the arguments and output of this function:
- k - The RRF constant, which defaults to 60 if omitted. See this paper for why 60 tends to be the default k value:
  "Our intuition in choosing this formula derived from fact that while highly-ranked documents are more important, the importance of lower-ranked documents does not vanish as it would were, say, an exponential function used. The constant k mitigates the impact of high rankings by outlier systems."
- Each document's score is computed as rrf_score = Σ 1/(k + rank). The output is sorted by rrf_score descending and truncated to limit.
-- Sample data --
CREATE test:1 SET text = "Graph databases are great.", embedding = [0.10, 0.20, 0.30];
CREATE test:2 SET text = "Relational databases store tables.", embedding = [0.05, 0.10, 0.00];
CREATE test:3 SET text = "This document mentions graphs.", embedding = [0.20, 0.10, 0.25];
-- Analyzer used by the full-text index
DEFINE ANALYZER simple TOKENIZERS class, punct FILTERS lowercase, ascii;
-- Full-text index
DEFINE INDEX idx_text ON TABLE test FIELDS text FULLTEXT ANALYZER simple BM25;
-- Vector index (HNSW) on a 3-dimensional embedding, using cosine distance
DEFINE INDEX idx_embedding ON TABLE test FIELDS embedding HNSW DIMENSION 3 DIST COSINE;
-- Query vector (whatever your embedding model produced for "graph databases")
LET $qvec = [0.12, 0.18, 0.27];
-- Vector search: top 2 nearest neighbours
LET $vs = SELECT id FROM test WHERE embedding <|2,100|> $qvec;
-- Full-text search: top 2 lexical matches
LET $ft = SELECT id, search::score(1) as score FROM test WHERE text @1@ 'graph' ORDER BY score DESC LIMIT 2;
-- Fuse with Reciprocal Rank Fusion (k defaults to 60 if omitted)
search::rrf([$vs, $ft], 2, 60);
Output of the final search::rrf() query:
[
    { distance: 0.0034969844824588314f, ft_score: 0.5366538763046265f, id: test:1, rrf_score: 0.03278688524590164f },
    { distance: 0.056393806565797844f, id: test:3, rrf_score: 0.016129032258064516f }
]
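The rrf_score values in this output can be checked by hand. The Python sketch below is a hypothetical re-implementation of the formula rrf_score = Σ 1/(k + rank), not SurrealDB's own code. It assumes that rank starts at 1 for the first entry of each list, which is inferred from the sample output: test:1 is first in both lists and scores 2/61, while test:3 is second in one list and scores 1/62.

```python
def rrf_fuse(lists, limit, k=60):
    """Reciprocal rank fusion over pre-sorted result lists."""
    if limit < 1:
        raise ValueError("limit must be >= 1")
    fused = {}
    for results in lists:
        # Assumption: ranks are 1-based within each list.
        for rank, doc in enumerate(results, start=1):
            entry = fused.setdefault(doc["id"], {"id": doc["id"], "rrf_score": 0.0})
            # Merge the other fields of the original object into the result.
            entry.update({key: value for key, value in doc.items() if key != "id"})
            entry["rrf_score"] += 1.0 / (k + rank)
    ranked = sorted(fused.values(), key=lambda d: d["rrf_score"], reverse=True)
    return ranked[:limit]


# Result lists mirroring the example: test:1 leads both, test:3 is second in one.
vs = [{"id": "test:1"}, {"id": "test:3"}]
ft = [{"id": "test:1"}]

for row in rrf_fuse([vs, ft], 2, 60):
    print(row["id"], row["rrf_score"])
```

Because only ranks enter the formula, RRF needs no score normalization, which makes it a simple way to combine lists whose raw scores are on different scales.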
search::score
The search::score function returns the relevance score corresponding to the given ‘matches’ predicate reference numbers.
API DEFINITION
search::score(number) -> number
The following example shows this function, and its output, when used in a SELECT statement:
SELECT id, title, search::score(1) AS score FROM book WHERE title @1@ 'rust web' ORDER BY score DESC;
Output
[ { id: book:1, score: 0.9227996468544006, title: [ 'Rust Web Programming' ] } ]