Spectron

Agent memory you can trust

The memory and knowledge layer for AI agents. Built on SurrealDB.

A stateless application tier in front of SurrealDB - graph, vector, document and structured records committed in one ACID transaction per write. Every fact knows where it came from. Every correction supersedes - never overwrites. Every query produces a trace the substrate keeps.

Works with Cursor, Claude Desktop and Claude Code

Invites roll out in weekly batches from launch week - the waitlist is the application, no sales call, no demo gate.

WHAT YOU CAN DO TODAY

Three ways in, while the waitlist runs

Spectron is in invite-only preview, but the substrate underneath ships today and the architecture and technical deep-dive are open right now. Pick whichever lane fits your evaluation.

SPECTRON

One substrate.
One transaction.
One memory layer.

Documents, conversations, entities, attributes, relations, embeddings and traces all live in one ACID-transactional database. Spectron is the stateless application tier on top - no cross-store stitching, no consistency gaps between vector and graph, no sidecars to operate.

01 |THE PROBLEM

Wrong memory is worse than no memory

Most agent memory is a vector index stitched to a graph store and a row store, with the seams smoothed over by application code. It works in demos and breaks in week three. Four failure modes are predictable - and none of them are retrieval problems.

Entity collisions

'Beth' the customer and 'Beth' the support agent both get a vector. Similarity search returns both. The agent confuses the two. There is no schema to disambiguate - the schema is the embedding.

Silent overwrites

The user said Berlin in March, Paris in April. The vector store keeps both. The agent picks the higher-similarity one. The old belief is gone, the supersession is invisible, the test passes.

Scope leakage

Tenant A's embedding looks like tenant B's. A poorly scoped query returns the wrong one. Multi-tenancy lives in application code instead of the data layer - and application code drifts.

Untraceable answers

The agent gave a confident answer. The customer asked which memory it came from. There is no record of which fact, which conversation, which turn, which document. Memory is a black box; trust collapses.

02 |EIGHT PILLARS

The model that holds it together

Eight primitives the substrate, the write path, and the read path all have to support. Not features added on top - the only eight things memory has to do to hold up under scale, change and multi-instance deployment. Everything else in Spectron is a way to make one of them work in production.

Authoritative

Curated artefacts. Documents, manuals, policies, code, structured exports. Higher default trust because the source is vetted.

Experiential

Conversational input. What the agent or the user said, observed, or was told. Lower default trust than the authoritative layer.

Reconciliation

One function for all writes. Document, turn, reflection, elaboration and consolidation flow through the same supersession-and-uncertainty path. Identical guarantees regardless of how a fact arrived.

Elaboration

New durable memory from links the substrate already implies. A background sweep finds entities and attributes that share context but no relation, and proposes connections through the reconciler.

Reflection

On-demand synthesis. POST /reflect runs a synthesis pass over retrieved context; the answer is optionally persisted as new facts with their own provenance kind and lower default trust.

Consolidation

Belief crystallisation. An async job pools recent facts and decides to create, update (delta recorded), or mark superseded. Each observation tracks its derived inputs and proof count.

Calibration

Source trust and reconciler confidence stored separately, both queryable. Below a configurable floor, the reconciler refuses to supersede and emits uncertainty instead.

Collective

Shared, reconcilable memory across people, agents and instances. Corroboration thresholds promote facts to wider scope, with provenance preserved.

03 |HOW IT WORKS

Ingest, extract, connect, query

Four stages turn conversations and documents into a typed entity and relation graph - with provenance on every row, supersession instead of overwrite, and a trace recorded for every read and write.

1. Ingest

Conversations, tool outputs, and multi-modal artefacts - text, code, JSON, CSV, PDFs, images, audio, video. Originals live in object storage; the database holds the structured index. Files are Blake3-hashed, so re-uploading an unchanged file is a no-op and chunks have stable IDs across reingest.

2. Extract

Typed entities, attributes, and relations extracted from every turn and chunk by an LLM. Embeddings, RAKE keywords, and geographic location attributes generated alongside. Document-extracted facts and turn-extracted facts flow through the same reconciler - identical guarantees regardless of source.

3. Connect

The reconciler supersedes prior beliefs instead of overwriting them. Cross-provenance conflicts emit explicit uncertainty rather than silently picking a winner. Background reflection, elaboration, and consolidation deepen the graph between interactions, not only during them.

4. Query

A four-tier ladder - direct lookup, response reuse, hybrid retrieval, full-context fallback - gated by query understanding at the front. Every read emits a retrieval_trace recording the tier entered, the candidates considered, and the rows returned.

[Diagram: write path. Ingest (raw text → text chunks) → Extract (LLM: entities & facts as name + type and subject→verb→object; provider-pluggable embeddings) → Store (SurrealDB, one ACID transaction: attributes; chunks with HNSW index; entity embeddings; relation edges; traces as audit log).]

FIVE COHERENCE DIMENSIONS

Memory that holds up along five axes

Vector-only memory hits walls on four of these five; row-store memory hits walls on all five. Spectron's substrate stores enough metadata to answer questions on each axis - simultaneously, in the same fused ranker.

Semantic

Embedding-based recall over the unified substrate. Entities, attributes, chunks, and modality-native vectors fuse into a single ranking - not a sidecar vector store.

Lexical

Every extracted attribute carries byte offsets back to the originating turn or document passage. BM25 over surface text catches exact-phrase and rare-term queries embeddings underweight. Citations are stored, not best-effort.
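
A minimal sketch of what a stored, byte-level citation could look like. The `Span` record and the `cite` / `resolve` helpers are hypothetical names used for illustration, not Spectron's schema; the only point shown is that offsets are counted in UTF-8 bytes, not characters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical citation record: byte offsets into a source's UTF-8 bytes."""
    source_id: str
    start: int  # inclusive byte offset
    end: int    # exclusive byte offset


def cite(source_id: str, text: str, phrase: str) -> Span:
    """Locate `phrase` in `text` and return its byte-level span."""
    raw = text.encode("utf-8")
    needle = phrase.encode("utf-8")
    start = raw.find(needle)
    if start < 0:
        raise ValueError(f"{phrase!r} not found in source {source_id}")
    return Span(source_id, start, start + len(needle))


def resolve(span: Span, text: str) -> str:
    """Recover the cited passage from the original text."""
    return text.encode("utf-8")[span.start:span.end].decode("utf-8")


# The non-ASCII character makes byte offsets differ from character offsets.
turn = "Zoë said: I live in Berlin."
span = cite("turn_1", turn, "I live in Berlin")
```

Because the span is stored rather than recomputed, `resolve(span, turn)` returns the cited passage without searching again.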

Relational

One unified entity and relation graph. Entailment, contradiction, and corroboration are graph operations you can compose, not joins to chase across stores.

Temporal

Three clocks tracked separately. System time (the MVCC byte-level audit clock), known time (when Spectron first believed a fact), and valid time (when the assertion held in the world). Different questions, all queryable independently.

Spatial

Geographic recall is first-class. Within-radius, inside-polygon, and nearest-k spatial predicates compose with semantic and graph signals in the same ranker. Mentioned places, postal addresses, and image EXIF coordinates are extracted as typed location attributes at write time.
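
As an illustration of a within-radius predicate, here is a small sketch using the haversine great-circle formula. The row shape, the identifiers and the `within_radius` helper are assumptions made for the example; how Spectron evaluates spatial predicates inside the ranker is not described here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(rows, centre, radius_km):
    """Keep rows whose location lies within `radius_km` of `centre`."""
    lat0, lon0 = centre
    return [r for r in rows if haversine_km(lat0, lon0, r["lat"], r["lon"]) <= radius_km]


BERLIN = (52.52, 13.405)
rows = [  # hypothetical location attributes
    {"id": "attr:potsdam_office", "lat": 52.3906, "lon": 13.0645},
    {"id": "attr:paris_office", "lat": 48.8566, "lon": 2.3522},
]
nearby = within_radius(rows, BERLIN, 50.0)
```

With a 50 km radius around Berlin, only the Potsdam row passes the filter; the Paris row is several hundred kilometres away.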

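[Diagram: data model. Nodes: chunk (embedding vector, text); attribute (text, confidence, valid_from); source (kind, trust); entity (type, name, embedding vector). Edges: extracted_from (attribute and chunk), has_source (attribute and source), belongs_to (attribute and entity), relation (entity to entity; kind (verb), valid_from, valid_until).]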

04 |MEMORY MODEL

Six typed memory categories, not one bucket

Experiential memory is not a single store. It is six typed sub-stores - one raw record and five extracted categories built on top - each with its own schema, lifecycle, retrieval weight, and prompt-injection path.

Episodic

The raw conversational record. Sessions and turns as authored, in order, with anaphora intact. The source of truth every extracted category cites back to via byte-level spans.

Identity

Durable facts about the principal: name, role, employer, preferences, long-lived attributes. Long retention, low decay, surfaced into the prompt-ready profile.

Knowledge

Things the principal has learnt or shared, distinct from authoritative documents. Project facts, observations, references. Medium retention, decays without reinforcement.

Context

What is going on right now: active topics, recent intents, the working set for the current conversation. Short retention, replaced rapidly, anchors the next turn.

Instructions

Behavioural memory, not factual. 'Always answer in British English', 'never call me by my first name'. Applied at prompt-assembly time, not at retrieval.

Uncertainty

Explicit 'we don't know yet' rows. Emitted by the reconciler when confidence is below the floor, by cross-provenance contradictions, by ungrounded references. Gaps are visible, not papered over with hallucinations.

05 |DOCUMENT PIPELINE

Documents become structure - not a blob in a vector store

Plain text, markdown, code, JSON, CSV, PDFs, images, audio and video flow through a proper knowledge-representation pipeline. Both the content index and the structural index that retrieval reads from are built at write time - so the cost ladder downstream stays tractable.

Multi-modal ingest

Configurable per Context via an IngestionProfile (TextOnly → MultimodalFull) that trades completeness against cost. Pay for what you actually need; lift it later without re-architecting.

Object store for originals

PDFs and media originals live in S3-compatible storage. The database holds the structured indexable state. Backups, GDPR deletion, and storage cost scale on object-store economics, not OLTP-row economics.

Content-addressed

Every uploaded file is Blake3-hashed; the hash is the identity. Re-uploading the same file is a no-op, chunks have stable IDs across reingest, and an extracted-text cache lets you rechunk and reembed without re-running expensive parsers.
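
A rough sketch of content addressing, under two stated assumptions: BLAKE2b from the Python standard library stands in for Blake3 (which is not in the standard library), and the `Store` class and the fixed-size chunk ID scheme are hypothetical, chosen only to show why identical bytes give identical IDs.

```python
import hashlib


def content_id(data: bytes) -> str:
    # Assumption: BLAKE2b stands in for Blake3 in this sketch.
    return hashlib.blake2b(data, digest_size=32).hexdigest()


def chunk_ids(data: bytes, chunk_size: int = 64) -> list[str]:
    """Derive chunk IDs from the file hash plus byte offset, so they are stable."""
    file_id = content_id(data)
    return [f"{file_id}:{offset}" for offset in range(0, len(data), chunk_size)]


class Store:
    """Hypothetical object store keyed by content hash."""

    def __init__(self):
        self.files: dict[str, bytes] = {}

    def upload(self, data: bytes) -> tuple[str, bool]:
        """Return (file_id, stored); re-uploading the same bytes is a no-op."""
        file_id = content_id(data)
        if file_id in self.files:
            return file_id, False
        self.files[file_id] = data
        return file_id, True


store = Store()
first = store.upload(b"x" * 100)
second = store.upload(b"x" * 100)
```

The second upload returns the same ID with `stored` set to `False`, and calling `chunk_ids` twice on the same bytes yields the same list.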

Content-aware chunking

Passages are first-class chunk rows with their own embeddings, byte spans into the original artefact, and edges to the entities extracted from them. Time-coded segments for audio and video answer 'where in this recording' instead of 'this file is somehow relevant'.

Same reconciler as turns

Document-extracted facts flow through the same supersession-and-uncertainty function as turn-extracted facts. A document can contradict a turn and produce explicit uncertainty - identical guarantees, regardless of source.

Keyword graph + understanding

A non-LLM RAKE pass produces first-class keyword nodes with PMI-scored edges to chunks and entities - cheap structural recall for rare-term queries that vector search underweights. Document-level summaries, summary embeddings, and document-to-document links sit in the same graph.
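
To make the PMI scoring concrete, here is a small sketch that estimates pointwise mutual information, PMI(k, e) = log(p(k, e) / (p(k) p(e))), from chunk-level co-occurrence counts. The input shape and the `pmi_edges` name are assumptions for illustration; the RAKE extraction step itself is not shown.

```python
import math
from collections import Counter
from itertools import product


def pmi_edges(chunks):
    """chunks: list of (keywords, entities) set pairs, one per chunk.

    Returns {(keyword, entity): pmi} for every pair that co-occurs.
    """
    n = len(chunks)
    kw_count, ent_count, pair_count = Counter(), Counter(), Counter()
    for keywords, entities in chunks:
        kw_count.update(keywords)
        ent_count.update(entities)
        pair_count.update(product(keywords, entities))
    return {
        (kw, ent): math.log((c / n) / ((kw_count[kw] / n) * (ent_count[ent] / n)))
        for (kw, ent), c in pair_count.items()
    }


chunks = [
    ({"invoice", "kubernetes"}, {"acme"}),
    ({"invoice"}, {"acme"}),
    ({"invoice"}, {"globex"}),
    ({"invoice"}, {"globex"}),
]
edges = pmi_edges(chunks)
```

In this toy corpus "invoice" appears in every chunk, so its edge to "acme" scores zero, while the rare term "kubernetes" gets a positive score of log 2. That is the sense in which common terms do not dominate.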

06 |AUDIT & TIME

Every fact knows where it came from - and when it was true

Provenance is a stored field on every fact-bearing row. Three independent clocks answer three different questions. Traces are first-class nodes that feed back into ranking. The audit story is queryable substrate state - not a logging pipeline bolted on.

Provenance as data

Every entity, attribute, relation, instruction, and uncertainty row carries a source object: kind (turn, document, reflection, elaboration, consolidation), ref, trust, lexical span, location, and derived_from. Reconciliation compares provenance; calibration weights it; supersession audits it - down to the bytes.

Three independent clocks

System time (the MVCC byte-level audit clock - 'what did the substrate look like at instant T?'), known time (when Spectron first believed a fact), and valid time (when the assertion held in the world). Different questions, all queryable independently via VERSION and as_of.
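
A minimal sketch of how known time and valid time answer different questions, using the dates from the Berlin/Paris example. The `Belief` record and the `as_of` function are hypothetical and are not the VERSION or as_of interfaces named above; system time, which lives in the database's MVCC layer, is not modelled.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Belief:
    value: str
    known_at: date    # known time: when the system first believed it
    valid_from: date  # valid time: when it started to hold in the world


def as_of(beliefs, valid_on: date, known_by: date):
    """What did we believe by `known_by` about the world on `valid_on`?"""
    candidates = [
        b for b in beliefs
        if b.known_at <= known_by and b.valid_from <= valid_on
    ]
    # The most recent valid_from among visible beliefs is the one in force.
    return max(candidates, key=lambda b: b.valid_from, default=None)


beliefs = [
    Belief("city:berlin", known_at=date(2026, 3, 15), valid_from=date(2026, 3, 15)),
    Belief("city:paris", known_at=date(2026, 4, 12), valid_from=date(2026, 4, 1)),
]

# Same day in the world, two different states of knowledge.
before_told = as_of(beliefs, valid_on=date(2026, 4, 5), known_by=date(2026, 4, 5))
after_told = as_of(beliefs, valid_on=date(2026, 4, 5), known_by=date(2026, 4, 12))
```

On 5 April the system still believed Berlin, because the move was only reported on 12 April; asked again with later knowledge, the same valid date resolves to Paris.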

Traces are memory

retrieval_trace, decision_trace, and response_trace are first-class graph nodes, not external observability data. The unified ranker reads its own history; supersession lineage downgrades source trust; any answer walks back to the bytes that produced it.

Supersede, don't delete

Memories are not overwritten - they are superseded with valid_until set, or aged out with a reason recorded. The history of how a fact changed is queryable. forget is an explicit first-class verb, distinct from natural aging, with a --purge option for hard removal.

07 |AUTONOMOUS UNDERSTANDING

Memory that improves between conversations

Three named mechanisms generate new memory at different times, all flowing through the same reconciler with explicit provenance. Spectron does not just store what agents tell it - the substrate deepens its own understanding between interactions.

Reflection

On-demand synthesis. POST /reflect runs an LLM pass over retrieved context and optionally persists the synthesised answer as new facts - with their own provenance kind and a lower default trust, so calibration stays honest.

Elaboration

Background sweep. A job walks the substrate looking for entities and attributes that share context but no explicit relation. An LLM proposes the link; the reconciler accepts, supersedes, or surfaces it as uncertainty.

Consolidation

Belief crystallisation. An async job pools recent facts and decides to create, update (delta recorded), or mark superseded. Each observation tracks its derived inputs and proof count, so the evolution of belief is replayable.

08 |HYBRID RETRIEVAL

Eight signals fused into one auditable ranking

Embeddings-only retrieval has known failure modes - near-duplicates dominate top-k, rare-term queries miss the right chunk, structure between facts is invisible. Spectron's structural index is built at write time and read cheaply on every retrieval. Per-feature scores ride on every trace, so any result is auditable as a weighted combination of signals - not a black-box top-k.

Vector recall

Dense embeddings on entities, attributes and chunks - plus modality-native vectors for images and audio when enabled.

Lexical (BM25)

BM25 over chunk text and entity surface forms catches the exact-phrase and rare-term queries embeddings underweight.

Graph traversal

One or two hops from a seed entity surface related facts even when surface forms diverge. The graph is the substrate, not a sidecar.

Keyword bridges

RAKE-derived keyword nodes connect chunks and entities that share rare or discriminative terms but sit far apart in embedding space. PMI-scored so common terms don't dominate.

Document links

Section embeddings, chunk-to-chunk and document-to-document edges surface related sections of related documents - not only the single closest chunk.

Personalised PageRank

Graph-walk scoring biased toward the query's seed nodes - a structural feature in the fused ranker, not a separate pipeline.
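
For readers unfamiliar with the technique, here is a textbook power-iteration sketch of Personalised PageRank on a tiny adjacency list. The graph, node names and parameter defaults are illustrative assumptions; this is the general algorithm, not Spectron's implementation.

```python
def personalised_pagerank(graph, seeds, alpha=0.15, iterations=50):
    """graph: {node: [neighbours]}, every neighbour must also be a key.

    seeds: set of nodes the random walk restarts at with probability alpha.
    """
    nodes = list(graph)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: alpha * restart[n] for n in nodes}
        for n in nodes:
            out = graph[n]
            if out:
                share = (1 - alpha) * rank[n] / len(out)
                for m in out:
                    nxt[m] += share
            else:
                # Dangling node: return its mass to the seed nodes.
                for s in seeds:
                    nxt[s] += (1 - alpha) * rank[n] / len(seeds)
        rank = nxt
    return rank


graph = {
    "entity:emma": ["city:paris", "org:acme"],
    "city:paris": ["entity:emma"],
    "org:acme": ["entity:emma", "entity:beth"],
    "entity:beth": ["org:acme"],
    "city:tokyo": ["entity:kenji"],
    "entity:kenji": ["city:tokyo"],
}
scores = personalised_pagerank(graph, seeds={"entity:emma"})
```

Nodes reachable from the seed receive score in proportion to how often the restarting walk visits them; the disconnected Tokyo component receives none, which is the "biased toward the query's seed nodes" behaviour.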

Geographic recall

Within-radius, inside-polygon and nearest-k spatial predicates compose with semantic and graph signals in the same ranker. 'Discussions about Acme within 50km of Berlin in Q3' is one query, not three pipelines.

Trace-derived features

Retrieval reads its own history. Rows useful for similar queries get boosted; rows associated with corrections get demoted. The ranker learns from the trace graph in place.
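
One way to picture "auditable as a weighted combination of signals" is a linear fusion that keeps each feature's contribution next to the final score. The weights, feature names and `fuse` function below are assumptions for illustration, with feature scores taken to be already normalised to the range 0 to 1; the actual ranker is not specified here.

```python
# Illustrative weights only; real weights would be tuned or learned.
WEIGHTS = {"vector": 0.4, "bm25": 0.3, "graph": 0.2, "trace": 0.1}


def fuse(candidates, weights=WEIGHTS):
    """candidates: {row_id: {feature: score}}. Returns rows sorted by fused score."""
    ranked = []
    for row_id, features in candidates.items():
        # Missing features contribute zero rather than raising.
        contributions = {name: w * features.get(name, 0.0) for name, w in weights.items()}
        ranked.append({
            "id": row_id,
            "score": sum(contributions.values()),
            "contributions": contributions,  # what a trace would record
        })
    ranked.sort(key=lambda r: r["score"], reverse=True)
    return ranked


candidates = {
    "chunk:a": {"vector": 0.9, "bm25": 0.1},
    "chunk:b": {"vector": 0.6, "bm25": 0.8, "graph": 0.5},
}
ranking = fuse(candidates)
```

Here `chunk:b` outranks `chunk:a` despite a lower vector score, and the `contributions` dictionary shows exactly which signals made the difference.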

09 |TIERED QUERIES

Cheap questions are cheap. Expensive questions are explicit.

Most memory layers run a vector search on every request. Spectron does not. Reads route through a four-tier cost and latency ladder gated by query understanding at the front. Every tier emits a trace recording which path was taken, why, and what it returned - so the cost story is observable per Context, not buried in a flat per-request average.

Tier 1 · Direct lookup

Typed questions - 'what is my role at Acme?', 'when did I join?' - fetched directly from the entity and attribute graph by key. No embeddings, no LLM, no ranking. Sub-millisecond.

Tier 2 · Response reuse

Semantic match against prior responses with entity-aware invalidation. The cache key is the query plus the set of facts the prior answer cited - when any cited fact is superseded, every dependent response is invalidated. Tens of milliseconds. No LLM call.
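
The invalidation rule can be sketched with a reverse index from fact IDs to the responses that cited them. This simplified version assumes exact-match query keys instead of semantic matching, and the `ResponseCache` class and its method names are hypothetical.

```python
class ResponseCache:
    """Hypothetical response cache with fact-dependency invalidation."""

    def __init__(self):
        self._answers = {}     # query -> (answer, frozenset of cited fact ids)
        self._dependents = {}  # fact id -> set of queries that cited it

    def put(self, query, answer, cited_facts):
        cited = frozenset(cited_facts)
        self._answers[query] = (answer, cited)
        for fact_id in cited:
            self._dependents.setdefault(fact_id, set()).add(query)

    def get(self, query):
        hit = self._answers.get(query)
        return hit[0] if hit else None

    def on_superseded(self, fact_id):
        """Drop every cached response that cited the superseded fact."""
        for query in self._dependents.pop(fact_id, set()):
            self._answers.pop(query, None)


cache = ResponseCache()
cache.put("where does emma live?", "Berlin", ["fact:1"])
cache.put("who is emma's employer?", "Acme", ["fact:2"])
cache.on_superseded("fact:1")  # the Berlin fact is superseded
```

After the supersession, the residence answer is gone from the cache while the unrelated employer answer is still served.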

Tier 3 · Hybrid retrieval

Vector, BM25, graph traversal, keyword bridges, section embeddings, Personalised PageRank, geographic recall and trace-derived features fused into a single ranking, then synthesised by the LLM. Hundreds of milliseconds. Auditable per-feature scores ride on every retrieval trace.

Tier 4 · Full-context fallback

When hybrid retrieval is thin or low-confidence, fall back to a broader sweep: more candidates, deeper graph traversal, optionally a HyDE-style query rewrite, longer context window. Higher token cost and latency - explicit as the tier-4 escalation, not buried in a flat per-request average.
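
The ladder can be sketched as an ordered list of handlers, each returning a result or declining, with the path recorded in a trace. This is a deliberately simple fall-through; the query-understanding gate at the front is not modelled, and the handlers, tier names and `PROFILE` data are placeholders invented for the example.

```python
def answer(query, tiers):
    """tiers: ordered list of (name, handler); a handler returns a result or None."""
    trace = {"query": query, "attempted": []}
    for name, handler in tiers:
        trace["attempted"].append(name)
        result = handler(query)
        if result is not None:
            trace["tier"] = name  # the tier that produced the answer
            return result, trace
    trace["tier"] = None
    return None, trace


PROFILE = {"role at acme": "Staff engineer"}  # stand-in for the attribute graph

tiers = [
    ("direct_lookup", lambda q: PROFILE.get(q)),
    ("response_reuse", lambda q: None),  # always a cache miss in this sketch
    ("hybrid_retrieval", lambda q: f"retrieved answer for {q!r}" if "acme" in q else None),
    ("full_context", lambda q: f"fallback answer for {q!r}"),
]

cheap, cheap_trace = answer("role at acme", tiers)
costly, costly_trace = answer("summarise everything from last year", tiers)
```

The first query stops at the first tier; the second walks all four, and its trace makes the escalation visible instead of averaging it away.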

10 |IN PRODUCTION

Built for the second deployment, not the first demo

Calibration, scope isolation, observability, and security are stored substrate state - not configuration claimed in a README. Auditors verify the same way operators debug: by reading the graph.

Calibration

Two numeric quantities, both queryable. source.trust is the source prior (admin document > user assertion > reflection > elaboration). attribute.confidence is the reconciler posterior. Low-confidence facts never auto-supersede high-confidence ones; contradictions surface as uncertainty rows the operator can act on.
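
A minimal sketch of the supersede-or-surface decision, assuming a single confidence number per fact and a configurable floor. The `Fact` record, the action names and the exact comparison rule are illustrative assumptions; source trust is left out to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    subject: str
    attribute: str
    value: str
    confidence: float  # reconciler posterior in [0, 1]


def reconcile(existing, incoming, floor=0.5):
    """Return (action, rows) for one incoming fact against the current belief."""
    if existing is None:
        return "create", [incoming]
    if existing.value == incoming.value:
        return "corroborate", [existing]
    if incoming.confidence < floor or incoming.confidence < existing.confidence:
        # Refuse to supersede; keep both sides as an explicit uncertainty.
        return "uncertainty", [existing, incoming]
    return "supersede", [incoming]


old = Fact("user:emma", "lives_in", "city:berlin", 0.9)
strong = Fact("user:emma", "lives_in", "city:paris", 0.95)
weak = Fact("user:emma", "lives_in", "city:paris", 0.3)
```

With these inputs, `reconcile(old, strong)` supersedes, while `reconcile(old, weak)` returns an uncertainty carrying both rows, so the contradiction stays visible to an operator.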

Scope separation

Each Context is its own SurrealDB namespace and database - cross-Context reads are not expressible in the API. Within a Context, every fact lives at a scope - a hierarchical path like org=acme/team=support - and principals carry per-verb grants (read, write, forget) over those paths, deny-by-default. An API key or on-behalf-of delegation can only narrow a principal's grant, never widen it, and a grant can carry a geographic predicate - so a fleet of agents can be partitioned by territory at the substrate level.
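
The grant rules within a Context can be sketched as prefix matching over scope paths. The `Grant` record and the `covers`, `allowed` and `narrow` helpers are hypothetical, and geographic predicates are omitted; the sketch only shows deny-by-default checks and narrowing that cannot widen access.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    verb: str  # "read", "write" or "forget"
    path: str  # e.g. "org=acme/team=support"


def covers(grant_path: str, scope: str) -> bool:
    """A grant covers a scope when it is the same path or an ancestor of it."""
    g, s = grant_path.split("/"), scope.split("/")
    return s[:len(g)] == g


def allowed(grants, verb, scope):
    # Deny by default: access needs an explicit matching grant.
    return any(g.verb == verb and covers(g.path, scope) for g in grants)


def narrow(principal_grants, requested):
    """Grants for an API key: keep only what the principal already holds."""
    return [r for r in requested if allowed(principal_grants, r.verb, r.path)]


principal = [Grant("read", "org=acme"), Grant("write", "org=acme/team=support")]
key = narrow(principal, [
    Grant("read", "org=acme/team=support"),  # narrower than the principal: kept
    Grant("forget", "org=acme"),             # never granted: dropped
])
```

The resulting key can read only under `org=acme/team=support`, and the `forget` request is discarded because the principal never held that verb.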

Observability

Cost, cache-hit rate, contradiction rate, supersession churn, source distribution, and retrieval quality are all derivable from the trace graph - no external metrics pipeline required. spectron entities show, spectron inspect trace, and spectron fsck make the substrate browsable as data, not as logs.

Security and privacy

At-rest encryption on the substrate, KMS-encrypted object storage, TLS 1.3 (mTLS optional) on every API. Two-layer prompt-injection scanning - ingest-time and substrate-wide. forget is a first-class verb: removes derived rows, originating turns, and the object-store original by content hash, with --purge for hard removal.

MODEL CONFIGURATION

Five model hooks, mixed and matched per Context

The right model for extraction is rarely the right model for synthesis, and neither is the right model for embedding. Each hook is an independent knob - configurable per Context, overridable per call, and recorded on every trace - so cost and quality slice per model out of the box. Anything OpenAI-compatible, Anthropic, Google, or local inference. Air-gapped deployments stay air-gapped.

Extraction

Turns and chunks into typed entities, attributes, relations. Usually a fast model with structured-output support.

Embedding

Vectors for entities, attributes, and chunks. Optional modality-native embedders for images, audio, and video.

Reconciliation

Optional LLM for entity matching when surface forms diverge and structural signals are inconclusive.

Synthesis

The model behind /chat and /reflect. Usually the most capable in the deployment; reasoning models fit naturally here.

Background

Elaboration and consolidation passes that propose relations and crystallise beliefs. Typically a cheaper model than synthesis.

INTEGRATIONS

MCP, generated SDKs, and harness adapters

One OpenAPI specification is the source of truth. The Spectron binary speaks Model Context Protocol natively, ships generated clients in four languages, and offers thin adapters for the harnesses that don't natively speak MCP - so clients cannot drift from the server.

MCP server, in the binary

Seven tools - remember, recall, context, reflect, forget, upload, inspect - wrapping the REST handlers over HTTP. Shared auth middleware; every response includes the trace ID. Smoke-tested against Claude Desktop, Cursor and Claude Code, with broader MCP client support landing during the early preview.

Generated SDKs

Python (pip install surrealdb[spectron]), TypeScript (@surrealdb/spectron), Kotlin (com.surrealdb:spectron) and Swift via SwiftPM. Idiomatic types, streaming helpers for chat and reflect, and built-in retry / idempotency. Regenerated on every release.

Harness adapters

LangChain (SpectronChatMessageHistory), Claude Code Stop hook, OpenAI Agents run callback, Vercel AI SDK middleware around streamText / generateText, plus n8n and Zapier nodes. Conversations flush into the substrate without per-call wiring.

[Diagram: query flow. Natural language query → generate embeddings (provider-pluggable) → vector search over chunks (HNSW · COSINE) and entity search over entities (HNSW · COSINE) → fact chain traversal (entity → relation → entity, 3-depth recursive DFS) → memories + facts.]

11 |BUILT ON SURREALDB

Multi-model, in one ACID transaction

SurrealDB unifies graph, vector, document, relational, and geospatial queries in one engine. The eight pillars, six memory categories, five coherence dimensions, and the trace graph are all expressible as rows and edges in a single transaction - which is what makes them deliverable as a coherent product instead of a stitching exercise.

Multi-model ACID

Entities, attributes, relations, embeddings, chunks, and trace edges commit atomically. No cross-store stitching, no eventual consistency between vector and graph.

Free time-travel

SurrealDB's MVCC layer means SELECT ... VERSION returns the exact substrate state at any past instant - the byte-level audit clock is a database feature, not application code.

Record-level permissions

Each Context is its own namespace and database; per-principal grants, RBAC, and per-record rules apply to memory the same way they apply to application data.

Scale-to-zero substrate

Compute-storage separation means an idle Spectron deployment costs nothing - no minimum cluster, no always-on vector index to pay for between conversations.

Memory branching

Coming soon

Compute-storage separation will let you branch an entire knowledge graph in seconds - a copy-on-write clone of the whole substrate for testing or evaluation, not an export stitched back together across separate stores.

12 |ARCHITECTURE

Middleware on fragments, or memory in the database

Most memory layers stitch two or three stores together - Postgres with pgvector and Neo4j, or DynamoDB with OpenSearch and Pinecone - and inherit the seams: no cross-store transactions, divergent consistency models, separate scaling stories. Memory middleware abstracts over the fragmentation but cannot eliminate it. Spectron removes it.

[Diagram: architecture comparison. Memory middleware: Agent → Middleware (Mem0 / custom) → Vector DB, Graph DB, Relational DB; no ACID, multiple round trips. Spectron: Agent → Spectron → SurrealDB (vectors, graph, documents, temporal); ACID, single round trip.]

Middleware on fragments

Sits on top of separate vector, graph, and relational databases. Cannot guarantee consistency between stores. Cannot scale to zero. Every layer adds latency and failure modes.

Spectron: memory in the database

Built directly on SurrealDB. One ACID transaction per memory write. Inherits compute-storage separation from the storage engine - scale down to zero when idle. No middleware tax.

Consistency

Middleware cannot enforce transactions across its backend databases. Spectron writes entities, facts, embeddings, and graph edges in one ACID transaction.

Scale to zero

Middleware backends run continuously. Spectron inherits SurrealDB's compute-storage separation - idle memory costs nothing.

No extra infra

Middleware requires you to provision and operate separate databases. Spectron runs on the same SurrealDB you already use for application data.

13 |WORKED EXAMPLE

A user changes their mind. Spectron tells the truth.

Two turns of a real conversation and one later query. The old fact is preserved, the new fact supersedes it, and the answer comes back with the trace that proves it.

TURN 1 · 15 MAR 2026

User: "I live in Berlin."

# Spectron writes
fact: user:emma lives_in city:berlin
valid_from: 2026-03-15
trace_to: turn_1

TURN 7 · 12 APR 2026

User: "Actually I moved to Paris at the start of this month."

# Spectron writes
fact: user:emma lives_in city:paris
valid_from: 2026-04-01
trace_to: turn_7
supersedes: prior fact (kept queryable)

QUERY · LATER

Agent: "Where does Emma live?"

# Spectron returns
answer: Paris
source: user_message[turn_7]
valid: from 2026-04-01
history: user_message[turn_1] (superseded)
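
The three steps above can be approximated with a small in-memory ledger. The `Fact` and `Ledger` classes are hypothetical stand-ins for the substrate, written only to show that a correction closes the old row with `valid_until` and links to it instead of deleting it.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Fact:
    subject: str
    verb: str
    obj: str
    valid_from: date
    trace_to: str
    valid_until: Optional[date] = None
    supersedes: Optional["Fact"] = None


class Ledger:
    """Hypothetical append-only fact store with supersession."""

    def __init__(self):
        self.rows = []

    def current(self, subject, verb):
        live = [f for f in self.rows
                if f.subject == subject and f.verb == verb and f.valid_until is None]
        return live[-1] if live else None

    def write(self, fact):
        prior = self.current(fact.subject, fact.verb)
        if prior is not None:
            prior.valid_until = fact.valid_from  # close the old belief, keep the row
            fact.supersedes = prior
        self.rows.append(fact)

    def history(self, subject, verb):
        return [f for f in self.rows if f.subject == subject and f.verb == verb]


ledger = Ledger()
ledger.write(Fact("user:emma", "lives_in", "city:berlin", date(2026, 3, 15), "turn_1"))
ledger.write(Fact("user:emma", "lives_in", "city:paris", date(2026, 4, 1), "turn_7"))
now = ledger.current("user:emma", "lives_in")
```

`now` is the Paris fact traced to `turn_7`, and `now.supersedes` is the Berlin fact, still present in `history` with its `valid_until` set to the date the new fact took effect.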

FOR YOUR TEAM

Two paths into Spectron

One product, two on-ramps. Developers and AI engineers join the waitlist for early preview access; engineering and platform leaders evaluate fit for production with the team.

For developers and AI engineers

One static Rust binary with the MCP server built in - connecting Cursor, Claude Desktop or Claude Code is a single configuration entry. Native SDKs in Python, TypeScript, Kotlin, and Swift, generated from the OpenAPI spec. No Python in the pipeline, no sidecars. Join the waitlist for early preview access.

For engineering leaders and platform teams

ACID writes across graph, vector, document and structured records. Context-level tenant isolation enforced at the engine. Provenance, tri-temporal history, and trace-graph audit baked into every fact. Self-host on SurrealDB, run on SurrealDB Cloud, or deploy in an air-gapped environment with local model inference.

14 |MEASURED · OPEN

Numbers we will publish. Source you can read.

Trust is what gets agents into production. Two ways we earn it: the benchmarks we measure against, and the open-source database every Spectron deployment runs on.

Measured against published benchmarks

Spectron is evaluated on LoCoMo and LongMemEval, the two published conversational-memory benchmarks, alongside StateBench, an in-tree state-tracking suite, plus a document-retrieval regression harness with a factoid-versus-graph reporting split. Each run produces a comparable report against any commit, so ranking and reconciliation changes ship with measured deltas, not anecdotes.

Open at the foundation

SurrealDB, the database engine underneath Spectron, is open source and free to self-host (32.3k GitHub stars and counting). The Spectron memory layer on top is closed source today and ships as a single Rust binary. Our roadmap intent is to upstream foundational parts of the memory model into SurrealDB, so the most fundamental primitives stay open.

Samsung, NVIDIA, Apple, Verizon, Tencent

SOC 2 Type 2

GDPR

Cyber Essentials Plus

ISO 27001

Copyright © 2026 SurrealDB Ltd. Registered in England and Wales. Company no. 13615201

Registered address: 3rd Floor 1 Ashley Road, Altrincham, Cheshire, WA14 2DT, United Kingdom

Trading address: Huckletree Oxford Circus, 213 Oxford Street, London, W1D 2LG, United Kingdom