
The context layer for AI agents

Why AI agents need a unified context layer built on a multi-model database - and why it must live in the database, not above it as middleware.

AI agents fail in production because of context, not models. The infrastructure between enterprise data and the AI model - the context layer - must live in the database, not above it as middleware and not inside an analytical platform as a bolt-on feature. This paper presents the SurrealDB agent infrastructure stack: a vertically integrated system spanning a distributed storage engine providing ACID transactions on cloud object storage, SurrealDB (a multi-model database unifying documents, graphs, vectors, and time-series in a single engine), and Spectron (structured agentic memory with entity extraction, knowledge graphs, and temporal fact tracking). No other product ships this complete vertical. From object storage to agent memory - a single stack, a single transaction boundary.

ABSTRACT

The transition from stateless language model interactions to persistent, autonomous AI agent systems has exposed a fundamental infrastructure gap. Agents fail in production not because models lack reasoning capacity, but because they lack reliable, structured, real-time context. RAG over flat vector stores suffers from semantic fragmentation, temporal blindness, and relevance degradation at scale. Memory middleware that abstracts over multiple specialist databases introduces consistency gaps, latency overhead, and semantic drift between stores.

We argue that the context layer for AI agents must live in the database - not above it as middleware, not inside an analytical platform as a bolt-on feature - and that the database must be architecturally multi-model: documents, graphs, vectors, and time-series as native primitives in a single ACID-compliant transactional engine.

We present the SurrealDB agent infrastructure stack: a vertically integrated system spanning distributed ACID storage on cloud object storage, a multi-model database engine, and a structured agentic memory layer.

THE CONTEXT WALL

Agents fail because of context, not models

Every AI team hits the same wall. The model is capable, but the agent cannot remember what happened two turns ago, cannot connect a user preference to a product entity, and cannot guarantee that concurrent writes stay consistent.

Today's teams assemble five or six systems - a vector database, a document store, a graph database, a cache, a queue, and a memory middleware layer - then spend months writing glue code to keep them consistent. Every seam is a place where context leaks.

Diagram: data fragments at every system boundary - app state (user session), vector DB (embeddings), graph DB (relations), doc store (documents), auth (identity).

Chart: cumulative latency as a single agent request fans out across the vector DB, graph DB, doc store, and auth systems, each hop adding to the total.

WHY VECTOR STORES FAIL

Flat vector stores destroy context

The first generation of AI data infrastructure was built for a single operation: approximate nearest-neighbour search over embeddings. It fails as the primary data layer for agents because of six structural properties.

Flat storage

Documents are chunked and embedded as isolated vectors. The original structure - hierarchy, sections, cross-references - is destroyed at ingestion.

No relationships

No way to model how entities relate. 'Mercury' the project and 'Mercury' the element occupy the same region of vector space. Disambiguation is impossible.

No temporal awareness

Vector stores return what is similar, not what is current. An agent may retrieve a five-year-old address alongside the current one with no signal indicating which is valid.

Semantic fragmentation

~40% of naively chunked content becomes 'semantically invisible' - isolated from the entities it refers to. Pronouns and implicit references lose their targets.

Vocabulary mismatch

'Why is the app behaving strangely?' and 'Error 503: Service Unavailable' sit far apart in embedding space. Vector retrieval alone cannot bridge intent and fact.

Relevance degradation

Relevance deteriorates as the corpus grows. More documents mean more noise. Without structural signals, there is no way to distinguish 'similar text' from 'relevant context.'
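The temporal-blindness failure above can be sketched in a few lines. The records and embeddings below are illustrative stand-ins, not real model output: two facts about the same user, five years apart, score almost identically under pure cosine similarity, and nothing in the ranking marks the older one as stale.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical store: two facts about the same user, five years apart.
facts = [
    {"text": "address: 12 Old Road",  "year": 2019, "vec": [0.9, 0.1, 0.4]},
    {"text": "address: 7 New Street", "year": 2024, "vec": [0.9, 0.1, 0.5]},
]

query_vec = [0.9, 0.1, 0.45]  # "what is the user's address?"

# Vector-only retrieval: rank purely by similarity; 'year' never enters it.
ranked = sorted(facts, key=lambda f: cosine(query_vec, f["vec"]), reverse=True)
scores = [round(cosine(query_vec, f["vec"]), 3) for f in ranked]

# Both facts score near 1.0 with a negligible gap: the agent has no signal
# telling it which address is current.
print(scores)
```

The point of the sketch is that the recency signal has to come from structural metadata (timestamps, validity ranges), because similarity alone cannot carry it.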

THE FRAGMENTED MEMORY TAX

Five systems. Five failure modes.

The natural response to vector database limitations has been to add systems. This “frankenstack” approach solves individual capability gaps but introduces systemic problems.

Latency of reconstruction

Every time an agent needs to 'remember' something, it makes multiple asynchronous calls to different APIs. In an autonomous agent loop running at millisecond speed, these round-trips compound into seconds.

Semantic drift

A document updates in one store but its embedding in another is not refreshed. Its graph relationship remains stale. The agent reasons over contradictory information.

Loss of dimensionality

A standalone vector search can tell an agent 'Product A is similar to Product B,' but cannot explain why - the relationship type, the business constraint, the version history.

Consistency boundaries

Each system has its own consistency model. The agent's view of the world depends on which system it queries and when - non-deterministic behaviour invisible in development, catastrophic in production.
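Semantic drift and consistency boundaries are easy to reproduce in miniature. The sketch below uses two toy dicts as stand-ins for a document store and a vector store, with a naive dual write and no shared transaction; a simulated outage mid-write leaves the stores disagreeing, and nothing rolls the first write back.

```python
doc_store = {"doc:1": "v1"}
vector_store = {"doc:1": "embedding(v1)"}

def update_both(doc_id, new_text, vector_write_fails=False):
    """Naive dual write across two independent stores: a partial failure
    leaves them contradicting each other, with no rollback."""
    doc_store[doc_id] = new_text
    if vector_write_fails:
        raise ConnectionError("vector store unreachable")  # simulated outage
    vector_store[doc_id] = f"embedding({new_text})"

try:
    update_both("doc:1", "v2", vector_write_fails=True)
except ConnectionError:
    pass  # the application survives, but the stores have diverged

# The document advanced to v2 while the embedding still describes v1:
# the agent's view now depends on which store it happens to query.
print(doc_store["doc:1"], vector_store["doc:1"])
```

In a real frankenstack the divergence is rarely this visible; it surfaces only when the agent reasons over the contradictory pair.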

THE READ-THINK-WRITE LOOP

Data platforms answer “what happened.” The context layer answers “what should happen next.”

An AI agent operates through a continuous cycle: perceive the environment, reason over memory, commit an action or observation back to storage. This loop runs in milliseconds, and it never stops.

Modern data platforms - Snowflake, Databricks, BigQuery - are optimised for analytical throughput: answering questions about the past at scale. Agents don't need to analyse the past; they need to act in the present.

This is why the context layer must live in the database, not above it as middleware. Only at the database layer can context be kept consistent, governed, secured, and transactionally unified with the canonical knowledge that grounds it.
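The read-think-write loop can be sketched as a minimal perceive/reason/commit cycle. SQLite is used here only as a stand-in for any transactional store; the table and column names are illustrative, not a SurrealDB schema.

```python
import sqlite3

# One in-memory store holding the agent's context.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memory (k TEXT PRIMARY KEY, v INTEGER)")
conn.execute("INSERT INTO memory VALUES ('observations', 0)")
conn.commit()

def step(conn):
    """One agent turn: read state, reason over it, write the result back
    atomically. The 'with' block commits on success, rolls back on error."""
    with conn:
        (count,) = conn.execute(
            "SELECT v FROM memory WHERE k = 'observations'"
        ).fetchone()                       # perceive: read current context
        decision = count + 1               # think: (trivially simple) reasoning
        conn.execute(                      # act: commit the observation
            "UPDATE memory SET v = ? WHERE k = 'observations'", (decision,)
        )

for _ in range(3):
    step(conn)

(final,) = conn.execute("SELECT v FROM memory WHERE k = 'observations'").fetchone()
print(final)  # 3
```

Each turn reads and writes inside one transaction, which is the property the loop loses as soon as its state is spread across several systems.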

EVOLUTION

From metadata to context

Semantic layer

Metadata catalogs and data dictionaries. Tables, columns, and lineage - but no graph, no vectors, no temporal reasoning.

Knowledge graph

Graph databases for entities and relationships. Rich structure - but often separate from documents, vectors, and transactional data.

Context layer

Multi-model with temporal awareness. Documents, graphs, vectors, and time-series in one engine. The context layer must live in the database.

WHY THE DATABASE

The context layer must live in the database

ACID transactions

Read-think-write loops need atomicity. When an agent updates memory and state, both must commit or neither does.

Unified permissions

One permission model for documents, graphs, vectors, and memory. No separate auth for each system.

Single deployment

One database, one deployment. No middleware tax, no operational sprawl.

No middleware tax

Context engineering in one query. No round trips between systems, no latency compounding.
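The atomicity claim above ("both must commit or neither does") can be demonstrated with a single transaction wrapping both writes. Again SQLite stands in for the database and the schema is illustrative: a crash between the state update and the memory write rolls both back, so no partial update ever becomes visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE agent_state (k TEXT PRIMARY KEY, v TEXT);
    CREATE TABLE agent_memory (fact TEXT);
    INSERT INTO agent_state VALUES ('phase', 'idle');
""")

def act(conn, fail=False):
    """Update agent state and memory inside one transaction."""
    try:
        with conn:  # single transaction around both writes
            conn.execute("UPDATE agent_state SET v = 'acting' WHERE k = 'phase'")
            if fail:
                raise RuntimeError("crash between the two writes")
            conn.execute("INSERT INTO agent_memory VALUES ('action completed')")
    except RuntimeError:
        pass  # transaction rolled back: no partial update survives

act(conn, fail=True)
phase = conn.execute("SELECT v FROM agent_state WHERE k = 'phase'").fetchone()[0]
facts = conn.execute("SELECT COUNT(*) FROM agent_memory").fetchone()[0]
print(phase, facts)  # still 'idle' with 0 facts: both writes rolled back
```

With two separate systems, the same failure would leave the state updated and the memory missing, which is exactly the gap no middleware layer can close.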

WHAT THE THESIS COVERS

1. The context wall

Why agents fail in production. Flat vector stores destroy context. The fragmented memory tax. The read-think-write loop.

2. The architecture

Four-layer stack: distributed ACID storage on object storage, multi-model database engine, structured agentic memory (Spectron), persistent file memory.

3. Distributed storage

Gen 3 distributed storage. TAPIR consensus, compute-storage separation, scale down to zero, instant branching, 99.999999999% durability.

4. SurrealDB

Multi-model unification. SurrealQL composable queries, schema-as-ontology, rich edges, bi-temporal versioning, built-in auth and APIs.

5. Spectron

Agentic memory layer. Entity extraction, knowledge graph construction, temporal fact tracking, hybrid retrieval, entity disambiguation.

6. Multi-agent coordination

Blackboard pattern, agentic race conditions, ACID snapshot isolation, context engineering in one query.

COMPETITIVE ANALYSIS

Why alternatives fall short

The thesis examines why each competing approach is architecturally insufficient for production agent systems.

THE THESIS

From object storage to agent memory. A single stack. A single transaction boundary.

Multi-model unification

Documents, graphs, vectors, time-series, and relational data as native primitives in a single engine, queryable in a single statement, consistent in a single ACID transaction.

Structured memory

Entity extraction, knowledge graph construction, and temporal fact tracking operating over the unified data layer, not as a separate system.

Gen 3 storage

Compute-storage separation, commodity object storage, horizontal scaling, fault tolerance, and disaster recovery as architectural properties.

Production infrastructure

Authentication, access control, API endpoints, real-time subscriptions, and schema enforcement built into the database.

MEMORY LIFECYCLE

What to remember. What to forget.

Production memory systems must manage the full lifecycle of knowledge - not just ingestion and retrieval, but also decay and eviction. Infinite memory retention creates retrieval noise, stale fact contamination, and compounding cost.

Some systems approach this with probabilistic decay functions that automatically prune memories based on age and access frequency. This risks discarding information that appears low-value but is contextually critical. A rarely accessed medical allergy is more important than a frequently accessed coffee preference - but a frequency-based decay function treats them inversely.

SurrealDB's approach is deterministic. Facts carry explicit temporal metadata. Queries filter by time range and validity. The application - or a governance policy defined in the schema - decides what is retained and what is evicted, based on business rules rather than probabilistic heuristics.
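The contrast between probabilistic decay and deterministic retention can be made concrete. The field names below are illustrative, not a real SurrealDB or Spectron schema: each fact carries explicit validity metadata, and retention is a rule over that metadata rather than an access-frequency score.

```python
from datetime import date

facts = [
    {"text": "allergic to penicillin", "valid_from": date(2015, 1, 1),
     "valid_to": None, "access_count": 2, "critical": True},
    {"text": "prefers oat-milk lattes", "valid_from": date(2024, 3, 1),
     "valid_to": None, "access_count": 500, "critical": False},
    {"text": "address: 12 Old Road", "valid_from": date(2019, 1, 1),
     "valid_to": date(2023, 6, 1), "access_count": 40, "critical": False},
]

def decay_score(f):
    """Frequency-based heuristic: the rarely accessed allergy scores far
    below the coffee preference, so it would be pruned first."""
    return f["access_count"]

def retained(f, today=date(2025, 1, 1)):
    """Deterministic policy: keep facts still valid today, plus anything
    a business rule flags as critical, regardless of access frequency."""
    still_valid = f["valid_to"] is None or f["valid_to"] > today
    return still_valid or f["critical"]

kept = [f["text"] for f in facts if retained(f)]
# The allergy survives despite its low access count; the expired address
# is evicted by its validity range, not by a decay curve.
print(kept)
```

The eviction decision is reproducible: run the same policy twice and the same facts survive, which is not true of a probabilistic decay function.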

ATOMIC CONTEXT

Context must be atomic because agents make decisions in real time

The central architectural claim of this thesis is that context for AI agents must be atomic - a single memory is not a collection of pointers across different systems but a unified record within a single transactional boundary.

In SurrealDB, a graph edge is a document. It can hold the weight of a relationship, the timestamp of the last interaction, the vector embedding of the context, and arbitrary metadata - all retrievable in a single query. No amount of orchestration middleware can make multiple databases behave as a single ACID-compliant transactional system.

If the context is inconsistent - if the graph says one thing and the vector store says another - the agent's reasoning is compromised. ACID transactions are the only mechanism that guarantees a consistent view across all data models at query time.
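The edge-as-document idea can be sketched as a single record carrying every facet of a relationship. The shape below is illustrative, not SurrealDB's actual record format: one lookup returns weight, timestamp, embedding, and metadata together, with no second query to a vector store or metadata service.

```python
from datetime import datetime

# A graph edge modelled as a full record rather than a bare pointer.
edges = {
    ("user:alice", "likes", "product:42"): {
        "weight": 0.87,                                 # relationship strength
        "last_interaction": datetime(2025, 1, 15, 9, 30),
        "embedding": [0.12, 0.88, 0.45],                # context vector
        "metadata": {"channel": "mobile", "campaign": "winter-sale"},
    }
}

# One lookup yields every facet of the relationship at once.
edge = edges[("user:alice", "likes", "product:42")]
print(edge["weight"], edge["metadata"]["channel"])
```

Because all facets live in one record, there is a single source of truth to keep consistent, rather than a pointer fanning out to three systems.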

GET STARTED

Build on the context layer

From object storage to agent memory. A single stack with a single transaction boundary.