Why agentic memory?

An agent without durable memory cannot carry anything from one conversation to the next. Every new session starts cold: the agent cannot personalise responses, cannot build on prior decisions, and cannot recognise that it has been asked the same question before. This page explains why memory matters and why the most common shortcuts are insufficient.

The bar: how people remember

People rarely “search their transcript”. They associate (“cat” pulls up pets, lions, teams, stories) and they give beliefs timestamps and scope (“I have one now”, “I saw one yesterday”, “they weigh about four kilos in general”, “I used to, years ago”). Spectron is built to emulate that shape of memory for agents: a graph of entities and relations for association, six experiential categories for what kind of thing was learnt, and tri-temporal fields so present, past, and general facts do not collapse into one undated chunk. The implementation is precise – provenance, scopes, reconcilers – but the intent is approachable: make agent memory feel less alien and more like something you could explain over coffee.

What happens without memory

Context loss across sessions

Each session exists in isolation. The moment it ends, everything learned is discarded. If a user told the agent their preferred time zone last Tuesday, the agent will ask again on Wednesday. If a support agent resolved a billing issue in March, the agent in April knows nothing about it. The user experience is that of speaking to a system that never listens.

Within a session, context window management compounds the problem. As conversations grow, earlier turns are truncated or summarised to fit within the model's token limit. Even intra-session memory degrades over time.

No personalisation

Personalisation requires knowing things about the person: their role, their preferences, their history with the product, their current project. Without persistent memory, every response must be generic. The agent cannot adapt tone, skip explanations the user has already heard, or surface information relevant to their specific situation.

No learning from past interactions

When an agent makes a mistake – misunderstands a user's intent, applies the wrong policy, gives outdated information – there is no durable record of the correction, so the same failure can recur in the next session.

For agentic workflows that run autonomously over long periods, this is especially costly. An agent executing a multi-step research task cannot resume where it left off if the session is interrupted. An agent coordinating with other agents cannot rely on a shared understanding of what has already been done.

Why vector stores alone are insufficient

The standard response to the memory problem is to embed past interactions and retrieve similar chunks at query time. This is better than nothing, but it has structural limitations that prevent it from serving as a genuine memory layer.

No structure, no verifiability

With vector stores alone, memory is text chunks. Retrieval returns the chunks most similar to the query. There is no entity model – no concept of "user", "project", or "preference". There is no way to ask "what is Christian's current role?" and get a direct answer; instead, you get chunks that mention Christian and hope the right one surfaces.

With vector stores alone, you cannot inspect what the system "knows" in any meaningful sense. The memory is an opaque cloud of embeddings. There is no way to verify correctness without re-running queries and examining outputs.

No provenance

Which conversation produced a given chunk? When was it captured? Has the underlying fact since been corrected? A vector store alone has no answers to these questions. Retrieved context may be outdated, contradictory, or sourced from an unreliable turn – and there is no way to know.
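
As a rough illustration of what carrying provenance on each record could look like, the sketch below attaches a small source descriptor to a stored statement. The field names `kind`, `ref`, and `trust` echo the `source` object named in the table further down this page; their types, the example values, and the `Record` and `describe_origin` names are assumptions made for illustration, not Spectron's schema or API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Source:
    """Where a record came from. Types and meanings here are assumptions."""
    kind: str     # e.g. "conversation" or "document" (assumed values)
    ref: str      # identifier of the originating session or file (assumed)
    trust: float  # 0.0-1.0 confidence in the source (assumed scale)


@dataclass(frozen=True)
class Record:
    """A stored statement plus the provenance a bare chunk would lack."""
    text: str
    captured_at: datetime
    source: Source


def describe_origin(record: Record) -> str:
    """Answer 'which conversation produced this, and when?' for one record."""
    return (
        f"{record.source.kind}:{record.source.ref} "
        f"captured {record.captured_at.date().isoformat()} "
        f"(trust {record.source.trust:.1f})"
    )


# Illustrative data only.
record = Record(
    text="Prefers replies in UTC+1",
    captured_at=datetime(2025, 3, 4, 9, 30, tzinfo=timezone.utc),
    source=Source(kind="conversation", ref="session-0042", trust=0.9),
)
origin = describe_origin(record)
```

With a descriptor like this on every record, the questions above become lookups on stored fields rather than guesses about an anonymous chunk.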

No correction tracking

With vector stores alone, when a user corrects the agent – "actually, I switched roles in January" – the store cannot reconcile this with prior data. Both the old and new statements exist as equal-weight chunks. Future retrieval may return either one, or both, leaving the model to guess which is current. There is no supersession mechanism, no temporal ordering of facts, no way to mark old information as invalid.
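
A minimal sketch of what a supersession mechanism could look like, assuming a hypothetical in-memory `FactStore`. The class, its methods, and the role values are invented for illustration and are not Spectron's API. The point is that a correction does not sit beside the old statement as an equal: it marks the old one as superseded, so exactly one value is current while the history stays inspectable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fact:
    """One statement about a subject. All names here are hypothetical."""
    subject: str
    attribute: str
    value: str
    superseded_by: Optional[int] = None  # index of the fact that replaced this one


class FactStore:
    def __init__(self) -> None:
        self.facts: list[Fact] = []

    def assert_fact(self, subject: str, attribute: str, value: str) -> int:
        """Record a fact and mark any current fact for the same key as superseded."""
        new_index = len(self.facts)
        for fact in self.facts:
            same_key = (fact.subject, fact.attribute) == (subject, attribute)
            if same_key and fact.superseded_by is None:
                fact.superseded_by = new_index
        self.facts.append(Fact(subject, attribute, value))
        return new_index

    def current(self, subject: str, attribute: str) -> Optional[str]:
        """Return the single non-superseded value, or None if nothing is known."""
        for fact in self.facts:
            same_key = (fact.subject, fact.attribute) == (subject, attribute)
            if same_key and fact.superseded_by is None:
                return fact.value
        return None

    def history(self, subject: str, attribute: str) -> list[str]:
        """Return every value ever recorded for the key, oldest first."""
        return [
            f.value
            for f in self.facts
            if (f.subject, f.attribute) == (subject, attribute)
        ]


store = FactStore()
store.assert_fact("christian", "role", "support engineer")  # made-up earlier value
store.assert_fact("christian", "role", "product manager")   # the correction
```

Here `store.current("christian", "role")` yields only the corrected value, while `store.history(...)` still shows both, which is the distinction a bag of equal-weight chunks cannot express.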

No temporal validity

With vector stores alone, stored facts carry no lifespan. A user's current project is not the same as their project six months ago. A pricing policy changes. An employee changes teams. Without temporal validity on stored facts – `valid_from` and `valid_until` – memory accumulates stale data with no expiry and no way to tell current facts from historical ones at retrieval time.
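
To make the idea concrete, here is a small sketch of filtering facts by a validity interval. The `valid_from` and `valid_until` field names come from the text above; the `TimedFact` class, the `valid_at` helper, the half-open interval convention, and the sample data are all assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TimedFact:
    """A fact with a validity interval. The class itself is hypothetical."""
    subject: str
    attribute: str
    value: str
    valid_from: date
    valid_until: Optional[date] = None  # None means "still valid"


def valid_at(facts: list[TimedFact], subject: str, attribute: str, when: date) -> list[str]:
    """Return values whose interval [valid_from, valid_until) contains `when`."""
    return [
        f.value
        for f in facts
        if f.subject == subject
        and f.attribute == attribute
        and f.valid_from <= when
        and (f.valid_until is None or when < f.valid_until)
    ]


# Illustrative data only: an employee who changed teams on 1 July 2024.
facts = [
    TimedFact("dana", "team", "billing", date(2023, 1, 9), date(2024, 7, 1)),
    TimedFact("dana", "team", "platform", date(2024, 7, 1)),
]

past = valid_at(facts, "dana", "team", date(2024, 3, 15))
present = valid_at(facts, "dana", "team", date(2025, 1, 10))
```

The same stored data answers "which team was Dana on in March 2024?" and "which team is Dana on now?" differently, which is what separating current from historical facts at retrieval time requires.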

Retrieval is a guess

Semantic similarity is not the same as relevance, and with vector stores alone it is the only ranking signal available. A chunk retrieved because it is textually similar to a query may not be factually relevant. High-similarity matches can be coincidental; low-similarity matches may be exactly what is needed. Ranking by cosine distance alone is not a reliable retrieval strategy for factual queries.
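
The toy example below shows the failure mode with the simplest possible stand-in for an embedding: a bag-of-words vector. This is an assumption made to keep the sketch self-contained; learned embedding models score text differently, but a ranking computed from text alone shares the same blind spot, because it has no access to which statement is current. The query and chunk texts are invented.

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Crude stand-in for an embedding: lowercase token counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


query = "what is the current role of christian"
chunks = {
    # Outdated statement that happens to share most of the query's wording.
    "stale": "the current role of christian is support engineer",
    # The correction, phrased the way a user might actually say it.
    "correction": "actually christian switched to product manager in january",
}

query_vec = bag_of_words(query)
scores = {name: cosine(query_vec, bag_of_words(text)) for name, text in chunks.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this sketch the stale chunk ranks first because it echoes the query's wording, while the correction shares only one token with it. Nothing in the similarity score reflects that the second statement replaced the first.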

What structured memory provides

Spectron addresses each of these gaps:

Problem	Spectron's approach
No structure	Extracted entities, attributes, and relations stored as a queryable graph in SurrealDB
No provenance	Every record carries a `source` object (kind, ref, spans, trust, derivation); see Provenance and traceability
No correction tracking	Supersession chains plus explicit `uncertainty` for cross-provenance clashes
No temporal validity	Tri-temporal model (system, known, and valid time)
No verifiability	Traces (`retrieval_trace`, `decision_trace`, `response_trace`) as substrate nodes, not disposable logs
Unreliable retrieval	Hybrid structural retrieval plus tiered resolution

The result is a memory layer you can trust: correctness can be demonstrated, not just assumed. See The accuracy promise for how that works in practice.

Agent memory is not a cache

Many “memory” products are really caches: store an output, retrieve something similar later. A cache can tell you what was returned last time; it cannot reliably tell you what the agent believed at the time, why that belief changed, or which source contradicted which.

Spectron is built as state management first and retrieval second. Embedding-and-ranking finds candidates; supersession, provenance, and traces keep a defensible current view of the world as it changes. That is the harder problem most teams skip until something breaks in production.

Long sessions and context limits

Even within a single session, context windows force truncation or compaction. If durable memory only lives in the transcript, anything not carried forward in the summary is gone when the session ends.

Spectron extracts and reconciles important facts into structured memory as turns arrive, runs consolidation and elaboration between interactions, and keeps the full episodic record citeable via provenance, so you are not betting everything on one compaction at the right moment. See Supersession, decay, and forget.