When you hear “cat”, you do not run one search string – you blend what it reminds you of (pets, lions, a team logo), exact words you once read, how things connect, and whether a fact is still true. Retrieval in Spectron is built the same way: several signals fused on purpose, not a single embedding score pretending to be understanding.
Five coherence dimensions
Memory is coherent along five axes at once – Spectron stores enough metadata to answer questions on each, so retrieval stays auditable and trustworthy:
| Dimension | What it gives you |
|---|---|
| Semantic | Similarity before structure is explicit: embedding-based recall over entities and passages. |
| Lexical | What was actually said or shown, down to character positions in the source: extracted attributes carry source.span into the originating turn or document passage. Citations are a stored field, not best-effort prose. |
| Relational | Understanding as connections: one entity/relation graph so “cat” can reach a manual, a prior turn, and a related entity (lion, pet, breed) without treating them as unrelated chunks. |
| Time | What held when, and how beliefs evolved: valid_from / valid_until, as_of, and time-travel queries. See Tri-temporal model. |
| Space | Where a fact was captured or applies – optional geometry; geo filters compose with semantic and graph signals in the same ranker. |
Vector-only approaches tend to miss several of these dimensions at once, and unstructured stores miss them unless you add structure afterwards. Spectron stores the metadata up front.
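To make the lexical, time, and space dimensions concrete, here is a minimal sketch of what a stored fact might carry. It is not Spectron's actual schema: the class names, the `location` field, and the helpers `holds_as_of` and `cite` are assumptions for illustration; only `source.span`, `valid_from`, `valid_until`, and `as_of` come from the table above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class SourceSpan:
    source_id: str  # hypothetical id of the originating turn or passage
    start: int      # character offset, inclusive
    end: int        # character offset, exclusive


@dataclass(frozen=True)
class Fact:
    entity: str
    attribute: str
    value: str
    span: SourceSpan                                # lexical: where it was said
    valid_from: datetime                            # time: when it started to hold
    valid_until: Optional[datetime] = None          # None means still current
    location: Optional[Tuple[float, float]] = None  # space: assumed (lat, lon)


def holds_as_of(fact: Fact, as_of: datetime) -> bool:
    """Return True if the fact was valid at the given instant."""
    if as_of < fact.valid_from:
        return False
    return fact.valid_until is None or as_of < fact.valid_until


def cite(fact: Fact, sources: Dict[str, str]) -> str:
    """Return the exact source text the fact was extracted from."""
    return sources[fact.span.source_id][fact.span.start:fact.span.end]


sources = {"turn-1": "Alice is a staff engineer on the search team."}
old_role = Fact(
    "Alice", "role", "staff engineer",
    SourceSpan("turn-1", 11, 25),
    valid_from=datetime(2023, 1, 1),
    valid_until=datetime(2024, 6, 1),
)
```

Because the span is stored with the fact, `cite(old_role, sources)` recovers the quoted text by slicing rather than by asking a model to paraphrase, and `holds_as_of` answers a time-travel question from the validity interval alone.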
Structural retrieval (beyond embeddings)
Retrieval is hybrid by design. Embeddings are one signal; they are fused with other precomputed structure so top‑k is not a black box.
Typical signals in the fused ranker include:
- Vector recall – dense embeddings on entities, attributes, chunks, and (when enabled) images and audio.
- Lexical recall – BM25 over chunk text and entity names for exact phrases and rare terms.
- Graph traversal – limited hops from seed entities when surface forms differ.
- Keyword bridges – keyword nodes linking distant passages.
- Section embeddings and document links – related sections, not only the single nearest chunk.
- Personalised PageRank – graph-walk scoring biased toward query seeds.
- Geographic recall – radius, polygon, nearest‑k on stored geometry.
- Trace-derived features – prior retrieval outcomes boost what worked; demote what led to corrections.
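As an illustration of the Personalised PageRank signal listed above, the sketch below runs a plain power iteration in which the random walk restarts at the query's seed entities. The function name, the `alpha` restart probability, and the toy graph are assumptions; how Spectron computes this score internally is not described here.

```python
def personalised_pagerank(graph, seeds, alpha=0.15, iterations=50):
    """Power iteration for PageRank with restarts to the seed nodes.

    graph maps every node to a list of its out-neighbours (each node must
    appear as a key); seeds is a non-empty set of nodes; alpha is the
    restart probability.
    """
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in graph}
    scores = dict(restart)
    for _ in range(iterations):
        nxt = {n: alpha * restart[n] for n in graph}
        for node, neighbours in graph.items():
            if not neighbours:
                # Dangling node: hand its remaining mass back to the seeds.
                for n in graph:
                    nxt[n] += (1 - alpha) * scores[node] * restart[n]
                continue
            share = (1 - alpha) * scores[node] / len(neighbours)
            for nb in neighbours:
                nxt[nb] += share
        scores = nxt
    return scores


# Toy entity graph: one cluster around "cat", one unrelated cluster.
graph = {
    "cat": ["lion", "pet"],
    "lion": ["cat"],
    "pet": ["cat", "breed"],
    "breed": ["pet"],
    "invoice": ["tax"],
    "tax": ["invoice"],
}
scores = personalised_pagerank(graph, seeds={"cat"})
```

Entities reachable from the seed receive a score that decays with graph distance, while the unrelated `invoice` and `tax` nodes stay at zero. That is the sense in which the walk is biased toward the query seeds.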
Each /query emits a retrieval_trace recording candidates, per-signal contributions, and the returned set.
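One way to picture fusion with a per-signal trace is reciprocal rank fusion (RRF). The text above does not say which fusion method Spectron uses, so treat RRF, the `fuse` function, the constant `k`, and the trace keys as assumptions chosen for illustration.

```python
from collections import defaultdict


def fuse(rankings, k=60, top_k=3):
    """Fuse per-signal ranked lists with reciprocal rank fusion (RRF).

    rankings maps a signal name to candidate ids, best first. The result
    is a trace-like dict with all candidates, each signal's contribution
    to every candidate, and the returned top_k set.
    """
    contributions = defaultdict(dict)
    for signal, ranked_ids in rankings.items():
        for rank, cand in enumerate(ranked_ids, start=1):
            contributions[cand][signal] = 1.0 / (k + rank)
    totals = {c: sum(parts.values()) for c, parts in contributions.items()}
    # Sort by fused score, breaking ties by id so the order is deterministic.
    ordered = sorted(totals, key=lambda c: (-totals[c], c))
    return {
        "candidates": ordered,
        "contributions": {c: dict(contributions[c]) for c in ordered},
        "returned": ordered[:top_k],
    }


trace = fuse(
    {
        "vector": ["doc-cat-care", "doc-lion", "doc-logo"],
        "lexical": ["doc-cat-care", "doc-logo"],
        "graph": ["doc-lion", "doc-cat-care"],
    },
    top_k=2,
)
```

Here `trace["returned"]` is `["doc-cat-care", "doc-lion"]`, and `trace["contributions"]` shows that `doc-lion` was surfaced by the vector and graph signals but not by the lexical one. A real trace would answer the same kind of question.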
Hands-on retrieval modes are in Hybrid search.
Tiered query resolution
Spectron does not run the same expensive path on every request. Reads route through a four-tier ladder so simple questions consume as few LLM tokens as possible – structured lookup and cache hits avoid building huge prompts or calling the synthesis model when a cheaper path suffices.
| Tier | What happens | Token / cost profile |
|---|---|---|
| 1 – Direct structured lookup | Typed questions resolved from the entity/attribute graph by key – no embeddings, no LLM, no ranking pass. | Minimal tokens – often nothing sent to an LLM. |
| 2 – Response reuse | Match against prior answers in the same Context and scope, with entity-aware invalidation (cited facts must still be current). Returns a prior answer when still valid. | No new generation on a hit – reuses prior synthesis. |
| 3 – Hybrid retrieval and synthesis | Fused retrieval, then LLM synthesis over a bounded context block. | Moderate tokens – default for open questions. |
| 4 – Full-context fallback | Broader sweep when tier 3 is thin: more candidates, deeper graph hops, optional query rewrite, larger context. | Highest token use – explicit escalation, still traceable. |
Tiers cascade: a miss on one tier falls through to the next (for example, a miss on tier 2 falls to tier 3, and a thin tier 3 result escalates to tier 4). Each tier writes retrieval_trace metadata describing which tier ran and why – so you can see where token spend goes and tune per Context.
In short: most “what is Alice’s role?”-style questions should resolve without stuffing the entire memory graph into the model context.
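The ladder can be sketched as a simple cascade. Everything named below is hypothetical: the `resolve` signature, the `(entity, attribute)` key, the `min_passages` threshold for a "thin" result, and the string join that stands in for LLM synthesis. Only the four tiers and their order come from the table above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Key = Tuple[str, str]  # hypothetical (entity, attribute) lookup key


@dataclass
class Resolution:
    answer: str
    tier: int
    trace: List[str] = field(default_factory=list)


def resolve(
    question: str,
    key: Optional[Key],
    structured: Dict[Key, str],
    cache: Dict[str, str],
    is_still_valid: Callable[[str], bool],
    hybrid: Callable[[str], List[str]],
    full_context: Callable[[str], List[str]],
    min_passages: int = 2,  # assumed definition of a "thin" tier 3 result
) -> Resolution:
    trace: List[str] = []

    # Tier 1: direct structured lookup by key, no embeddings or LLM.
    if key is not None and key in structured:
        trace.append("tier 1: structured lookup hit")
        return Resolution(structured[key], 1, trace)
    trace.append("tier 1: miss")

    # Tier 2: reuse a prior answer only if its cited facts are still current.
    cached = cache.get(question)
    if cached is not None and is_still_valid(cached):
        trace.append("tier 2: reused prior answer")
        return Resolution(cached, 2, trace)
    trace.append("tier 2: miss or invalidated")

    # Tier 3: hybrid retrieval; joining passages stands in for LLM synthesis.
    passages = hybrid(question)
    if len(passages) >= min_passages:
        trace.append("tier 3: hybrid retrieval and synthesis")
        return Resolution(" ".join(passages), 3, trace)
    trace.append("tier 3: thin result, escalating")

    # Tier 4: broader sweep with a larger context.
    passages = full_context(question)
    trace.append("tier 4: full-context fallback")
    return Resolution(" ".join(passages), 4, trace)


structured = {("Alice", "role"): "staff engineer"}
direct = resolve(
    "What is Alice's role?", ("Alice", "role"), structured, {},
    lambda answer: True, lambda q: [], lambda q: [],
)
escalated = resolve(
    "Why did the launch slip?", None, structured, {},
    lambda answer: True,
    lambda q: ["one passage"],
    lambda q: ["one passage", "another passage", "a third passage"],
)
```

The role question stops at tier 1 without calling either retrieval function, while the open question walks the whole ladder and records each miss in its trace.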