Extraction pipeline

How Spectron classifies and extracts structured memory from conversation turns.

Every conversation turn Spectron receives passes through an extraction pipeline that turns raw text into structured memory: entities, attributes, relations, instructions, and uncertainties. To balance latency and accuracy, lightweight heuristics run first, and language models are invoked only when a turn needs deeper interpretation.

  1. Heuristics — Pattern matching for known entities, temporal phrases (“since January”, “until next quarter”), and instruction-like language (“always”, “never”, “from now on”). Simple corrections to known facts can finish here without calling a model.

  2. Fast model — Handles most turns: new entities, preferences, straightforward assertions, and routine corrections.

  3. Stronger model — Reserved for harder cases: contradictions within one turn, ambiguous references, or output that fails structural validation.
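To make the heuristics stage more concrete, here is a minimal sketch of pattern matching for temporal phrases and instruction-like language. The regular expressions and the `scan` function are illustrative assumptions, not Spectron's actual rules; a real matcher would need to handle cases such as the causal use of "since".

```python
import re

# Hypothetical patterns, modeled on the example phrases in the list above.
TEMPORAL = re.compile(r"\b(since|until)\s+((?:next|last)\s+)?(\w+)", re.IGNORECASE)
INSTRUCTION = re.compile(r"\b(always|never|from now on)\b", re.IGNORECASE)


def scan(text: str) -> dict:
    """Return the temporal and instruction-like phrases found in a turn."""
    return {
        "temporal": [m.group(0) for m in TEMPORAL.finditer(text)],
        "instruction_like": [m.group(0).lower() for m in INSTRUCTION.finditer(text)],
    }


signals = scan("Since January I lead platform. From now on, never use tables.")
print(signals)
```

A turn that produces no signals at all is a natural candidate for the model-backed stages.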

You do not choose a stage. Spectron escalates automatically when the current stage cannot produce a confident result.
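The escalation behavior can be pictured as a loop over stages that stops at the first confident result. The sketch below is only an illustration: `StageResult`, `run_pipeline`, and the three stub stages are hypothetical names with toy confidence rules, and the real stages call pattern matchers and language models.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StageResult:
    diff: dict       # structured extraction produced by the stage
    confident: bool  # whether the stage commits to this result


Stage = Callable[[str], Optional[StageResult]]

KNOWN_FACTS = {"role": "CTO"}  # toy stand-in for already stored memory
CORRECTION = re.compile(r"^my (\w+) is now (.+?)\.?$", re.IGNORECASE)


def heuristics(text: str) -> Optional[StageResult]:
    # Toy rule: finish here only for a simple correction to a known fact.
    match = CORRECTION.match(text.strip())
    if match and match.group(1).lower() in KNOWN_FACTS:
        key, value = match.group(1).lower(), match.group(2)
        return StageResult({"attributes": [{"key": key, "value": value}]}, True)
    return None


def fast_model(text: str) -> Optional[StageResult]:
    # Stand-in for a model call; toy rule: a self-correction lowers confidence.
    return StageResult({"source": "fast"}, confident="actually" not in text.lower())


def stronger_model(text: str) -> Optional[StageResult]:
    # Stand-in for the stronger model; always commits in this sketch.
    return StageResult({"source": "stronger"}, confident=True)


STAGES = [("heuristics", heuristics), ("fast_model", fast_model),
          ("stronger_model", stronger_model)]


def run_pipeline(text: str, stages=STAGES):
    """Try each stage in order; return (stage_name, diff) of the first confident one."""
    for name, stage in stages:
        result = stage(text)
        if result is not None and result.confident:
            return name, result.diff
    return "unresolved", None


print(run_pipeline("My role is now CEO."))
print(run_pipeline("I'm Alice, CTO at Acme."))
print(run_pipeline("I'm the CEO. Actually, CTO."))
```

The caller only ever invokes `run_pipeline`, which mirrors the point above: stage selection is an internal concern, not a request parameter.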

The pipeline returns a structured diff (also visible in the POST /facts response and session turn diffs):

{
  "entities": [
    { "name": "Alice Chen", "type": "Person", "memory_category": "identity" }
  ],
  "attributes": [
    { "entity": "Alice Chen", "key": "role", "value": "CTO", "memory_category": "identity" }
  ],
  "relations": [
    { "subject": "Alice Chen", "verb": "works_at", "object": "Acme", "memory_category": "identity" }
  ],
  "instructions": [
    { "label": "Bullet-point responses", "description": "Always respond using bullet points" }
  ],
  "uncertainties": [
    { "about": "job title", "reason": "User joked about being CEO, then appeared to correct themselves" }
  ]
}

Each extracted entity, attribute, and relation carries a memory_category — one of identity, knowledge, or context. Invalid values are rejected with 400 Bad Request. Episodic transcript material stays on sessions and turns; instructions and uncertainties are stored separately. See Memory categories.
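If you construct diffs yourself, a client-side check can catch a bad memory_category before the server responds with 400 Bad Request. The following is a minimal sketch, assuming the diff shape shown above; the helper name `invalid_categories` is hypothetical, and treating a missing memory_category as invalid is an assumption rather than documented behavior.

```python
ALLOWED_CATEGORIES = {"identity", "knowledge", "context"}
CATEGORIZED_SECTIONS = ("entities", "attributes", "relations")


def invalid_categories(diff: dict) -> list:
    """List every entity, attribute, or relation whose memory_category is not allowed."""
    problems = []
    for section in CATEGORIZED_SECTIONS:
        for index, item in enumerate(diff.get(section, [])):
            category = item.get("memory_category")  # None when the key is absent
            if category not in ALLOWED_CATEGORIES:
                problems.append(f"{section}[{index}]: {category!r}")
    return problems


diff = {
    "entities": [{"name": "Alice Chen", "type": "Person", "memory_category": "identity"}],
    "attributes": [{"entity": "Alice Chen", "key": "role", "value": "CTO",
                    "memory_category": "episodic"}],  # not one of the three categories
    "instructions": [{"label": "Bullet-point responses"}],  # carries no category
}
print(invalid_categories(diff))
```

Instructions and uncertainties are skipped on purpose, since they are stored separately and carry no memory_category.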

Nothing is written blindly: extractions pass through reconciliation before they become durable memory.

For interactive agents, extraction on a turn completes before the API returns, so the next /query, /state, or /profile call reflects what was just said.

curl -sS "$SPECTRON_URL/api/v1/$SPECTRON_CONTEXT_ID/sessions/$SESSION_ID/turns" \
  -H "API-KEY: $SPECTRON_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "role": "user",
    "content": "I'\''m Alice, CTO at Acme. Always respond in bullet points.",
    "scope": ["org=acme", "user=alice"]
  }'

The response includes the extraction diff and a trace_id for audit.
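The same request can be issued from Python using only the standard library. This is a sketch that mirrors the curl call above; the function names, the placeholder host, and the placeholder identifiers are assumptions for illustration. Only `build_turn_request` runs here, so nothing is sent over the network.

```python
import json
import urllib.request


def build_turn_request(base_url, context_id, session_id, api_key, content, scope):
    """Build (but do not send) the POST request that records a user turn."""
    url = f"{base_url}/api/v1/{context_id}/sessions/{session_id}/turns"
    body = {"role": "user", "content": content, "scope": scope}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def post_turn(request):
    """Send the request and return the parsed JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


request = build_turn_request(
    base_url="https://spectron.example",  # placeholder host
    context_id="ctx-123",                 # placeholder context ID
    session_id="sess-456",                # placeholder session ID
    api_key="YOUR_API_KEY",
    content="I'm Alice, CTO at Acme. Always respond in bullet points.",
    scope=["org=acme", "user=alice"],
)
print(request.get_method(), request.full_url)
```

Calling `post_turn(request)` against a real deployment returns the parsed response. Reading the audit identifier with `.get("trace_id")` assumes it is a top-level key, which this page does not specify.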

If structured extraction cannot be validated, Spectron still retains the turn text. The content remains searchable and can be reprocessed; you are not left with a silent failure. Open uncertainty records flag cases where the pipeline could not commit to a single interpretation — see Instructions and uncertainties.
