Models and providers

Provider keys and per-context model routing.

Spectron uses LLMs and embedding models at multiple stages of the extraction and recall pipeline. Each stage can be configured independently, and each context can override the server-level defaults.

| Provider | Models supported |
| --- | --- |
| OpenAI | gpt-4o, gpt-4o-mini, text-embedding-3-small, text-embedding-3-large |
| Anthropic | claude-opus-4, claude-sonnet-4, claude-haiku-4 |

Embedding models are only available via OpenAI. If you configure Anthropic as your primary provider, you must still supply an OpenAI key for embedding generation.

Set provider keys and default model assignments via environment variables before starting the server:

# Provider keys
SPECTRON_OPENAI_API_KEY=sk-…
SPECTRON_ANTHROPIC_API_KEY=sk-ant-…

# Model defaults (optional – these are the built-in defaults)
SPECTRON_MODEL_EXTRACTION=gpt-4o-mini
SPECTRON_MODEL_QUERY_UNDERSTANDING=gpt-4o-mini
SPECTRON_MODEL_RESPONSE=gpt-4o
SPECTRON_MODEL_REFLECTION=gpt-4o
SPECTRON_MODEL_BACKGROUND=gpt-4o-mini
SPECTRON_MODEL_EMBEDDING=text-embedding-3-small

Or in the YAML configuration file:

providers:
  openai:
    api_key: ${SPECTRON_OPENAI_API_KEY}
  anthropic:
    api_key: ${SPECTRON_ANTHROPIC_API_KEY}

models:
  extraction: gpt-4o-mini
  query_understanding: gpt-4o-mini
  response: gpt-4o
  reflection: gpt-4o
  background: gpt-4o-mini
  embedding: text-embedding-3-small

Server-level defaults apply to all contexts that have not set their own model configuration.

Spectron's extraction and recall pipeline has six model stages. Each can run a different model, allowing cost/quality trade-offs per stage.

| Stage | Purpose | Recommended model |
| --- | --- | --- |
| extraction | Entity and attribute extraction from turns | gpt-4o-mini |
| query_understanding | Parsing and expanding recall queries | gpt-4o-mini |
| response | Generating agent responses (optional) | gpt-4o |
| reflection | Synthesising summaries from accumulated memory | gpt-4o |
| background | Deferred resolution and classification jobs | gpt-4o-mini |
| embedding | Embedding generation for vector search | text-embedding-3-small |

The extraction and background stages run on every turn, so keeping them on a fast, inexpensive model such as gpt-4o-mini matters for throughput. Reflection runs less frequently but benefits from a more capable model.

Each context can override the server defaults for any or all stages. Overrides are set via the management API:

await memory.config.models(
    extraction="gpt-4o-mini",
    query_understanding="gpt-4o-mini",
    response="gpt-4o",
    reflection="gpt-4o",
    background="gpt-4o-mini",
    embedding="text-embedding-3-small",
)

You can also set provider keys per context. This allows different contexts to use different OpenAI organisations or Anthropic accounts:

import os

await memory.config.providers(
    openai=os.environ["ACME_OPENAI_KEY"],
    anthropic=os.environ["ACME_ANTHROPIC_KEY"],
)

You do not need to specify all stages. Unspecified stages inherit the server default:

# Only override reflection; everything else uses server defaults
await memory.config.models(reflection="claude-opus-4")
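The inheritance rule can be pictured as a layered lookup: context overrides first, then the server configuration, then the built-in defaults. The following is a minimal sketch of that precedence, not Spectron's implementation; `effective_models`, `STAGES`, and `BUILTIN_DEFAULTS` are hypothetical names introduced here for illustration.

```python
from collections import ChainMap

# The six stages and built-in defaults listed in this section.
STAGES = (
    "extraction", "query_understanding", "response",
    "reflection", "background", "embedding",
)
BUILTIN_DEFAULTS = {
    "extraction": "gpt-4o-mini",
    "query_understanding": "gpt-4o-mini",
    "response": "gpt-4o",
    "reflection": "gpt-4o",
    "background": "gpt-4o-mini",
    "embedding": "text-embedding-3-small",
}


def effective_models(server_config: dict, context_overrides: dict) -> dict:
    """Hypothetical helper: resolve the model used for each stage in one context."""
    unknown = (set(server_config) | set(context_overrides)) - set(STAGES)
    if unknown:
        raise ValueError(f"unknown stage(s): {sorted(unknown)}")
    # ChainMap searches its maps left to right, so earlier layers win.
    layers = ChainMap(context_overrides, server_config, BUILTIN_DEFAULTS)
    return {stage: layers[stage] for stage in STAGES}


if __name__ == "__main__":
    # Server overrides response; the context overrides only reflection.
    print(effective_models(
        {"response": "claude-sonnet-4"},
        {"reflection": "claude-opus-4"},
    ))
```

Rejecting unknown stage names up front is a design choice in this sketch: a misspelt stage would otherwise be silently ignored and the default used instead.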

Provider keys are stored encrypted at rest in SurrealDB, scoped to their context. The API surface is write-only: once a key is set, you cannot retrieve the plaintext value through the API.

Read operations on provider configuration return a summary object indicating which providers are configured:

config = await memory.config.get()
print(config.providers_configured)
# {"openai": True, "anthropic": True}

The actual key values are never exposed. To rotate a provider key, simply write the new value – it overwrites the previous entry.

await memory.config.providers(openai=os.environ["NEW_OPENAI_KEY"])
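The write-only behaviour described above can be modelled with a small in-memory class. This is only an illustrative sketch: `ProviderKeyStore` and its methods are hypothetical names, not part of Spectron's API, and the sketch leaves out the encryption at rest that the real store provides.

```python
class ProviderKeyStore:
    """Toy stand-in for a write-only key store (hypothetical, unencrypted)."""

    KNOWN_PROVIDERS = ("openai", "anthropic")

    def __init__(self) -> None:
        self._keys: dict[str, str] = {}

    def set(self, **keys: str) -> None:
        """Write or rotate keys; a new value overwrites the previous entry."""
        for provider, key in keys.items():
            if provider not in self.KNOWN_PROVIDERS:
                raise ValueError(f"unknown provider: {provider!r}")
            if not key:
                raise ValueError(f"empty key for provider {provider!r}")
            self._keys[provider] = key

    def summary(self) -> dict[str, bool]:
        """Report which providers are configured, never the key values."""
        return {p: p in self._keys for p in self.KNOWN_PROVIDERS}


if __name__ == "__main__":
    store = ProviderKeyStore()
    store.set(openai="sk-old")
    store.set(openai="sk-new")  # rotation: overwrites the old key
    print(store.summary())
```

The class deliberately has no getter for key values, which mirrors the idea that reads only ever return a configured or not-configured flag.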

Changing the LLM provider (e.g. from OpenAI to Anthropic) for extraction or reflection stages takes effect immediately for all new turns. Existing extracted memory is not affected.

Changing the embedding model has a broader impact. Embeddings stored in SurrealDB are generated by a specific model. If you change the embedding model, existing embeddings become incompatible with new embeddings – similarity scores will be unreliable.
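To see why mixing models is a problem, note that vectors from different embedding models live in different vector spaces, and may not even have the same length (by default, text-embedding-3-small produces 1536 dimensions and text-embedding-3-large 3072). The sketch below tags each vector with the model that produced it and refuses to compare across models. `StoredVector` and `cosine_similarity` are hypothetical names for illustration and say nothing about how Spectron stores vectors internally.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredVector:
    """A vector tagged with the embedding model that produced it (hypothetical)."""
    model: str
    values: tuple[float, ...]


def cosine_similarity(a: StoredVector, b: StoredVector) -> float:
    """Cosine similarity, defined only for vectors from the same model."""
    if a.model != b.model:
        raise ValueError(
            f"cannot compare embeddings from {a.model!r} and {b.model!r}"
        )
    if len(a.values) != len(b.values):
        raise ValueError("vectors have different dimensions")
    dot = sum(x * y for x, y in zip(a.values, b.values))
    norm_a = math.sqrt(sum(x * x for x in a.values))
    norm_b = math.sqrt(sum(y * y for y in b.values))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    old = StoredVector("text-embedding-3-small", (0.6, 0.8))
    query = StoredVector("text-embedding-3-small", (0.6, 0.8))
    print(cosine_similarity(old, query))
```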

If you need to change the embedding model in production:

  1. Set the new embedding model in the context or server configuration.

  2. Trigger a re-embedding job to regenerate all existing vectors:

spectronctl contexts reembed ctx_acme --api-key mgmt-… --watch

This is a background operation. New turns continue to be processed during re-embedding using the new model. Recall quality may degrade temporarily while re-embedding is in progress, because the store holds a mix of old-model and new-model vectors until the job completes.
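Conceptually, a re-embedding job walks the stored records, regenerates each stale vector with the new model, and reports progress as it goes. The sketch below shows that shape only; it is not how `spectronctl` works internally, and `reembed_in_batches`, the record layout, and the `embed` callable are all assumptions made for illustration.

```python
from typing import Callable, Iterator


def reembed_in_batches(
    records: list[dict],
    embed: Callable[[list[str]], list[list[float]]],
    target_model: str,
    batch_size: int = 100,
) -> Iterator[float]:
    """Re-embed stale records in place, yielding the fraction now on target_model.

    Each record is assumed to be a dict with "text", "model" and "vector" keys.
    """
    stale = [r for r in records if r["model"] != target_model]
    if not stale:
        return
    total = len(records)
    done = total - len(stale)  # records already on the target model
    for start in range(0, len(stale), batch_size):
        batch = stale[start:start + batch_size]
        vectors = embed([r["text"] for r in batch])
        for record, vector in zip(batch, vectors):
            record["vector"] = vector
            record["model"] = target_model
        done += len(batch)
        yield done / total
```

Until the final batch completes, some records still carry vectors from the old model, which is the mixed state behind the temporary drop in recall quality.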

You can mix providers across stages. A common pattern is to use OpenAI's cheaper models for extraction and Anthropic's more capable models for reflection:

# Set keys first so the model assignments below pass provider-key validation
await memory.config.providers(
    openai=os.environ["OPENAI_KEY"],
    anthropic=os.environ["ANTHROPIC_KEY"],
)

await memory.config.models(
    extraction="gpt-4o-mini",
    background="gpt-4o-mini",
    query_understanding="gpt-4o-mini",
    reflection="claude-opus-4",
    response="claude-sonnet-4",
    embedding="text-embedding-3-small",
)

Both provider keys must be present for the model assignments to work. Spectron validates provider key availability at context configuration time and raises an error if a required key is missing.
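The kind of check described here can be sketched as a lookup from each assigned model to its provider, followed by a test that the provider has a key. This is an illustration under stated assumptions, not Spectron's validation code: `MODEL_PROVIDERS`, `MissingProviderKeyError`, and `validate_provider_keys` are hypothetical names, and the model list is taken from the supported-models table in this section.

```python
# Model-to-provider mapping, from the supported-models table above.
MODEL_PROVIDERS = {
    "gpt-4o": "openai",
    "gpt-4o-mini": "openai",
    "text-embedding-3-small": "openai",
    "text-embedding-3-large": "openai",
    "claude-opus-4": "anthropic",
    "claude-sonnet-4": "anthropic",
    "claude-haiku-4": "anthropic",
}


class MissingProviderKeyError(ValueError):
    """Raised when a stage is assigned a model whose provider has no key."""


def validate_provider_keys(models: dict[str, str], configured: dict[str, bool]) -> None:
    """Hypothetical check: every assigned model must have a configured provider key."""
    missing: dict[str, list[str]] = {}
    for stage, model in models.items():
        provider = MODEL_PROVIDERS.get(model)
        if provider is None:
            raise ValueError(f"unsupported model {model!r} for stage {stage!r}")
        if not configured.get(provider, False):
            missing.setdefault(provider, []).append(stage)
    if missing:
        details = "; ".join(
            f"{provider} (needed by {', '.join(stages)})"
            for provider, stages in sorted(missing.items())
        )
        raise MissingProviderKeyError(f"missing provider key: {details}")
```

Because embedding models are only available via OpenAI, a configuration that uses Anthropic for every LLM stage would still fail this check without an OpenAI key.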
