Models and providers

Spectron uses LLMs and embedding models at multiple stages of the extraction and recall pipeline. Each stage can be configured independently, and each context can override the server-level defaults.

Supported providers

Provider	Models supported
OpenAI	`gpt-4o`, `gpt-4o-mini`, `text-embedding-3-small`, `text-embedding-3-large`
Anthropic	`claude-opus-4`, `claude-sonnet-4`, `claude-haiku-4`

Embedding models are only available via OpenAI. If you configure Anthropic as your primary provider, you must also supply an OpenAI key for embedding generation.

Server-level defaults

Set provider keys and default model assignments via environment variables before starting the server:

# Provider keys
SPECTRON_OPENAI_API_KEY=sk-…
SPECTRON_ANTHROPIC_API_KEY=sk-ant-…

# Model defaults (optional – these are the built-in defaults)
SPECTRON_MODEL_EXTRACTION=gpt-4o-mini
SPECTRON_MODEL_QUERY_UNDERSTANDING=gpt-4o-mini
SPECTRON_MODEL_RESPONSE=gpt-4o
SPECTRON_MODEL_REFLECTION=gpt-4o
SPECTRON_MODEL_BACKGROUND=gpt-4o-mini
SPECTRON_MODEL_EMBEDDING=text-embedding-3-small

Or in the YAML configuration file:

providers:
  openai:
    api_key: ${SPECTRON_OPENAI_API_KEY}
  anthropic:
    api_key: ${SPECTRON_ANTHROPIC_API_KEY}

models:
  extraction: gpt-4o-mini
  query_understanding: gpt-4o-mini
  response: gpt-4o
  reflection: gpt-4o
  background: gpt-4o-mini
  embedding: text-embedding-3-small

Server-level defaults apply to all contexts that have not set their own model configuration.
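
As a pre-flight check, the environment-variable layering above can be sketched in plain Python. This is an illustration, not Spectron's own loader: `effective_models` is a hypothetical helper, the default values are copied from the listing above, and treating an empty variable as unset is an assumption.

```python
import os

# Built-in defaults, copied from the listing above.
BUILTIN_DEFAULTS = {
    "extraction": "gpt-4o-mini",
    "query_understanding": "gpt-4o-mini",
    "response": "gpt-4o",
    "reflection": "gpt-4o",
    "background": "gpt-4o-mini",
    "embedding": "text-embedding-3-small",
}


def effective_models(env=None):
    """Return the model each stage would use given SPECTRON_MODEL_* variables.

    Hypothetical helper: a stage falls back to its built-in default when the
    matching variable is unset or empty (the empty case is an assumption).
    """
    env = os.environ if env is None else env
    return {
        stage: env.get(f"SPECTRON_MODEL_{stage.upper()}") or default
        for stage, default in BUILTIN_DEFAULTS.items()
    }
```

Calling `effective_models()` with no argument reads the current process environment; passing a dictionary makes the function easy to test.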

Pipeline stages

Spectron's extraction and recall pipeline has six model stages. Each can run a different model, allowing cost/quality trade-offs per stage.

Stage	Purpose	Recommended model
`extraction`	Entity and attribute extraction from turns	`gpt-4o-mini`
`query_understanding`	Parsing and expanding recall queries	`gpt-4o-mini`
`response`	Generating agent responses (optional)	`gpt-4o`
`reflection`	Synthesising summaries from accumulated memory	`gpt-4o`
`background`	Deferred resolution and classification jobs	`gpt-4o-mini`
`embedding`	Embedding generation for vector search	`text-embedding-3-small`

Extraction and background run on every turn, so keeping them on a fast, inexpensive model such as `gpt-4o-mini` matters for throughput. Reflection runs less often but benefits from a more capable model.

Per-context model override

Each context can override the server defaults for any or all stages. Overrides are set via the management API:

await memory.config.models(
    extraction="gpt-4o-mini",
    query_understanding="gpt-4o-mini",
    response="gpt-4o",
    reflection="gpt-4o",
    background="gpt-4o-mini",
    embedding="text-embedding-3-small",
)

You can also set provider keys per context. This allows different contexts to use different OpenAI organisations or Anthropic accounts:

await memory.config.providers(
    openai=os.environ["ACME_OPENAI_KEY"],
    anthropic=os.environ["ACME_ANTHROPIC_KEY"],
)

Partial overrides

You do not need to specify all stages. Unspecified stages inherit the server default:

# Only override reflection; everything else uses server defaults
await memory.config.models(reflection="claude-opus-4")
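
The inheritance rule can be pictured as a dictionary overlay. The sketch below is a plain-Python illustration under that assumption; `resolve_models` and `SERVER_DEFAULTS` are hypothetical names, not part of the Spectron API.

```python
# Server-level defaults, as listed earlier on this page.
SERVER_DEFAULTS = {
    "extraction": "gpt-4o-mini",
    "query_understanding": "gpt-4o-mini",
    "response": "gpt-4o",
    "reflection": "gpt-4o",
    "background": "gpt-4o-mini",
    "embedding": "text-embedding-3-small",
}


def resolve_models(server_defaults, context_overrides):
    """Overlay a context's overrides on the server defaults, stage by stage."""
    unknown = set(context_overrides) - set(server_defaults)
    if unknown:
        # Reject typos such as "reflexion" instead of silently ignoring them.
        raise ValueError(f"unknown stage(s): {sorted(unknown)}")
    # Later keys win, so overrides replace defaults; inputs are not mutated.
    return {**server_defaults, **context_overrides}


resolved = resolve_models(SERVER_DEFAULTS, {"reflection": "claude-opus-4"})
```

Here only `reflection` changes; the other five stages keep their server-level values.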

Document ingest resolves the extraction LLM per context, on the worker, for each job. This includes vision description under MultimodalFull. A context that has its own provider keys and its own `models.extraction` setting does not need a deployment-wide default LLM environment variable for ingest to run. Maintenance jobs and document jobs share the same budget-enforced resolver.

Provider key security

Provider keys are stored encrypted at rest in SurrealDB, scoped to their context. The API surface is write-only: once a key is set, you cannot retrieve the plaintext value through the API.

Read operations on provider configuration return a summary object indicating which providers are configured:

config = await memory.config.get()
print(config.providers_configured)
# {"openai": True, "anthropic": True}

The key values themselves are never exposed. To rotate a provider key, write the new value; it overwrites the previous entry:

await memory.config.providers(openai=os.environ["NEW_OPENAI_KEY"])
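
The write-only behaviour can be modelled with a small in-memory class. This is a toy sketch to make the semantics concrete: `WriteOnlyKeyStore` is a hypothetical name, encryption and per-context scoping are left out, and nothing here reflects how Spectron stores keys internally.

```python
class WriteOnlyKeyStore:
    """Toy in-memory model of a write-only provider key store (no encryption)."""

    PROVIDERS = ("openai", "anthropic")

    def __init__(self):
        self._keys = {}  # private; no method returns these values

    def set(self, provider, key):
        if provider not in self.PROVIDERS:
            raise ValueError(f"unsupported provider: {provider}")
        if not key:
            raise ValueError("key must be non-empty")
        # Rotation: writing again replaces the previous key.
        self._keys[provider] = key

    def providers_configured(self):
        # Expose only booleans, never the key material.
        return {p: p in self._keys for p in self.PROVIDERS}
```

Callers can learn whether a key exists, and can replace it, but have no read path to the stored value.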

Switching providers after deployment

Changing the LLM provider for the extraction or reflection stage (for example, from OpenAI to Anthropic) takes effect immediately for all new turns. Memory that has already been extracted is not affected.

Changing the embedding model has a broader impact. Every embedding stored in SurrealDB was generated by a specific model, and vectors from different models do not share a vector space. If you change the embedding model, existing embeddings become incompatible with new ones, and similarity scores between them are unreliable.
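
One defensive pattern is to tag every stored vector with the model that produced it and refuse to compare across models. The sketch below assumes that pattern; `TaggedVector` and `cosine_similarity` are hypothetical names, and this is not a description of how Spectron stores or compares vectors. Note that the two OpenAI embedding models also differ in default dimensionality (1536 for `text-embedding-3-small`, 3072 for `text-embedding-3-large`).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedVector:
    model: str     # embedding model that produced the vector
    values: tuple  # the embedding itself


def cosine_similarity(a: TaggedVector, b: TaggedVector) -> float:
    """Cosine similarity that refuses to compare vectors from different models."""
    if a.model != b.model:
        raise ValueError(f"cannot compare {a.model} with {b.model} embeddings")
    if len(a.values) != len(b.values):
        raise ValueError("vectors have different dimensions")
    dot = sum(x * y for x, y in zip(a.values, b.values))
    norm = math.hypot(*a.values) * math.hypot(*b.values)
    # A zero vector has no direction; report no similarity rather than divide by zero.
    return dot / norm if norm else 0.0
```

Failing loudly on a model mismatch is a deliberate choice: a numeric score computed across two models would look valid while carrying no meaning.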

If you need to change the embedding model in production:

1. Set the new embedding model in the context or server configuration.
2. Trigger a re-embedding job to regenerate all existing vectors:

spectronctl contexts reembed ctx_acme --api-key mgmt-… --watch

This is a background operation. New turns continue to be processed during re-embedding and are embedded with the new model. Recall quality may degrade temporarily until re-embedding completes.

Mixing providers

You can mix providers across stages. A common pattern is to use OpenAI's cheaper models for extraction and Anthropic's more capable models for reflection:

# Set provider keys first so that key validation can succeed when the models are assigned
await memory.config.providers(
    openai=os.environ["OPENAI_KEY"],
    anthropic=os.environ["ANTHROPIC_KEY"],
)

await memory.config.models(
    extraction="gpt-4o-mini",
    background="gpt-4o-mini",
    query_understanding="gpt-4o-mini",
    reflection="claude-opus-4",
    response="claude-sonnet-4",
    embedding="text-embedding-3-small",
)

Both provider keys must be present for the model assignments to work. Spectron validates provider key availability at context configuration time and raises an error if a required key is missing.
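
To catch a missing key before calling the management API, a client-side check along these lines may help. It is a minimal sketch: `missing_providers` and `MODEL_PROVIDER` are hypothetical names, the model-to-provider mapping is taken from the "Supported providers" table, and Spectron's own validation may differ in detail.

```python
# Provider for each supported model, taken from the "Supported providers" table.
MODEL_PROVIDER = {
    "gpt-4o": "openai",
    "gpt-4o-mini": "openai",
    "text-embedding-3-small": "openai",
    "text-embedding-3-large": "openai",
    "claude-opus-4": "anthropic",
    "claude-sonnet-4": "anthropic",
    "claude-haiku-4": "anthropic",
}


def missing_providers(models, configured):
    """Return the providers required by `models` that have no key configured.

    `models` maps stage -> model name; `configured` maps provider -> bool,
    in the same shape as `providers_configured`.
    """
    unknown = sorted(m for m in models.values() if m not in MODEL_PROVIDER)
    if unknown:
        raise ValueError(f"unsupported model(s): {unknown}")
    required = {MODEL_PROVIDER[m] for m in models.values()}
    return sorted(p for p in required if not configured.get(p, False))
```

An empty list means every assigned model has a key for its provider. Because embeddings are only available via OpenAI, any assignment that includes an `embedding` stage will list `openai` as required.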