Configuration

Context config, models, and limits.

Spectron has two configuration surfaces: server-wide settings (in the server binary or environment) and per-Context configuration (stored in the control plane and patchable at runtime).

Server configuration is provided via environment variables or a TOML configuration file passed at startup. These settings apply to all Contexts unless overridden at the Context level.

| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| SPECTRON_BIND | string | 0.0.0.0:8080 | Listen address and port |
| SPECTRON_SURREALDB_URL | string | | SurrealDB connection URL (required) |
| SPECTRON_SURREALDB_USER | string | | SurrealDB username (required) |
| SPECTRON_SURREALDB_PASS | string | | SurrealDB password (required) |
| SPECTRON_OBJECT_STORE | string | local://./data | Object store backend (see below) |

These apply to all Contexts that do not override them:

| Variable | Default | Description |
| --- | --- | --- |
| SPECTRON_MODEL_EXTRACTION | gpt-4o-mini | LLM for experiential memory extraction (Stage 1 fast pass) |
| SPECTRON_MODEL_EXTRACTION_STRONG | gpt-4o | LLM for extraction Stage 2 |
| SPECTRON_MODEL_QUERY_UNDERSTANDING | gpt-4o-mini | LLM for query classification |
| SPECTRON_MODEL_RESPONSE | gpt-4o-mini | LLM for response synthesis |
| SPECTRON_MODEL_REFLECTION | gpt-4o | LLM for reflection operations |
| SPECTRON_MODEL_BACKGROUND | gpt-4o-mini | LLM for background reconciliation |
| SPECTRON_MODEL_EMBEDDING | text-embedding-3-small | Embedding model for the deployment (1536-dim in this release) |

The embedding model is fixed per deployment for launch — it is not a per-Context override. Context config rejects any models.embedding value other than the deployment default. Changing the server embedding model requires a reindex so vectors and HNSW indexes stay in the same embedding space.

Cross-encoder reranking for /documents/query when use_reranker=true:

| Variable | Description |
| --- | --- |
| SPECTRON_RERANKER_URL | POST endpoint for the reranker service. Unset ⇒ no provider; requests fall through to bi-encoder ordering. |
| SPECTRON_RERANKER_MODEL | Required when URL is set. Boot error if URL is set without a model. |
| SPECTRON_RERANKER_API_KEY | Optional bearer token (Authorization: Bearer …). |
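The URL/model/key rule above can be sketched as a small startup check. This is an illustrative sketch only, not Spectron's actual loader: the `HttpProvider` type and `load_provider` function are hypothetical names, and treating an empty string as "unset" is an assumption.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class HttpProvider:
    url: str
    model: str
    api_key: Optional[str]  # sent as "Authorization: Bearer <key>" when present


def load_provider(env: Mapping[str, str], prefix: str) -> Optional[HttpProvider]:
    """Read <prefix>_URL, <prefix>_MODEL and <prefix>_API_KEY from env.

    Returns None when the URL is unset (no provider configured) and raises
    ValueError when the URL is set without a model.
    """
    # Assumption: an empty or whitespace-only value counts as unset.
    url = env.get(f"{prefix}_URL", "").strip()
    if not url:
        return None
    model = env.get(f"{prefix}_MODEL", "").strip()
    if not model:
        raise ValueError(f"{prefix}_URL is set but {prefix}_MODEL is missing")
    api_key = env.get(f"{prefix}_API_KEY") or None
    return HttpProvider(url=url, model=model, api_key=api_key)
```

Calling `load_provider(os.environ, "SPECTRON_RERANKER")` at process start would surface a missing model before any request is served.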

HTTP OCR, CLIP, and speech-to-text for document ingestion (read by the worker role). An HTTP provider takes precedence over the built-in local fallback for the same modality. Misconfigured URLs fail at boot.

| Variable | Description |
| --- | --- |
| SPECTRON_OCR_URL | POST endpoint for OCR. Unset ⇒ built-in or local Tesseract (when enabled). |
| SPECTRON_OCR_MODEL | Required when OCR URL is set. |
| SPECTRON_OCR_API_KEY | Optional bearer token. |
| SPECTRON_CLIP_URL | POST endpoint for visual embeddings. CLIP output must match the 512-dim image_chunk width. |
| SPECTRON_CLIP_MODEL | Required when CLIP URL is set. |
| SPECTRON_CLIP_API_KEY | Optional bearer token. |
| SPECTRON_STT_URL | POST endpoint for speech-to-text. |
| SPECTRON_STT_MODEL | Required when STT URL is set. |
| SPECTRON_STT_API_KEY | Optional bearer token. |

See Multimodal content.

| Variable | Description |
| --- | --- |
| SPECTRON_OPENAI_API_KEY | Default OpenAI API key for all Contexts |
| SPECTRON_ANTHROPIC_API_KEY | Default Anthropic API key for all Contexts |

| Backend | SPECTRON_OBJECT_STORE format | Notes |
| --- | --- | --- |
| Local filesystem | local:///path/to/data | Development and single-node deployments |
| Amazon S3 | s3://bucket-name/prefix | Requires AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY or instance role |
| Google Cloud Storage | gcs://bucket-name/prefix | Requires GOOGLE_APPLICATION_CREDENTIALS |
| Azure Blob Storage | azure://container/prefix | Requires AZURE_STORAGE_ACCOUNT + AZURE_STORAGE_ACCESS_KEY |
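To make the SPECTRON_OBJECT_STORE formats concrete, here is a minimal sketch that splits a value into its backend and location. The `parse_object_store` helper and `ObjectStore` type are hypothetical names for illustration; the server's real parser may accept or reject different inputs.

```python
from typing import NamedTuple

# Scheme prefixes taken from the table of supported backends.
BACKENDS = {
    "local": "Local filesystem",
    "s3": "Amazon S3",
    "gcs": "Google Cloud Storage",
    "azure": "Azure Blob Storage",
}


class ObjectStore(NamedTuple):
    backend: str
    location: str  # filesystem path, or "bucket/prefix" / "container/prefix"


def parse_object_store(value: str) -> ObjectStore:
    """Split a SPECTRON_OBJECT_STORE value into (backend, location)."""
    scheme, sep, rest = value.partition("://")
    if not sep or scheme not in BACKENDS:
        raise ValueError(f"unsupported SPECTRON_OBJECT_STORE value: {value!r}")
    if not rest:
        raise ValueError(f"missing location in SPECTRON_OBJECT_STORE value: {value!r}")
    return ObjectStore(BACKENDS[scheme], rest)
```

With this sketch, the default `local://./data` resolves to the relative path `./data`, while `local:///path/to/data` resolves to the absolute path `/path/to/data`.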

Cross-origin browser calls to the API are off by default. Enable an origin allowlist when a web client (for example Surrealist against a Cloud-brokered access token) calls the user API from a different origin than the API host.

| Service | Variable | CLI flag |
| --- | --- | --- |
| User API | SPECTRON_CORS_ALLOWED_ORIGINS | --cors-allowed-origins |
| Management API | SPECTRON_MANAGEMENT_CORS_ALLOWED_ORIGINS | --cors-allowed-origins |

Each setting takes a comma-separated list of origins. Entries are trimmed, lower-cased, and normalised (trailing / stripped). Exact entries match the Origin header verbatim; entries containing * are globs anchored at both ends (bare * and https://* are rejected). Allowed origins are echoed in Access-Control-Allow-Origin. Credentials are not used: callers authenticate with Authorization, not cookies. Preflight mirrors request headers so SDK headers (api-version, X-Spectron-Context, Idempotency-Key, and others) pass without a fixed allowlist.
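The matching rules can be sketched in a few lines of Python. This is an approximation under stated assumptions, not the server's implementation: the function names are hypothetical, and the rejection rule is assumed to cover exactly a bare `*` and any entry ending in `://*`.

```python
import re
from typing import List, Optional


def normalise_origin(entry: str) -> str:
    """Trim, lower-case, and strip trailing slashes from an allowlist entry."""
    return entry.strip().lower().rstrip("/")


def parse_allowlist(raw: str) -> List[str]:
    """Parse a comma-separated allowlist, rejecting overly broad patterns."""
    entries = []
    for part in raw.split(","):
        entry = normalise_origin(part)
        if not entry:
            continue
        # Assumption: only a wildcard covering the whole host is rejected.
        if entry == "*" or entry.endswith("://*"):
            raise ValueError(f"overly broad origin pattern: {entry!r}")
        entries.append(entry)
    return entries


def match_origin(origin: str, allowlist: List[str]) -> Optional[str]:
    """Return the value to echo in Access-Control-Allow-Origin, or None."""
    for entry in allowlist:
        if "*" in entry:
            # Anchored glob: '*' matches any run of characters, the rest is literal.
            pattern = ".*".join(re.escape(piece) for piece in entry.split("*"))
            if re.fullmatch(pattern, origin):
                return origin
        elif entry == origin:
            return origin
    return None
```

Because the glob is anchored at both ends, `https://*.example.dev` matches `https://a.example.dev` but not `https://a.example.dev.evil.com`.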

The management API is normally server-side only; CORS is optional there for operator tooling.

Each Context stores a config object in the control plane. Update it via PATCH /api/v1/contexts/{id}; changes apply immediately to new requests.

{
  "config": {
    "token_limit": 1000000,
    "retention_days": 90,
    "models": {
      "extraction": "openai/gpt-4o-mini",
      "extraction_strong": "openai/gpt-4o",
      "query_understanding": "openai/gpt-4o-mini",
      "response": "openai/gpt-4o-mini",
      "reflection": "openai/gpt-4o",
      "background": "openai/gpt-4o-mini"
    },
    "providers": {
      "openai": "sk-...",
      "anthropic": "sk-ant-..."
    }
  }
}

| Field | Type | Description |
| --- | --- | --- |
| token_limit | integer (optional) | Monthly token cap across all LLM + embedding calls. Enforced before each ingestion write. null = no limit. |
| retention_days | integer (optional) | Automatic expiry for context-category experiential memory data. null = no automatic expiry. |
| models.extraction | string | Model for Stage 1 fast extraction. Format: provider/model-name. |
| models.extraction_strong | string | Model for Stage 2 strong extraction. |
| models.query_understanding | string | Model for classifying and expanding queries. |
| models.response | string | Model for synthesising formatted context responses. |
| models.reflection | string | Model for reflection operations (reflect endpoint). |
| models.background | string | Model for background reconciliation and cross-layer linking. |
| providers.openai | string | OpenAI API key for this Context. Overrides the server-wide default. |
| providers.anthropic | string | Anthropic API key for this Context. |
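As an illustration of how the provider/model-name format and the key override fit together, the sketch below picks the API key for a model. The function names are hypothetical, and `server_keys` stands in for the server-wide defaults (for example the value of SPECTRON_OPENAI_API_KEY); the actual resolution logic is not shown in this page.

```python
from typing import Mapping, Optional, Tuple


def split_model_id(model_id: str) -> Tuple[str, str]:
    """Split "provider/model-name" into (provider, model-name)."""
    provider, sep, name = model_id.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected provider/model-name, got {model_id!r}")
    return provider, name


def resolve_api_key(
    model_id: str,
    context_providers: Mapping[str, str],
    server_keys: Mapping[str, str],
) -> Optional[str]:
    """Prefer the Context's own key, else fall back to the server-wide default."""
    provider, _ = split_model_id(model_id)
    return context_providers.get(provider) or server_keys.get(provider)
```

A Context that sets only providers.openai would then use its own key for openai/... models and the server-wide key for anthropic/... models.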

Provider API keys are write-only on the API surface. The read projection for a Context replaces the key values with a providers_configured summary — names of providers for which this Context stores its own key:

{
  "config": {
    "providers_configured": ["openai", "anthropic"]
  }
}

This is not the same as GET /api/v1/{context_id}/providers, which lists providers reachable via a global deployment key or a per-Context key and includes selectable model ids. The two surfaces are not derivable from each other.

The raw key values never appear in read responses.
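A minimal sketch of such a read projection, assuming the stored config has the shape shown earlier. The `read_projection` name is hypothetical, and preserving the stored order of provider names is an assumption.

```python
import copy
from typing import Any, Dict


def read_projection(stored: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the stored Context with key values replaced by names."""
    config = copy.deepcopy(stored.get("config", {}))
    # Drop the write-only key values entirely; keep only the provider names.
    providers = config.pop("providers", {})
    config["providers_configured"] = [name for name, key in providers.items() if key]
    return {"config": config}
```

The stored object is left untouched, and the returned copy contains no key material to leak through logs or responses.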

Send only the fields you want to change. Unset fields are left unchanged (deep merge):

PATCH /api/v1/contexts/acme-prod
Content-Type: application/json
Authorization: Bearer mgmt-...

{
  "config": {
    "token_limit": 2000000,
    "models": {
      "reflection": "anthropic/claude-opus-4-7"
    }
  }
}
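The deep-merge behaviour of the request above can be sketched as follows. This is a generic recursive merge written for illustration; `deep_merge` is a hypothetical name and the server's own merge may differ in edge cases.

```python
from typing import Any, Dict


def deep_merge(current: Dict[str, Any], patch: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict with patch applied on top of current.

    Nested objects are merged key by key; any other value (including None,
    which the config uses for "no limit") replaces the existing value.
    """
    merged = dict(current)  # shallow copy so the input is not mutated
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged
```

Applied to the earlier example config, the patch changes token_limit and models.reflection while retention_days and the other model entries keep their previous values.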

Additional per-Context settings control extraction behaviour:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| extraction.stage1_threshold | float | 0.7 | Confidence threshold below which Stage 2 runs |
| extraction.max_entities_per_turn | integer | 20 | Maximum entities extracted per turn |
| extraction.ontology_strict | boolean | false | Reject entities/attributes not in the ontology |

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| cache.semantic_ttl_seconds | integer | 3600 | TTL for semantic response cache entries |
| cache.semantic_threshold | float | 0.95 | Cosine similarity threshold for cache hits |

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| rate_limit.requests_per_minute | integer | 600 | Maximum requests per minute per Context |
| rate_limit.tokens_per_minute | integer | 100000 | Maximum tokens per minute per Context (LLM calls) |

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| allow_self_service_keys | boolean | true | When false, members cannot mint keys via POST /{ctx}/keys; use Cloud-brokered access tokens only. |
| max_token_ttl_seconds | integer (optional) | none | Maximum TTL clamp applied to every key mint (management, broker, self-service). null = no clamp. |
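As a rough sketch of the max_token_ttl_seconds clamp: the `clamp_ttl` helper is hypothetical, and the handling of a mint request that carries no TTL (here it receives the maximum) is an assumption that this page does not confirm.

```python
from typing import Optional


def clamp_ttl(requested: Optional[int], max_ttl: Optional[int]) -> Optional[int]:
    """Apply the Context's max_token_ttl_seconds to a requested key TTL."""
    if max_ttl is None:
        # null = no clamp: the requested TTL passes through unchanged.
        return requested
    if requested is None:
        # Assumption: a request without a TTL is capped at the maximum.
        return max_ttl
    return min(requested, max_ttl)
```

The same function would apply to all three mint paths (management, broker, self-service), since the clamp is described as applying to every key mint.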

Every data-plane API key must be bound to a principal. Keys with no principal binding are rejected with 401 — there is no unscoped passthrough mode. Mint keys under a principal (management API or self-service POST /{ctx}/keys).

Operator-tunable ceilings (env vars, read at process start):

| Variable | Default | Caps |
| --- | --- | --- |
| SPECTRON_DEFAULT_PAGE_SIZE | 100 | Default limit when listing session turns, and elsewhere when limit is omitted |
| SPECTRON_MAX_LIST_LIMIT | 500 | Maximum rows per list response (list_turns, traces, audit) |
| SPECTRON_MAX_QUERY_K | 50 | Maximum limit / k on /query, /context, document query, and MCP recall / context. Clamp-down only: the env var can lower the ceiling but never raise it above 50. Default answer size k / limit is 10. |
| SPECTRON_RETRIEVAL_POOL_SIZE | 256 | Internal candidate-pool breadth for fused retrieval. Decoupled from k: k only truncates the fused answer; raising k does not widen the search pool. |
| SPECTRON_TRACE_FEATURE_TTL_SECS | 60 | TTL (seconds) for the in-process per-(Context, scope) trace-features cache in the fused ranker, which controls how long prior retrieval outcomes re-weight candidates before recomputation. Process-local; 0 or invalid values fall back to the default. |

Requests above the query ceiling return 400 Bad Request.
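The query ceiling rules can be sketched as below. The names are hypothetical, and three behaviours are assumptions rather than documented facts: an unparsable or non-positive SPECTRON_MAX_QUERY_K falls back to 50, the default of 10 is reduced when the ceiling is lower than 10, and the `BadRequest` exception stands in for the 400 response.

```python
from typing import Mapping, Optional

HARD_MAX_QUERY_K = 50   # built-in ceiling; the env var can only lower it
DEFAULT_QUERY_K = 10    # answer size when the request omits k / limit


class BadRequest(Exception):
    """Stand-in for an HTTP 400 Bad Request response."""
    status = 400


def query_ceiling(env: Mapping[str, str]) -> int:
    """Effective maximum k, read once at process start."""
    raw = env.get("SPECTRON_MAX_QUERY_K")
    try:
        configured = int(raw) if raw is not None else HARD_MAX_QUERY_K
    except ValueError:
        configured = HARD_MAX_QUERY_K  # assumption: invalid values are ignored
    if configured < 1:
        configured = HARD_MAX_QUERY_K  # assumption: non-positive values are ignored
    return min(configured, HARD_MAX_QUERY_K)  # clamp-down only


def resolve_k(requested: Optional[int], ceiling: int) -> int:
    """Pick the answer size for a query, rejecting values above the ceiling."""
    if requested is None:
        return min(DEFAULT_QUERY_K, ceiling)
    if requested > ceiling:
        raise BadRequest(f"k={requested} exceeds the maximum of {ceiling}")
    return requested
```

Note that in this sketch k only bounds the size of the returned answer; the candidate pool is governed separately by SPECTRON_RETRIEVAL_POOL_SIZE.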

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| reconciliation.confidence_floor | float | 0.7 | Minimum confidence required for same-provenance supersession |

When a per-Context field is not set, the server-wide default applies. The effective configuration for a Context is always visible at:

GET /api/v1/contexts/{id}

The response includes the config object with all effective values merged – Context-level overrides where set, server-wide defaults elsewhere.
