Migrate from a vector store

From chunk-only search to structured memory.

Vector stores are a common first step for adding memory to AI agents. They work well for simple document retrieval but fall short when you need structured extraction, provenance tracking, temporal validity, or authority-based conflict resolution. This guide covers what you gain by migrating to Spectron and how to execute the migration.

| Capability | Vector store | Spectron |
| --- | --- | --- |
| Document retrieval | Semantic similarity search | Hybrid (semantic + BM25 + graph traversal) |
| Structured extraction | None – raw chunks | Entities, attributes, and relations |
| Provenance | None – chunk origin only | Every fact traces back to the turn that produced it |
| Correction tracking | None | Supersession chains with full history |
| Temporal validity | None | valid_from / valid_until on every attribute |
| Authoritative precedence | None | Authoritative pillar wins over Experiential assertions when they conflict |
| Scope-based isolation | Manual metadata filtering | Dimensional scope with floor matching |
| State inspection | None | Query memory state directly |

The decision to migrate usually comes when one of the following becomes a pain point: contradictions silently accumulating, inability to correct or expire facts, no insight into what the agent actually knows, or metadata filtering becoming unmanageable.

Export the source documents from your vector store. For most systems, this means exporting the original text rather than the embeddings – Spectron re-indexes everything through its own pipeline.

# Example: export from Pinecone
import os

from pinecone import Pinecone

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index("my-index")

# A single query returns at most top_k matches. For larger indexes,
# page through the record IDs and fetch them in batches instead.
results = index.query(
    vector=[0.0] * 1536,  # dummy query vector; length must match the index dimension
    top_k=10000,
    include_metadata=True,
)
documents = [
    {"text": r.metadata["text"], "source": r.metadata.get("source", "")}
    for r in results.matches
]

// Example: export from a generic vector store
const documents = await vectorStore.fetchAll({ includeMetadata: true });
const exportedDocs = documents.map(doc => ({
  text: doc.metadata.text,
  source: doc.metadata.source ?? "",
}));
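
Vector stores often hold the same text more than once, for example after a file was re-indexed. The following is a minimal sketch of deduplicating the export before ingestion, assuming the `documents` list of `text`/`source` records built in the export example above. The `dedupe_documents` helper is illustrative and not part of any SDK.

```python
import hashlib


def dedupe_documents(documents: list[dict]) -> list[dict]:
    """Drop empty records and exact duplicates, keeping the first occurrence."""
    seen: set[str] = set()
    unique: list[dict] = []
    for doc in documents:
        text = doc.get("text", "").strip()
        if not text:
            continue  # nothing to ingest
        # Hash the whitespace-trimmed text so identical records collapse
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        unique.append({"text": text, "source": doc.get("source", "")})
    return unique
```

This only catches exact matches after trimming whitespace; near-duplicates such as overlapping chunks would need a different strategy.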

Ingest your exported documents into Spectron's authoritative knowledge layer. Each document is processed through the ingestion pipeline: chunked, keyword-extracted, and linked to knowledge nodes.

import os
import httpx

client = httpx.Client(
    base_url="https://spectron.surrealdb.com/api/v1/my-context",
    headers={"API-KEY": os.environ["SPECTRON_API_KEY"]},
)

for doc in documents:
    response = client.post(
        "/documents",
        files={"file": (
            "document.txt",
            doc["text"].encode(),
            "text/plain",
        )},
        data={
            # `or` also covers the empty source string produced by the export step
            "title": doc.get("source") or "Imported document",
            "source_url": doc.get("source", ""),
        },
    )
    response.raise_for_status()  # stop on the first failed upload

for (const doc of exportedDocs) {
  const formData = new FormData();
  formData.append("file", new Blob([doc.text], { type: "text/plain" }), "document.txt");
  formData.append("title", doc.source || "Imported document");
  formData.append("source_url", doc.source || "");

  await fetch("https://spectron.surrealdb.com/api/v1/my-context/documents", {
    method: "POST",
    headers: { "API-KEY": "mgmt_..." },
    body: formData,
  });
}

Spectron's ingestion pipeline handles chunking internally. You do not need to replicate the chunking strategy from your vector store.
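
For a large export, it can help to retry transient failures and keep a list of documents that never uploaded, rather than stopping at the first error. The sketch below is one way to do that. It assumes an `upload` callable that you supply (hypothetical, for example a thin wrapper around the `/documents` POST shown above) and that raises an exception on failure.

```python
from typing import Callable, Iterable


def ingest_all(
    documents: Iterable[dict],
    upload: Callable[[dict], None],
    max_attempts: int = 3,
) -> list[dict]:
    """Upload each document, retrying on failure.

    Returns the documents that still failed after max_attempts tries.
    """
    failed: list[dict] = []
    for doc in documents:
        for attempt in range(1, max_attempts + 1):
            try:
                upload(doc)
                break  # success, move on to the next document
            except Exception:
                if attempt == max_attempts:
                    failed.append(doc)
    return failed
```

If `upload` wraps an httpx call, calling `raise_for_status()` on the response makes HTTP error codes surface as exceptions so the retry logic sees them. The returned list can be written to disk and replayed later.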

Vector stores often have minimal, inconsistent, or missing metadata. Spectron's pipeline extracts structure from content, so missing metadata is less critical – but it is worth enriching documents before ingestion if you have source information available.

If your vector store metadata includes document type, date, or author information, include it in the document upload:

client.post(
    "/documents",
    files={"file": ("policy.md", content, "text/markdown")},
    data={
        "title": "Return policy",
        "content_type": "policy",
        "authored_at": "2024-01-15T00:00:00Z",
    },
)
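
Because vector store metadata is often inconsistent, it may be useful to build the upload fields defensively and include only what is actually present. The sketch below maps a metadata record to the `title`, `content_type`, and `authored_at` fields used above. The input keys (`title`, `source`, `type`, `created_at`) are assumptions about your vector store's metadata, not a fixed schema.

```python
from datetime import datetime, timezone


def build_upload_fields(metadata: dict) -> dict:
    """Map vector store metadata to document upload form fields."""
    fields = {
        "title": metadata.get("title") or metadata.get("source") or "Imported document",
    }
    if metadata.get("type"):
        fields["content_type"] = metadata["type"]

    created = metadata.get("created_at")
    if created is not None:
        if isinstance(created, (int, float)):
            # Treat numbers as Unix timestamps in seconds (assumption)
            dt = datetime.fromtimestamp(created, tz=timezone.utc)
            fields["authored_at"] = dt.strftime("%Y-%m-%dT%H:%M:%SZ")
        else:
            # Pass strings through unchanged; they are assumed to be ISO 8601
            fields["authored_at"] = str(created)
    return fields
```

The result can be passed as the `data` argument of the upload call.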

If your vector store also held per-user conversational memory (previous chat turns or extracted facts stored as vectors), re-ingest it as Spectron turns:

# `memory` is an initialized Spectron client; run this inside an async function
# For each user, create a session and ingest their history
for user_id, history in user_histories.items():
    session = await memory.sessions.create(
        scope={"org": "my-org", "user": user_id},
    )
    for message in history:
        await session.turn(role=message["role"], content=message["content"])

for (const [userId, history] of Object.entries(userHistories)) {
  const session = await memory.sessions.create({
    scope: { org: "my-org", user: userId },
  });
  for (const message of history) {
    await session.turn({ role: message.role, content: message.content });
  }
}

The extraction pipeline runs on each turn and re-derives structured entities and attributes from the conversation history. You do not need to manually map old vector metadata to Spectron's schema.
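
The re-ingestion loop above expects a `user_histories` mapping from user ID to an ordered list of messages. If your export is a flat list of records, a sketch like the following could build that mapping. The record keys `user_id`, `timestamp`, `role`, and `content` are assumptions about the exported data.

```python
from collections import defaultdict


def group_histories(records: list[dict]) -> dict[str, list[dict]]:
    """Group flat chat records by user, oldest message first."""
    by_user: dict[str, list[dict]] = defaultdict(list)
    # sorted() is stable, so records with equal timestamps keep their export order
    for rec in sorted(records, key=lambda r: r["timestamp"]):
        by_user[rec["user_id"]].append(
            {"role": rec["role"], "content": rec["content"]}
        )
    return dict(by_user)
```

Replaying turns in chronological order is a deliberate choice here: it is assumed that a later correction should be ingested after the statement it corrects.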

You do not need to cut over immediately. Run Spectron and your vector store in parallel for a period:

  1. Write to both – record turns in Spectron and continue writing to the vector store.

  2. Read from Spectron first – use Spectron's context retrieval as primary; fall back to the vector store if Spectron returns nothing.

  3. Validate – compare the quality of responses with Spectron context versus vector store context over a sample of real queries.

  4. Cut over – once satisfied, remove the vector store read path.
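
Step 1 (write to both) can be sketched as a small wrapper. This is an illustration under stated assumptions, not a prescribed pattern: `write_spectron` and `write_vector_store` are hypothetical async callables that you supply, for example wrappers around your Spectron session's turn call and your vector store's insert call. The sketch treats Spectron as the write that must succeed and the legacy store as best effort.

```python
import asyncio
import logging
from typing import Awaitable, Callable

logger = logging.getLogger(__name__)

WriteFn = Callable[[str, str], Awaitable[None]]


async def record_turn(
    role: str,
    content: str,
    write_spectron: WriteFn,
    write_vector_store: WriteFn,
) -> bool:
    """Write a turn to Spectron, then mirror it to the legacy vector store.

    Returns True if the mirror write also succeeded.
    """
    # Primary write: let failures propagate to the caller
    await write_spectron(role, content)
    try:
        await write_vector_store(role, content)
        return True
    except Exception:
        # The legacy store is going away, so log and continue
        logger.warning("legacy vector store write failed", exc_info=True)
        return False

# Usage, with your own callables:
# asyncio.run(record_turn("user", "hello", write_spectron, write_vector_store))
```

Step 2 (read from Spectron first, fall back to the vector store) looks like this: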

async def retrieve_context(user_id: str, query: str) -> str:
    # `session` is the Spectron session opened for this user
    # Try Spectron first
    ctx = await session.context(query=query, top_k=6)
    if ctx.items:
        return ctx.formatted

    # Fall back to vector store during transition
    chunks = vector_store.search(query=query, user_id=user_id, top_k=6)
    return "\n\n".join(c["text"] for c in chunks)

| Week | Activity |
| --- | --- |
| 1 | Export documents; begin authoritative knowledge ingestion |
| 2 | Begin recording new conversations as Spectron turns |
| 3–4 | Coexistence: Spectron primary, vector store fallback |
| 5 | Validate response quality; remove fallback |
| 6+ | Re-ingest historical conversational memory if needed |
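
The validation step in the plan above can start with a simple side-by-side harness. The sketch below does not score quality by itself. It collects both contexts for each sample query so they can be reviewed, and reports how often Spectron returned nothing, which is when the fallback would fire. The `from_spectron` and `from_vector_store` callables are hypothetical stand-ins for your two retrieval paths.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Comparison:
    query: str
    spectron_context: str
    vector_context: str

    @property
    def spectron_empty(self) -> bool:
        """True when Spectron returned no usable context for this query."""
        return not self.spectron_context.strip()


def compare_retrievers(
    queries: Iterable[str],
    from_spectron: Callable[[str], str],
    from_vector_store: Callable[[str], str],
) -> list[Comparison]:
    """Run both retrieval paths on every query and pair up the results."""
    return [Comparison(q, from_spectron(q), from_vector_store(q)) for q in queries]


def fallback_rate(comparisons: list[Comparison]) -> float:
    """Fraction of queries where Spectron returned nothing."""
    if not comparisons:
        return 0.0
    return sum(c.spectron_empty for c in comparisons) / len(comparisons)
```

A fallback rate that stays near zero over real queries is one signal that the vector store read path can be removed. The paired contexts still need human or model-based review to judge answer quality.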
