
Make a GenAI chatbot using GraphRAG with SurrealDB + LangChain


Jun 30, 2025

Martin Schaer

What is traditional RAG

Retrieval-Augmented Generation (RAG) is an AI technique that enhances the capabilities of large language models (LLMs) by allowing them to retrieve relevant information from a knowledge base before generating a response. It uses vector similarity search to find the most relevant document chunks, which are then provided as additional context to the LLM, enabling the LLM to produce more accurate and grounded responses.
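The retrieval step can be illustrated without any external services. The sketch below uses hand-written three-dimensional vectors as stand-ins for real embeddings (an assumption for illustration only; real embedding models produce much larger vectors) and ranks chunks by cosine similarity before building a prompt.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunks, k=2):
    """Return the k chunk texts most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy "embeddings" written by hand; a real system would get these from a model.
chunks = [
    ("Elevated body temperature", [0.9, 0.1, 0.0]),
    ("Stuffy or runny nose", [0.1, 0.9, 0.1]),
    ("Itchy, watery eyes", [0.0, 0.7, 0.6]),
]
query_vec = [0.05, 0.8, 0.4]  # pretend embedding of "runny nose and itchy eyes"

context = top_k(query_vec, chunks, k=2)
prompt = "Answer using this context:\n" + "\n".join(context)
```

The retrieved chunks are simply prepended to the prompt; the LLM never sees the vectors themselves.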

What is GraphRAG

GraphRAG extends traditional RAG by retrieving from a knowledge graph as well as from a vector index. It leverages the structured, interconnected nature of knowledge graphs to give the LLM a richer, more contextualised understanding of the information, which helps it produce more accurate, more coherent, and less “hallucinated” responses.

For this example we are going to use LangChain Python components, Ollama, and SurrealDB.

Flow overview

  1. Ingest data (categorised health symptoms and common treatments)
  2. Ask the user about their symptoms
  3. Find relevant documents in the DB by similarity
  4. Execute vector search in DB
  5. Invoke chain to find related common treatments
  6. Chain asks the LLM to generate graph query
  7. Chain executes the query
  8. Query graph
  9. Chain asks the LLM to summarize the results and generate an answer
  10. Respond to the user

Communication diagram for the GraphRAG solution and the flow listed above

First step: ingest the data

For this example we have a YAML file with categorised symptoms and their common treatments.

We want to store this in a vector store so we can query it using vector similarity search.

We also want to represent the data relations in a graph store, so we can run graph queries to retrieve those relationships (e.g. treatments related to symptoms).

    - category: General Symptoms
      symptoms:
        - name: Fever
          description: Elevated body temperature, usually above 100.4°F (38°C).
          medical_practice: General Practice, Internal Medicine, Pediatrics
          possible_treatments:
            - Antipyretics (e.g., ibuprofen, acetaminophen)
            - Rest
            - Hydration
            - Treating the underlying cause
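To make the graph idea concrete before involving a database, here is a minimal in-memory sketch of the Treatment -> Treats -> Symptom relation. The triples and the `treatments_for` helper are hypothetical and only illustrate the kind of lookup a graph query performs; they are not how SurrealDB stores or traverses edges.

```python
from collections import defaultdict

# (source, relation, target) triples; toy data for illustration only.
edges = [
    ("Antipyretics", "Treats", "Fever"),
    ("Rest", "Treats", "Fever"),
    ("Hydration", "Treats", "Fever"),
    ("Hydration", "Treats", "Dizziness/Vertigo"),
]

# Index edges by (target, relation) so incoming edges can be followed quickly.
incoming = defaultdict(list)
for source, relation, target in edges:
    incoming[(target, relation)].append(source)


def treatments_for(symptoms):
    """Follow incoming "Treats" edges for each symptom, without duplicates."""
    found = []
    for symptom in symptoms:
        for treatment in incoming.get((symptom, "Treats"), []):
            if treatment not in found:
                found.append(treatment)
    return found


result = treatments_for(["Fever", "Dizziness/Vertigo"])
```

Note that "Hydration" is connected to both symptoms but appears once in the result, which is the kind of shared relationship a plain vector search does not surface.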

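Let’s instantiate the following LangChain Python components: a vector store (SurrealDBVectorStore), a graph store (SurrealDBGraph), and an embeddings model (OllamaEmbeddings)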

…and create a SurrealDB connection:

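    # Import paths assume the surrealdb, langchain-ollama and
    # langchain-surrealdb packages; check the repository README if they differ
    from langchain_ollama import OllamaEmbeddings
    from langchain_surrealdb.experimental.surrealdb_graph import SurrealDBGraph
    from langchain_surrealdb.vectorstores import SurrealDBVectorStore
    from surrealdb import Surreal

    # DB connection
    conn = Surreal(url)
    conn.signin({"username": user, "password": password})
    conn.use(ns, db)

    # Vector Store
    vector_store = SurrealDBVectorStore(
        OllamaEmbeddings(model="llama3.2"), conn
    )

    # Graph Store
    graph_store = SurrealDBGraph(conn)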

Note that the SurrealDBVectorStore is instantiated with OllamaEmbeddings. This embeddings model is used to generate the embedding vector of each document when it is inserted.

Populating the vector store

With the vector store instantiated, we are now ready to populate it.

    from dataclasses import asdict

    import yaml
    from langchain_core.documents import Document

    # Symptoms is a dataclass defined in the example repository
    parsed_symptoms = []
    symptom_descriptions = []

    # Parsing the YAML into a Symptoms dataclass
    with open("./symptoms.yaml", "r") as f:
        symptoms = yaml.safe_load(f)
        assert isinstance(symptoms, list), "failed to load symptoms"
        for category in symptoms:
            parsed_category = Symptoms(category["category"], category["symptoms"])
            for symptom in parsed_category.symptoms:
                parsed_symptoms.append(symptom)
                symptom_descriptions.append(
                    Document(
                        page_content=symptom.description.strip(),
                        metadata=asdict(symptom),
                    )
                )

    # This calculates the embeddings and inserts the documents into the DB
    vector_store.add_documents(symptom_descriptions)
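The Symptoms dataclass used above is not shown in this post. A minimal sketch of what it might look like follows; the field names are taken from the YAML sample, but the `Symptom` class and the `__post_init__` conversion are assumptions, and the actual definitions in the repository may differ.

```python
from dataclasses import asdict, dataclass


@dataclass
class Symptom:
    name: str
    description: str
    medical_practice: str
    possible_treatments: list[str]


@dataclass
class Symptoms:
    category: str
    symptoms: list

    def __post_init__(self):
        # Convert the raw dicts produced by the YAML loader into Symptom objects.
        self.symptoms = [
            s if isinstance(s, Symptom) else Symptom(**s) for s in self.symptoms
        ]


# A Python literal mirroring one category of the YAML file.
raw_category = {
    "category": "General Symptoms",
    "symptoms": [
        {
            "name": "Fever",
            "description": "Elevated body temperature, usually above 100.4°F (38°C).",
            "medical_practice": "General Practice, Internal Medicine, Pediatrics",
            "possible_treatments": [
                "Antipyretics (e.g., ibuprofen, acetaminophen)",
                "Rest",
                "Hydration",
                "Treating the underlying cause",
            ],
        }
    ],
}

parsed = Symptoms(raw_category["category"], raw_category["symptoms"])
fever = parsed.symptoms[0]
metadata = asdict(fever)  # plain dict, suitable as Document metadata
```

Using `asdict` keeps the full symptom record alongside each embedded description, so a similarity hit can be mapped back to a symptom name.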

Stitching the graph together

    from langchain_community.graphs.graph_document import GraphDocument, Node, Relationship

    graph_documents = []

    # Find nodes and edges (Treatment -> Treats -> Symptom)
    for idx, category_doc in enumerate(symptom_descriptions):
        # Nodes
        treatment_nodes = {}
        symptom = parsed_symptoms[idx]
        symptom_node = Node(id=symptom.name, type="Symptom", properties=asdict(symptom))
        for x in symptom.possible_treatments:
            treatment_nodes[x] = Node(id=x, type="Treatment", properties={"name": x})
        nodes = list(treatment_nodes.values())
        nodes.append(symptom_node)

        # Edges
        relationships = [
            Relationship(source=treatment_nodes[x], target=symptom_node, type="Treats")
            for x in symptom.possible_treatments
        ]
        graph_documents.append(
            GraphDocument(nodes=nodes, relationships=relationships, source=category_doc)
        )

    # Store the graph
    graph_store.add_graph_documents(graph_documents, include_source=True)
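The snippet above only shows the Treats edges, yet the first generated query later in this post traverses an Attends relation from Practice nodes. One plausible way to derive those edges is to split the comma-separated `medical_practice` field. The helper below is a hypothetical sketch using plain tuples in place of Node and Relationship objects; the repository may build these edges differently.

```python
def practice_edges(symptom_name, medical_practice):
    """Return (practice, "Attends", symptom) triples for one symptom."""
    practices = [p.strip() for p in medical_practice.split(",") if p.strip()]
    return [(practice, "Attends", symptom_name) for practice in practices]


attends = practice_edges("Fever", "General Practice, Internal Medicine, Pediatrics")
```

Each triple would map to one Relationship with a Practice node as source and the Symptom node as target, mirroring the Treats edges.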

Data ready, let’s chat

LangChain provides different chat models. We are going to use ChatOllama with llama3.2 to generate a graph query and to explain the result in natural language.

    from langchain_ollama import ChatOllama

    chat_model = ChatOllama(model="llama3.2", temperature=0)

To generate the graph query based on the user’s prompt, we need to instantiate a QA (question answering) chain component. In this case we are using SurrealDBGraphQAChain.

But before querying the graph, we need to find the symptoms in our vector store by doing a similarity search based on the user’s prompt.

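    import click
    # Import path assumed from the langchain-surrealdb package
    from langchain_surrealdb.experimental.graph_qa.chain import SurrealDBGraphQAChain

    # vector_search, get_document_names and ask are helpers from the example repository
    query = click.prompt(
        click.style("\nWhat are your symptoms?", fg="green"), type=str
    )

    # -- Find relevant docs
    docs = vector_search(query, vector_store, k=3)
    symptoms = get_document_names(docs)

    # -- Query the graph
    chain = SurrealDBGraphQAChain.from_llm(
        chat_model,
        graph=graph_store,
        verbose=verbose,
        query_logger=query_logger,
    )
    ask(f"what medical practices can help with {symptoms}", chain)
    ask(f"what treatments can help with {symptoms}", chain)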
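The `get_document_names` helper is not shown in this post. Judging by the questions the script prints, it likely joins the symptom names stored in each document's metadata into one comma-separated string. The sketch below is a guess at that behaviour, with a small `Doc` dataclass standing in for the LangChain Document class so it runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document (page_content plus metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def get_document_names(docs):
    """Hypothetical helper: join the symptom names found in the metadata."""
    return ", ".join(doc.metadata["name"] for doc in docs)


docs = [
    Doc("Stuffy nose due to inflamed nasal passages.", {"name": "Nasal Congestion/Runny Nose"}),
    Doc("Feeling lightheaded or unsteady.", {"name": "Dizziness/Vertigo"}),
    Doc("Pain or scratchiness in the throat.", {"name": "Sore Throat"}),
]

symptoms = get_document_names(docs)
question = f"what treatments can help with {symptoms}"
```

Passing exact symptom names (rather than the user's free text) gives the LLM values it can place directly in the graph query's filter.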

Running

Clone the repository and follow the instructions in the README of the graph example.

Running the program will look like this:

What are your symptoms?: i have a runny nose and itchy eyes

The script runs both a max marginal relevance search and a plain similarity search against the vector store and prints the results side by side, which helps you choose the right one for your specific use case.

    max_marginal_relevance_search:
    - Stuffy nose due to inflamed nasal passages or a dripping nose with mucus discharge.
    - An uncomfortable sensation that makes you want to scratch, often without visible skin changes.
    - Feeling lightheaded, unsteady, or experiencing a sensation that the room is spinning.

    similarity_search_with_score
    - [40%] Stuffy nose due to inflamed nasal passages or a dripping nose with mucus discharge.
    - [33%] Feeling lightheaded, unsteady, or experiencing a sensation that the room is spinning.
    - [32%] Pain, irritation, or scratchiness in the throat, often made worse by swallowing.
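The two searches can return different documents because max marginal relevance (MMR) penalises candidates that are too similar to ones already selected. The sketch below is a plain-Python version of the usual greedy MMR formulation on toy two-dimensional vectors; the vectors and the `lambda_mult` value are assumptions for illustration, not the settings used by the script.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def similarity_rank(query_vec, candidates, k=2):
    """Indices of the k candidates closest to the query."""
    order = sorted(range(len(candidates)),
                   key=lambda i: cosine(query_vec, candidates[i]), reverse=True)
    return order[:k]


def mmr_rank(query_vec, candidates, k=2, lambda_mult=0.5):
    """Greedy MMR: trade relevance to the query against redundancy with picks so far."""
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, candidates[i])
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query_vec = [1.0, 0.0]
candidates = [
    [0.9, 0.1],    # 0: very relevant
    [0.9, 0.12],   # 1: near-duplicate of 0
    [0.6, -0.6],   # 2: less relevant, but different
]

by_similarity = similarity_rank(query_vec, candidates)
by_mmr = mmr_rank(query_vec, candidates)
```

Plain similarity keeps both near-duplicates, while MMR swaps the second one for the more distinct candidate, which is the same effect visible in the two result lists above.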

Then, the QA chain will generate and run a graph query behind the scenes, and generate the responses.

This script is asking our AI two questions based on the user’s symptoms:

  • Question: what medical practices can help with Nasal Congestion/Runny Nose, Dizziness/Vertigo, Sore Throat
  • Question: what treatments can help with Nasal Congestion/Runny Nose, Dizziness/Vertigo, Sore Throat

For the first question the QA chain component generated this graph query:

SELECT <-relation_Attends<-graph_Practice AS practice FROM graph_Symptom WHERE name IN ["Nasal Congestion/Runny Nose", "Dizziness/Vertigo", "Sore Throat"];

The result of this query (a Python list of dictionaries containing the medical practice names) is fed to the LLM to generate a human-readable answer:

Here is a summary of the medical practices that can help with Nasal Congestion/Runny Nose, Dizziness/Vertigo, and Sore Throat: Several medical practices may be beneficial for individuals experiencing symptoms such as Nasal Congestion/Runny Nose, Dizziness/Vertigo, and Sore Throat. These include Neurology, ENT (Otolaryngology), General Practice, and Allergy & Immunology. Neurology specialists can provide guidance on managing conditions that affect the nervous system, which may be related to dizziness or vertigo. ENT (Otolaryngology) specialists focus on ear, nose, and throat issues, making them a good fit for addressing nasal congestion and runny nose symptoms. General Practice physicians offer comprehensive care for various health concerns, including those affecting the respiratory system. Allergy & Immunology specialists can help diagnose and treat allergies that may contribute to Nasal Congestion/Runny Nose, as well as provide immunological support for overall health.

The query for the second question (what treatments can help with Nasal Congestion/Runny Nose, Dizziness/Vertigo, Sore Throat) looks like this:

SELECT <-relation_Treats<-graph_Treatment as treatment FROM graph_Symptom WHERE name IN ["Nasal Congestion/Runny Nose", "Dizziness/Vertigo", "Sore Throat"]

The LLM will then produce the following output:

Here is a summary of the treatments that can help with Nasal Congestion/Runny Nose, Dizziness/Vertigo, and Sore Throat:

The following treatments have been found to be effective in alleviating symptoms:

- Vestibular rehabilitation
- Hydration
- Medications to reduce nausea or dizziness
- Antihistamines (for allergies)
- Decongestants (oral or nasal sprays)
- Saline nasal rinses
- Humidifiers
- Throat lozenges/sprays
- Treating underlying cause (e.g., cold, allergies)
- Pain relievers (e.g., acetaminophen, ibuprofen)
- Warm salt water gargles

Ready to build?

Find all the code in the repository examples.

Get started for free with Surreal Cloud.

Any questions or thoughts about this or semantic search using SurrealDB? Feel free to drop by our community to get in touch.
