Retrieval-Augmented Generation (RAG) is an AI technique that enhances large language models (LLMs) by letting them retrieve relevant information from a knowledge base before generating a response. It uses vector similarity search to find the most relevant document chunks, which are then provided as additional context to the LLM, enabling it to produce more accurate, grounded responses.
Graph RAG is an advanced form of this technique: it leverages the structured, interconnected nature of knowledge graphs to give the LLM a richer, more contextualised understanding of the information, leading to more accurate, coherent, and less “hallucinated” responses.
For this example we are going to use LangChain Python components, Ollama, and SurrealDB.
Our example data is a YAML file with categorised symptoms and their common treatments.
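The exact shape of the file doesn’t matter much; think of something along these lines (the field names below are illustrative, not the repository’s actual schema):

```yaml
# Illustrative structure only — the real file in the example repo may differ
- category: Respiratory
  symptoms:
    - name: Nasal Congestion/Runny Nose
      treatments:
        - Antihistamines
        - Saline nasal spray
    - name: Sore Throat
      treatments:
        - Warm salt-water gargles
        - Throat lozenges
```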
We want to store this in a vector store so we can query it using vector similarity search.
We also want to represent the data relations in a graph store, so we can run graph queries to retrieve those relationships (e.g. treatments related to symptoms).
Let’s instantiate the following LangChain Python components:
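A minimal sketch, assuming the `langchain-ollama` and `langchain-surrealdb` packages (exact import paths may vary between versions):

```python
from langchain_ollama import OllamaEmbeddings
from langchain_surrealdb.vectorstores import SurrealDBVectorStore
from langchain_surrealdb.experimental.surrealdb_graph import SurrealDBGraph

# Embedding model used to vectorise documents when they are inserted
embeddings = OllamaEmbeddings(model="llama3.2")
```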
…and create a SurrealDB connection:
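For example, with the official surrealdb Python SDK (the URL, credentials, namespace, and database below are placeholders, and the store constructors are assumed to take the embedding model and connection):

```python
from surrealdb import Surreal

conn = Surreal("ws://localhost:8000/rpc")
conn.signin({"username": "root", "password": "root"})
conn.use("langchain", "graphrag")  # namespace, database

# Both stores share the same connection
vector_store = SurrealDBVectorStore(embeddings, conn)
graph_store = SurrealDBGraph(conn)
```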
Note that the `SurrealDBVectorStore` is instantiated with `OllamaEmbeddings`. This embedding model is used to generate the embedding vector for each document as it is inserted.
With the vector store instantiated, we are now ready to populate it.
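Population could look roughly like this, following the hypothetical YAML structure above. The graph types come from `langchain_community`, and `add_graph_documents` is assumed to follow LangChain’s standard graph-store interface:

```python
import yaml

from langchain_core.documents import Document
from langchain_community.graphs.graph_document import (
    GraphDocument,
    Node,
    Relationship,
)

with open("symptoms.yaml") as f:
    categories = yaml.safe_load(f)

docs, graph_docs = [], []
for category in categories:
    for symptom in category["symptoms"]:
        # One vector-store document per symptom; its embedding is
        # generated by OllamaEmbeddings at insert time
        doc = Document(
            page_content=symptom["name"],
            metadata={"category": category["category"]},
        )
        docs.append(doc)

        # Graph side: treatment -[Treats]-> symptom relationships
        symptom_node = Node(id=symptom["name"], type="Symptom")
        relationships = [
            Relationship(
                source=Node(id=treatment, type="Treatment"),
                target=symptom_node,
                type="Treats",
            )
            for treatment in symptom["treatments"]
        ]
        nodes = [symptom_node] + [r.source for r in relationships]
        graph_docs.append(
            GraphDocument(nodes=nodes, relationships=relationships, source=doc)
        )

vector_store.add_documents(docs)
graph_store.add_graph_documents(graph_docs)  # method name assumed
```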
LangChain provides different chat models. We are going to use `ChatOllama` with `llama3.2` to generate a graph query and to explain the result in natural language.
```python
from langchain_ollama import ChatOllama

# temperature=0 keeps the generated graph queries deterministic
chat_model = ChatOllama(model="llama3.2", temperature=0)
```
To generate the graph query based on the user’s prompt, we need to instantiate a QA (Question Answering) chain component. In this case we are using `SurrealDBGraphQAChain`.
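A sketch of that step; the import path is an assumption, but `from_llm` is the conventional LangChain constructor for QA chains:

```python
# Import path assumed — check the langchain-surrealdb package for the real one
from langchain_surrealdb.experimental.graph_qa.chain import SurrealDBGraphQAChain

chain = SurrealDBGraphQAChain.from_llm(
    chat_model,
    graph=graph_store,
    verbose=True,
)
```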
But before querying the graph, we need to find the symptoms in our vector store by doing a similarity search based on the user’s prompt.
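With LangChain’s standard vector-store API, that lookup is a single call (the value of `k` here is arbitrary):

```python
user_input = "i have a runny nose and itchy eyes"

# Map the free-text prompt onto known symptom documents
results = vector_store.similarity_search(user_input, k=3)
symptoms = [doc.page_content for doc in results]
```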
Clone the repository and follow the instructions in the README of the graph example.
Running the program will look like this:
```
What are your symptoms?: i have a runny nose and itchy eyes
```
The script runs both a maximal marginal relevance (MMR) search and a plain similarity search against the vector store to compare the results, which helps you pick the right search type for your specific use case.
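Both search types are part of LangChain’s vector-store interface:

```python
# MMR re-ranks results to balance relevance against diversity,
# whereas similarity_search returns the k nearest chunks outright
diverse = vector_store.max_marginal_relevance_search(user_input, k=3)
for doc in diverse:
    print(doc.page_content)
```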
Then, the QA chain will generate and run a graph query behind the scenes, and generate the responses.
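Invoking the chain presumably follows the standard LangChain interface; the `"query"` input and `"result"` output keys below are assumptions borrowed from LangChain’s other graph QA chains:

```python
question = f"Which medical practices can help with {', '.join(symptoms)}?"
response = chain.invoke({"query": question})  # input key assumed
print(response["result"])  # output key assumed
```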
This script is asking our AI two questions based on the user’s symptoms:

1. Which medical practices can help with these symptoms?
2. What treatments can help with these symptoms?
For the first question the QA chain component generated this graph query:
The result of this query, a Python list of dictionaries containing the medical practice names, is fed to the LLM to generate a nice human-readable answer:
The query for the second question (What treatments can help with Nasal Congestion/Runny Nose, Dizziness/Vertigo, and Sore Throat?) looks like this:
The LLM will then produce the following output:
Find all the code in the repository examples.
Get started for free with Surreal Cloud.
Any questions or thoughts about this or semantic search using SurrealDB? Feel free to drop by our community to get in touch.