Time elapsed: 2w1d
Your frustration from the day before is gone, and now you are feeling inspired. And it’s not just because you are starting to understand how databases work. It’s because you are starting to understand why.
As always, you spent your morning with a book in one hand and a cup of tea in the other. Today’s book was about the invention of writing in two very old countries called Egypt and Sumer.
You were surprised to find out that inspiration had nothing to do with the invention of writing! Instead, writing was invented for some pretty boring reasons.
At the time, people had stopped hunting for a living and now lived on farms and in villages. Sometimes they produced extra food and could trade it, or wanted to settle an argument on who owned what. Governments settled ownership arguments between people and in exchange they demanded a tax. They then needed to plan how to use these taxes for their upcoming projects.
Very boring stuff! But to do all that, humans needed writing, and so they invented it! That’s all there was to it.
So maybe it’s the basic writing down and tracking of information that matters most? Maybe data IS what drives society! And if that is the case, then perhaps you don’t have to worry about exactly how to restore civilization. All the inspiring philosophy and literature will just…take care of itself. Maybe just making the information available will do the job!
With renewed enthusiasm, you close the book and get back to your database studies.
It’s now time to learn how to create bi-directional relations in SurrealDB by using the RELATE
keyword. Understanding this keyword is so important that the whole chapter will be dedicated to it!
Before we do, here is a quick summary of where we were at the end of the last chapter. At this point we have a schema that looks like this:
And inside the database are two records. You should see them in the output of the following query:
SELECT * FROM person FETCH properties;
Response[ { class: 'Count', id: person:the_nobleman, money: 50, name: 'The Nobleman', properties: [ { id: building:old_castle, kind: 'castle', name: 'Old castle' } ] } ]
While the embedded code samples in this and the next few chapters include the dataset from the previous chapter, you might also want to take a look at it yourself. You can find it here, and then copy and paste the statements inside to bring yourself up to date and ready for the queries to follow.
So let’s get back to the last sentence we saw, about the Nobleman after he lost all of his money.
All he had left was an old castle, his wife, and three daughters: Wulfield, Adelaide, and Bertha.
That means that we still have to create person
records for his wife and daughters. The Tale of the Enchanted Knights never indicates the name of either the nobleman or his wife, so we’ll choose The Noblewoman for her name in the same way that we chose the Nobleman’s name. We know the names of the others. Note that women could inherit property in the medieval ages too, so the Nobleman’s wife will also be linked to the old castle via the properties
field.
CREATE person:the_noblewoman SET name = "The Noblewoman", properties = [building:old_castle]; CREATE person:wulfield SET name = "Wulfield"; CREATE person:adelaide SET name = "Adelaide"; CREATE person:bertha SET name = "Bertha";
It’s almost time to join them with the RELATE
keyword, but first let’s take a look at the record link method we used in the last chapter and why we probably don’t want to use it.
If we wanted to use it to link the Nobleman to Wulfield, we would do it like this with a field called children
:
UPDATE person:the_nobleman SET children = [person:wulfield];
However, nothing happens! That’s because the schema doesn’t have a field called children
.
Response[ { class: 'Count', id: person:the_nobleman, money: 50, name: 'The Nobleman', properties: [ building:old_castle ] } ]
The important point for record links is this: they are fields on a record. And our strictly defined schema doesn’t have a field called children
, so it didn’t show up. And if we add a field called children
to the person
field then we should also add parents
. We would have to update parents to add their children, and then update the children to add their parents. It might work, but there is an easier way: using RELATE
.
Let’s give RELATE
a try and see what happens. The syntax for RELATE
looks like this:
RELATE some:record->relation_name->some:other_record;
Note that it doesn’t use CREATE
anywhere; it’s a different keyword. The relation_name
is usually a verb or a similar word that makes queries readable. Some good examples are “wrote”, or “created”, or “likes”, or “parent_of” in our case. That means that our RELATE
statement should look like this.
RELATE person:the_nobleman->parent_of->person:wulfield;
And now let’s see what the Nobleman’s record looks like once the relation is added.
RELATE person:the_nobleman->parent_of->person:wulfield; SELECT * FROM person:the_nobleman;
Response[ { class: 'Count', id: person:the_nobleman, money: 50, name: 'The Nobleman', properties: [ building:old_castle ] } ]
It’s exactly the same! But our RELATE
statement actually did work, because a RELATE
statement doesn’t modify records. Instead, it created its own record from the parent_of
table that links the two, often called a “graph table” or a “graph edge”. We can prove that it exists by typing INFO FOR DB
, which contains a statement showing that there is indeed a separate table with this name.
INFO FOR DB;
parent_of: 'DEFINE TABLE parent_of TYPE ANY SCHEMALESS PERMISSIONS NONE',
Let’s see what’s inside!
SELECT * FROM parent_of;
Response[ { id: parent_of:wzcwzj1f64clq5hgkctp, in: person:the_nobleman, out: person:wulfield } ]
As the output shows, a graph table even has its own ID.
You can specify the ID for a graph table too if you want:
RELATE person:the_nobleman->parent_of:first_relation->person:wulfield;
This is quite rare as usually graph tables are queried from other records, but the possibility exists!
The most interesting part about these tables, however, is that they all have an in
and an out
field. An in
is the record that is linking to, while out
is the record that is being linked to. This is why the syntax uses arrows. In our query, one arrow shows the_nobleman
going in
(->
) to parent_of
, and then the relation moves out
of parent_of
into person_wulfield
.
person:the_nobleman->parent_of->person:wulfield
Thinking in terms of active tense (created, saw) and passive tense (created_by, seen) is a good way to visualize the difference between in
and out
. So if you were to relate two records using a table called “wrote”, in
would be the “writer” and out
would be the “thing written”.
Relational queries get more interesting when you have more relations to work with. We can make ours more interesting by adding the relations between the Noblewoman and remaining daughters.
-- Wulfield has already been related to the Nobleman, so first RELATE her to her mom... RELATE person:the_noblewoman->parent_of->person:wulfield; -- And then RELATE for each of the parents for each of the other two daughters FOR $daughter IN [person:adelaide, person:bertha] { RELATE person:the_nobleman->parent_of->$daughter; RELATE person:the_noblewoman->parent_of->$daughter; }
With these relations established, let’s use a quick query to confirm that they are what we expect them to be. We’ll combine some skills we learned in previous chapters to make the output readable:
VALUE
to return the value of the name
properties,<string>
to cast them into a string,+
to concatenate them so the output is readable.
SELECT value "Parent: " + <string>in.name + ". Daughter: " + <string>out.name FROM parent_of;
The output is indeed quite readable!
[ "Parent: The Nobleman. Daughter: Wulfield", "Parent: The Nobleman. Daughter: Bertha", "Parent: The Nobleman. Daughter: Wulfield", "Parent: The Noblewoman. Daughter: Adelaide", "Parent: The Noblewoman. Daughter: Bertha", "Parent: The Noblewoman. Daughter: Wulfield" ]
It looks like querying on a graph table is easy enough. But what if we want to query on a person
and see what they are a parent_of
? This parent_of
is an entirely separate table, so the query below won’t give us the results we want:
SELECT parent_of FROM person;
Response[ { parent_of: NONE }, { parent_of: NONE } -- ...and so on for the other three person records ]
This is where we see the arrow syntax again, which is used not just to create relations but also to query them.
->
and <-
to traverse graph edgesAs we saw just now, SELECT parent_of FROM person
won’t return any results. But let’s see what happens when we add a ->
or a <-
to the front of the name of the graph edge. Let’s first try adding ->
to the query.
SELECT ->parent_of FROM person;
Response[ { "->parent_of": [] }, { "->parent_of": [] }, { "->parent_of": [ parent_of:jgbpq67artlnrfnbbyd0, parent_of:m2ral1i75bhukmctfgfi, parent_of:wzcwzj1f64clq5hgkctp ] }, { "->parent_of": [ parent_of:hw8g585r7ord9unfehfp, parent_of:rfgj2tw2c0w6gsogecda, parent_of:se05znnppigsztobyrjy ] }, { "->parent_of": [] } ]
Ah ha! We have five results, of which two show up. We know that there are two parents, each of which is linked to three daughters, so the records with data must be the parents. Note that the results are of type parent_of
, not of person
.
Changing ->
to <-
should give us the opposite result.
SELECT <-parent_of FROM person;
Response[ { "<-parent_of": [ parent_of:m2ral1i75bhukmctfgfi, parent_of:rfgj2tw2c0w6gsogecda ] }, { "<-parent_of": [ parent_of:hw8g585r7ord9unfehfp, parent_of:jgbpq67artlnrfnbbyd0 ] }, { "<-parent_of": [] }, { "<-parent_of": [] }, { "<-parent_of": [ parent_of:se05znnppigsztobyrjy, parent_of:wzcwzj1f64clq5hgkctp ] } ]
Indeed it does! Switching ->
to <-
now gives us the out
edges to the daughters, namely the graph tables of those who are “parented by” someone.
So what if we want to reach the actual person
records that are linked in this way? Here we can add the table name we want to look for, which in this case is person
- because our RELATE
statement joins a person
to a person
. (We could have RELATE
d a person
to a building
or anything else.)
We know that the person
type has a name
field, so we can also add .name
to return their names.
SELECT ->parent_of->person.name FROM person;
Response[ { "->parent_of": { "->person": { name: [] } } }, { "->parent_of": { "->person": { name: [] } } }, { "->parent_of": { "->person": { name: [ 'Bertha', 'Adelaide', 'Wulfield' ] } } }, { "->parent_of": { "->person": { name: [ 'Bertha', 'Adelaide', 'Wulfield' ] } } }, { "->parent_of": { "->person": { name: [] } } } ]
That output was good, but a little verbose. Let’s use the VALUE
keyword to only return the values (the names).
SELECT VALUE ->parent_of->person.name FROM person;
Response[ [], [], [], [ "Wulfield", "Bertha", "Adelaide" ], [ "Adelaide", "Wulfield", "Bertha" ] ]
That’s better! This cleaner output will help us practice the difference between ->
and <-
. Because with two places to use these arrows, and two arrows to choose from, that gives us four total ways we could pick a direction to see different results. Let’s give each of them a try!
Here are two important tips before we start practicing:
Remember that ->
isn’t shorthand for in
, and <-
isn’t shorthand for out
. In fact, it’s easier than that: the direction of the arrow alone in relation to the record or graph edge determines this. An arrow pointing towards something is in
, and an arrow pointing out is out
.
FROM person
at the end actually means that this is the starting point of the query. This is actually the case for all SELECT
queries.
Take this query for example:
SELECT name, age FROM person;
As humans, we like to think of the query in this way:
“Give me the name
and age
of all person
records”.
But from the point of view of the database it is the other way around:
“From all person
records, return name
and age
”.
So reading the FROM
part of a query first can help understand this behaviour when a query gets complicated.
Since our arrows can face either ->
or <-
, that gives us four possible directions when querying on parent_of
:
->parent_of->
<-parent_of->
->parent_of<-
<-parent_of<-
Let’s take a look at each of these directions one step at a time to make sure that we understand them.
FROM person
: start from person
. Where are we going?->parent_of
: The arrow points to parent_of
, to the records connected via in
: the parents.parent_of->person
: Goes from parent_of
to person
via the out
field,.name
: and accesses their names.Who is located at out
? The daughters. Result: the names of the daughters through the parents.
SELECT VALUE ->parent_of->person.name FROM person;
Response[ [], [], [], [ "Wulfield", "Bertha", "Adelaide" ], [ "Adelaide", "Wulfield", "Bertha" ] ]
SELECT VALUE ->parent_of<-person.name FROM person;
FROM person
: start from person
. Where are we going?->parent_of
: The arrow points to parent_of
. That’s the records connected via in
: the parents.parent_of<-person
: Goes from parent_of
to person
via the in
field,.name
: and accesses their names.Who is located at in
? The parents. Result: the names of the parents through the parents.
SELECT VALUE ->parent_of<-person.name FROM person;
Response[ [], [], [], [ "The Noblewoman", "The Noblewoman", "The Noblewoman" ], [ "The Nobleman", "The Nobleman", "The Nobleman" ] ]
SELECT VALUE <-parent_of->person.name FROM person;
FROM person
: start from person
. Where are we going?<-parent_of
: The arrow points away from parent_of
. That’s the records connected via out
: the daughters.parent_of->person
: Goes from parent_of
to person
via the out
field,.name
: and accesses their names.Who is located at out
? The daughters. Result: the names of the daughters through the daughters.
SELECT VALUE <-parent_of->person.name FROM person;
Response[ [ "Bertha", "Bertha" ], [ "Wulfield", "Wulfield" ], [ "Adelaide", "Adelaide" ], [], [] ]
SELECT VALUE <-parent_of<-person.name FROM person;
FROM person
: start from person
. Where are we going?<-parent_of
: The arrow points away from parent_of
. That’s the records connected via out
: the daughters.parent_of<-person
: Goes from parent_of
to person
via the in
field,.name
: and accesses their names.Who is located at in
? The parents. Result: the names of the parents through the daughters.
SELECT VALUE <-parent_of<-person.name FROM person;
Response[ [ "The Noblewoman", "The Nobleman" ], [ "The Noblewoman", "The Nobleman" ], [ "The Nobleman", "The Noblewoman" ], [], [] ]
As the four examples demonstrate, these queries are generally used when you have arrows moving in the same direction (->parent_of->
, or <-parent_of<-
) because the arrow will represent an in
on one side and an out
on the other, or vice versa.
A long example is always best followed with a second example to let the concept sink in, so let’s look at one more to make sure that we understand.
Our first example involved the graph edges between two person
records. This next example is simpler, but uses different record types instead of just one.
We will create a user
named Aeon
who has two cats that must be fed. We will relate them via a graph edge called feeds
. Using RELATE
is starting to get easier for us by now:
CREATE user:aeon SET name = "Aeon"; CREATE cat:one SET name = "Mr. Meow"; CREATE cat:two SET name = "Mrs. Meow"; RELATE user:aeon->feeds->[cat:one, cat:two];
Interestingly, having two record types (user
related to cat
) instead of one (person
related to person
) makes the queries a bit easier to understand. That’s because the FROM user
(or FROM cat
) part at the end shows which table we are starting from.
As you can see, the feeds
graph edge always has user
records for in
and always cat
records for out
.
SELECT in, out FROM feeds;
Response[ { in: user:aeon, out: cat:two }, { in: user:aeon, out: cat:one } ]
We’ll start from a simple SELECT
and build up from there.
Who are the users?
SELECT name FROM user;
Response[ { name: 'Aeon' } ]
Looks like it’s just Aeon. Now what are the names of the cats that Aeon is feeding?
SELECT name, ->feeds->cat.name FROM user;
Response[ { "->feeds": { "->cat": { name: [ "Mrs. Meow", "Mr. Meow" ] } }, name: "Aeon" } ]
Now let’s look at it from the other angle.
Who are the cats?
SELECT name FROM cat;
Response[ { name: "Mr. Meow" }, { name: "Mrs. Meow" } ]
Okay, it’s Mr. and Mrs. Meow. What are the names of the users that they are being fed by?
SELECT name, <-feeds<-user.name FROM cat;
Response[ { "<-feeds": { "<-user": { name: [ 'Aeon' ] } }, name: 'Mr. Meow' }, { "<-feeds": { "<-user": { name: [ 'Aeon' ] } }, name: 'Mrs. Meow' } ]
This shows again that queries with arrows moving in the same direction next to a graph name are generally the ones you want. In our case, ->feeds->
and <-feeds<-
are most useful.
It is much rarer that you will want to use the directions <-graph_name->
or ->graph_name<-
, because the first accesses both out
fields and the second accesses both in
fields. But there is still some use to these two possible directions. Let’s give them a try as we think about what exactly a query of that form would mean.
This first query starts at the out
field of feeds
and moves on to the same out
field, both of which are cats.
SELECT <-feeds->cat FROM cat;
Mr. Meow and Mrs. Meow are each fed once[ { "<-feeds": { "->cat": [ cat:one ] } }, { "<-feeds": { "->cat": [ cat:two ] } } ]
So that gave us the names of cats that are each fed by…something (not sure what).
This next query starts at the in
field of feeds
and moves on to the same in
field, both of which are users.
SELECT ->feeds<-user.name FROM user;
Aeon is feeding something twice[ { "->feeds": { "<-user": [ user:aeon, user:aeon ] } } ]
So that gave us the name of a user that feeds…something (not sure what).
In other words, a <-relation_name->
can show how passively involved a record is in some relation, while ->relation_name<-
can show how actively involved a record is in some relation.
The destructuring syntax that we first saw way back in Chapter 1 is particularly helpful when making a graph query, since there can be a lot of typing involved before you reach the end of a query. This syntax was added in SurrealDB 2.0. Before version 2.0, you used to have to use the arrow syntax each time you wanted to access a field at the end of a graph query:
SELECT name, ->feeds->cat.name, ->feeds->cat.id FROM user;
But now this only takes a single line!
SELECT name, ->feeds->cat.{ id, name } FROM user;
Response[ { "->feeds": { "->cat": [ { id: cat:two, name: 'Mrs. Meow' }, { id: cat:one, name: 'Mr. Meow' } ] }, name: 'Aeon' } ]
There is a lot more you can do in SurrealDB with graph edges, because graph queries can continue as far as your imagination takes you. For example, what if you wanted to see the cats fed by a person…who is the parent of another person? Or in more human language: “Who are the cats fed by so-and-so’s parents?”
A graph traversal can do all of that too! But we have learned enough for this chapter and will save those fancy queries for the next one. See you in the next chapter!
Hold on a second, can you hear that sound?
Aeon’s day isn’t over yet. Let’s tune in to see what’s going on.
You hear the sound of horses’ hooves in the distance, and make your way to the tunnel entrance. It sounds like a dozen or more of them. Why so many?
All you can see at first is a cloud of dust, but soon the riders reach the top of the nearest hill and are close enough to see. At their head is Landevin, one of the young members of the town council and famous for being a brilliant speaker…and for his fancy name. Who gives their child a name like Landevin, anyway?
Landevin stops his horse and looks down towards you.
“Aeon, we’ve come from the town to have a serious talk with you about your project.”
This will depend on the context and how you want to structure your data, but it’s possible that simple record links could be the solution. It also feels right to give record links a bit of attention once again after a full chapter on nothing but graph relations.
You could create items that are accessed via a field on a character
table:
CREATE character:rondo SET name = "Rondo the Magnificent"; CREATE item:1 SET type = "weapon", damage = 10; CREATE item:2 SET type = "book", content = "Hail traveller and welcome to the..."; UPDATE character:rondo SET items += item:1; UPDATE character:rondo SET items += item:2; // Then access the items using FETCH or .*.* SELECT * FROM character:rondo FETCH items; SELECT *, items.*.* FROM character:rondo;
Or you could also assign a single owner to each item and query them inside a subquery.
CREATE character:rondo SET name = "Rondo the Magnificent"; CREATE item:1 SET type = "weapon", damage = 10, owner = character:rondo; CREATE item:2 SET type = "book", content = "Hail traveller and welcome to the...", owner = character:rondo; SELECT *, (SELECT * FROM item WHERE owner = character:rondo) AS items FROM character:rondo;
As the example above shows, you can always style a query by adding a subquery if you want to emulate a bidirectional relation. So wanting to query both a character
and an item
at the same time doesn’t absolutely mean turning to a record link.
SELECT *, (SELECT * FROM item WHERE owner = character:rondo) AS items FROM character:rondo;
However, a record link does become the right choice if you want to add information about the relationship itself. This example shows a relation that includes information on when and where an item was picked up, which is very easy to do using a relation.
CREATE character:rondo SET name = "Rondo the Magnificent"; CREATE item:1 SET type = "weapon", damage = 10; LET $current_location = { x: 50, y: 30 }; RELATE character:rondo->owns->item:1 SET since = time::now(), picked_up_at = $current_location; SELECT <-owns.* AS owner_details FROM ONLY item:1;
Response{ owner_details: [ { id: owns:gex77koyxdnb5vbuoo2m, in: character:rondo, out: item:1, picked_up_at: { x: 50, y: 30 }, since: d'2024-07-29T02:05:40.716Z' } ] }
defeated
that keeps track of historical battles?The data below showing a few Roman victories and losses shows how you can do it. Since we are using the RELATE
statement to create relations, we can also keep track of these battles such as the date when they happened.
CREATE country:rome SET name = "Rome"; CREATE country:etruria SET name = "Etruria"; CREATE country:senones SET name = "Senones"; RELATE country:rome->defeated->country:etruria SET defeated_at = <datetime>"-0264-01-01"; RELATE country:senones->defeated->country:rome SET defeated_at = <datetime>"-0387-07-18"; RELATE country:rome->defeated->country:carthage SET defeated_at = <datetime>"-0146-01-01";
The sky is the limit when it comes to these types of queries, but here are two to get you started. You can query starting from country
or from the defeated
table itself.
SELECT name, ->defeated->country AS won_against, <-defeated<-country AS lost_to FROM country;
Response[ { lost_to: [ country:rome ], name: 'Carthage', won_against: [] }, { lost_to: [ country:rome ], name: 'Etruria', won_against: [] }, { lost_to: [ country:senones ], name: 'Rome', won_against: [ country:etruria, country:carthage ] }, { lost_to: [], name: 'Senones', won_against: [ country:rome ] } ]
SELECT defeated_at, in.name AS winner, out.name AS loser FROM defeated;
Response[ { defeated_at: '-0264-01-01T00:00:00Z', loser: 'Etruria', winner: 'Rome' }, { defeated_at: '-0146-01-01T00:00:00Z', loser: 'Carthage', winner: 'Rome' }, { defeated_at: '-0387-07-18T00:00:00Z', loser: 'Rome', winner: 'Senones' } ]
One way to do this is to use the count()
function to compare the number of a country’s victories versus the number of its defeats. As ->defeated
will return an array of the victories and <-defeated
will return the opposite, these can be fed into the count()
function.
SELECT * FROM country WHERE count(->defeated) >= count(<-defeated);
Response[ { id: country:rome, name: 'Rome' }, { id: country:senones, name: 'Senones' } ]