SurrealDB University is LIVE! Master the future of data

Chapter 6: Inspiration from the past Chapter 6: Inspiration from the past
Back to Courses

Chapter 6: Inspiration from the past

Time elapsed: 2w1d

Your frustration from the day before is gone, and now you are feeling inspired. And it’s not just because you are starting to understand how databases work. It’s because you are starting to understand why.

As always, you spent your morning with a book in one hand and a cup of tea in the other. Today’s book was about the invention of writing in two very old countries called Egypt and Sumer.

You were surprised to find out that inspiration had nothing to do with the invention of writing! Instead, writing was invented for some pretty boring reasons.

At the time, people had stopped hunting for a living and now lived on farms and in villages. Sometimes they produced extra food and could trade it, or wanted to settle an argument on who owned what. Governments settled ownership arguments between people and in exchange they demanded a tax. They then needed to plan how to use these taxes for their upcoming projects.

Very boring stuff! But to do all that, humans needed writing, and so they invented it! That’s all there was to it.

So maybe it’s the basic writing down and tracking of information that matters most? Maybe data IS what drives society! And if that is the case, then perhaps you don’t have to worry about exactly how to restore civilization. All the inspiring philosophy and literature will just…take care of itself. Maybe just making the information available will do the job!

With renewed enthusiasm, you close the book and get back to your database studies.

RELATE

It’s now time to learn how to create bi-directional relations in SurrealDB by using the RELATE keyword. Understanding this keyword is so important that the whole chapter will be dedicated to it!

Before we do, here is a quick summary of where we were at the end of the last chapter. At this point we have a schema that looks like this:

And inside the database are two records. You should see them in the output of the following query:

SELECT * FROM person FETCH properties;
Response
[ { class: 'Count', id: person:the_nobleman, money: 50, name: 'The Nobleman', properties: [ { id: building:old_castle, kind: 'castle', name: 'Old castle' } ] } ]

While the embedded code samples in this and the next few chapters include the dataset from the previous chapter, you might also want to take a look at it yourself. You can find it here, and then copy and paste the statements inside to bring yourself up to date and ready for the queries to follow.

So let’s get back to the last sentence we saw, about the Nobleman after he lost all of his money.

All he had left was an old castle, his wife, and three daughters: Wulfield, Adelaide, and Bertha.

That means that we still have to create person records for his wife and daughters. The Tale of the Enchanted Knights never indicates the name of either the nobleman or his wife, so we’ll choose The Noblewoman for her name in the same way that we chose the Nobleman’s name. We know the names of the others. Note that women could inherit property in the medieval ages too, so the Nobleman’s wife will also be linked to the old castle via the properties field.

CREATE person:the_noblewoman SET name = "The Noblewoman", properties = [building:old_castle]; CREATE person:wulfield SET name = "Wulfield"; CREATE person:adelaide SET name = "Adelaide"; CREATE person:bertha SET name = "Bertha";

It’s almost time to join them with the RELATE keyword, but first let’s take a look at the record link method we used in the last chapter and why we probably don’t want to use it.

If we wanted to use it to link the Nobleman to Wulfield, we would do it like this with a field called children:

UPDATE person:the_nobleman SET daughters = [person:wulfield];

However, nothing happens! That’s because the schema doesn’t have a field called children.

Response
[ { class: 'Count', id: person:the_nobleman, money: 50, name: 'The Nobleman', properties: [ building:old_castle ] } ]

The important point for record links is this: they are fields on a record. And our strictly defined schema doesn’t have a field called children, so it didn’t show up. And if we add a field called children to the person field then we should also add parents. We would have to update parents to add their children, and then update the children to add their parents. It might work, but there is an easier way: using RELATE.

Let’s give RELATE a try and see what happens. The syntax for RELATE looks like this:

RELATE some:record->relation_name->some:other_record;

Note that it doesn’t use CREATE anywhere; it’s a different keyword. The relation_name is usually a verb or a similar word that makes queries readable. Some good examples are “wrote”, or “created”, or “likes”, or “parent_of” in our case. That means that our RELATE statement should look like this.

RELATE person:the_nobleman->parent_of->person:wulfield;

And now let’s see what the Nobleman’s record looks like once the relation is added.

RELATE person:the_nobleman->parent_of->person:wulfield;
SELECT * FROM person:the_nobleman;
Response
[ { class: 'Count', id: person:the_nobleman, money: 50, name: 'The Nobleman', properties: [ building:old_castle ] } ]

It’s exactly the same! But our RELATE statement actually did work, because a RELATE statement doesn’t modify records. Instead, it created its own record from the parent_of table that links the two, often called a “graph table” or a “graph edge”. We can prove that it exists by typing INFO FOR DB, which contains a statement showing that there is indeed a separate table with this name.

INFO FOR DB;
parent_of: 'DEFINE TABLE parent_of TYPE ANY SCHEMALESS PERMISSIONS NONE',

Let’s see what’s inside!

SELECT * FROM parent_of;
Response
[ { id: parent_of:wzcwzj1f64clq5hgkctp, in: person:the_nobleman, out: person:wulfield } ]

As the output shows, a graph table even has its own ID.

You can specify the ID for a graph table too if you want:

RELATE person:the_nobleman->parent_of:first_relation->person:wulfield;

This is quite rare as usually graph tables are queried from other records, but the possibility exists!

The most interesting part about these tables, however, is that they all have an in and an out field. An in is the record that is linking to, while out is the record that is being linked to. This is why the syntax uses arrows. In our query, one arrow shows the_nobleman going in (->) to parent_of, and then the relation moves out of parent_of into person_wulfield.

person:the_nobleman->parent_of->person:wulfield

Thinking in terms of active tense (created, saw) and passive tense (created_by, seen) is a good way to visualize the difference between in and out. So if you were to relate two records using a table called “wrote”, in would be the “writer” and out would be the “thing written”.

Relational queries get more interesting when you have more relations to work with. We can make ours more interesting by adding the relations between the Noblewoman and remaining daughters.

-- Wulfield has already been related to the Nobleman, so first RELATE her to her mom... RELATE person:the_noblewoman->parent_of->person:wulfield; -- And then RELATE for each of the parents for each of the other two daughters FOR $daughter IN [person:adelaide, person:bertha] { RELATE person:the_nobleman->parent_of->$daughter; RELATE person:the_noblewoman->parent_of->$daughter; }

With these relations established, let’s use a quick query to confirm that they are what we expect them to be. We’ll combine some skills we learned in previous chapters to make the output readable:

  • VALUE to return the value of the name properties,
  • <string> to cast them into a string,
  • + to concatenate them so the output is readable.
SELECT value "Parent: " + <string>in.name + ". Daughter: " + <string>out.name FROM parent_of;

The output is indeed quite readable!

[ "Parent: The Nobleman. Daughter: Wulfield", "Parent: The Nobleman. Daughter: Bertha", "Parent: The Nobleman. Daughter: Wulfield", "Parent: The Noblewoman. Daughter: Adelaide", "Parent: The Noblewoman. Daughter: Bertha", "Parent: The Noblewoman. Daughter: Wulfield" ]

It looks like querying on a graph table is easy enough. But what if we want to query on a person and see what they are a parent_of? This parent_of is an entirely separate table, so the query below won’t give us the results we want:

SELECT parent_of FROM person;
Response
[ { parent_of: NONE }, { parent_of: NONE } # ...and so on for the other three person records ]

This is where we see the arrow syntax again, which is used not just to create relations but also to query them.

Using -> and <- to traverse graph edges

As we saw just now, SELECT parent_of FROM person won’t return any results. But let’s see what happens when we add a -> or a <- to the front of the name of the graph edge. Let’s first try adding -> to the query.

SELECT ->parent_of FROM person;
Response
[ { "->parent_of": [] }, { "->parent_of": [] }, { "->parent_of": [ parent_of:jgbpq67artlnrfnbbyd0, parent_of:m2ral1i75bhukmctfgfi, parent_of:wzcwzj1f64clq5hgkctp ] }, { "->parent_of": [ parent_of:hw8g585r7ord9unfehfp, parent_of:rfgj2tw2c0w6gsogecda, parent_of:se05znnppigsztobyrjy ] }, { "->parent_of": [] } ]

Ah ha! We have five results, of which two show up. We know that there are two parents, each of which is linked to three daughters, so the records with data must be the parents. Note that the results are of type parent_of, not of person.

Changing -> to <- should give us the opposite result.

SELECT <-parent_of FROM person;
Response
[ { "<-parent_of": [ parent_of:m2ral1i75bhukmctfgfi, parent_of:rfgj2tw2c0w6gsogecda ] }, { "<-parent_of": [ parent_of:hw8g585r7ord9unfehfp, parent_of:jgbpq67artlnrfnbbyd0 ] }, { "<-parent_of": [] }, { "<-parent_of": [] }, { "<-parent_of": [ parent_of:se05znnppigsztobyrjy, parent_of:wzcwzj1f64clq5hgkctp ] } ]

Indeed it does! Switching -> to <- now gives us the out edges to the daughters, namely the graph tables of those who are “parented by” someone.

So what if we want to reach the actual person records that are linked in this way? Here we can add the table name we want to look for, which in this case is person - because our RELATE statement joins a person to a person. (We could have RELATEd a person to a building or anything else.)

We know that the person type has a name field, so we can also add .name to return their names.

SELECT ->parent_of->person.name FROM person;
Response
[ { "->parent_of": { "->person": { name: [] } } }, { "->parent_of": { "->person": { name: [] } } }, { "->parent_of": { "->person": { name: [ 'Bertha', 'Adelaide', 'Wulfield' ] } } }, { "->parent_of": { "->person": { name: [ 'Bertha', 'Adelaide', 'Wulfield' ] } } }, { "->parent_of": { "->person": { name: [] } } } ]

That output was good, but a little verbose. Let’s use the VALUE keyword to only return the values (the names).

SELECT VALUE ->parent_of->person.name FROM person;
Response
[ [], [], [], [ "Wulfield", "Bertha", "Adelaide" ], [ "Adelaide", "Wulfield", "Bertha" ] ]

That’s better! This cleaner output will help us practice the difference between -> and <-. Because with two places to use these arrows, and two arrows to choose from, that gives us four total ways we could pick a direction to see different results. Let’s give each of them a try!

Here are two important tips before we start practicing:

  • Remember that -> isn’t shorthand for in, and <- isn’t shorthand for out. In fact, it’s easier than that: the direction of the arrow alone in relation to the record or graph edge determines this. An arrow pointing towards something is in, and an arrow pointing out is out.

  • FROM person at the end actually means that this is the starting point of the query. This is actually the case for all SELECT queries.

Take this query for example:

SELECT name, age FROM person;

As humans, we like to think of the query in this way:

“Give me the name and age of all person records”.

But from the point of view of the database it is the other way around:

“From all person records, return name and age”.

So reading the FROM part of a query first can help understand this behaviour when a query gets complicated.

Since our arrows can face either -> or <-, that gives us four possible directions when querying on parent_of:

  • ->parent_of->
  • <-parent_of->
  • ->parent_of<-
  • <-parent_of<-

Let’s take a look at each of these directions one step at a time to make sure that we understand them.

SELECT VALUE ->parent_of->person.name FROM person;

  • FROM person: start from person. Where are we going?
  • ->parent_of: The arrow points to parent_of, to the records connected via in: the parents.
  • parent_of->person: Goes from parent_of to person via the out field,
  • .name: and accesses their names.

Who is located at out? The daughters. Result: the names of the daughters through the parents.

SELECT VALUE ->parent_of->person.name FROM person;
Response
[ [], [], [], [ "Wulfield", "Bertha", "Adelaide" ], [ "Adelaide", "Wulfield", "Bertha" ] ]

SELECT VALUE ->parent_of<-person.name FROM person;

  • FROM person: start from person. Where are we going?
  • ->parent_of: The arrow points to parent_of. That’s the records connected via in: the parents.
  • parent_of<-person: Goes from parent_of to person via the in field,
  • .name: and accesses their names.

Who is located at in? The parents. Result: the names of the parents through the parents.

SELECT VALUE ->parent_of<-person.name FROM person;
Response
[ [], [], [], [ "The Noblewoman", "The Noblewoman", "The Noblewoman" ], [ "The Nobleman", "The Nobleman", "The Nobleman" ] ]

SELECT VALUE <-parent_of->person.name FROM person;

  • FROM person: start from person. Where are we going?
  • <-parent_of: The arrow points away from parent_of. That’s the records connected via out: the daughters.
  • parent_of->person: Goes from parent_of to person via the out field,
  • .name: and accesses their names.

Who is located at out? The daughters. Result: the names of the daughters through the daughters.

SELECT VALUE <-parent_of->person.name FROM person;
Response
[ [ "Bertha", "Bertha" ], [ "Wulfield", "Wulfield" ], [ "Adelaide", "Adelaide" ], [], [] ]

SELECT VALUE <-parent_of<-person.name FROM person;

  • FROM person: start from person. Where are we going?
  • <-parent_of: The arrow points away from parent_of. That’s the records connected via out: the daughters.
  • parent_of<-person: Goes from parent_of to person via the in field,
  • .name: and accesses their names.

Who is located at in? The parents. Result: the names of the parents through the daughters.

SELECT VALUE <-parent_of<-person.name FROM person;
Response
[ [ "The Noblewoman", "The Nobleman" ], [ "The Noblewoman", "The Nobleman" ], [ "The Nobleman", "The Noblewoman" ], [], [] ]

As the four examples demonstrate, these queries are generally used when you have arrows moving in the same direction (->parent_of->, or <-parent_of<-) because the arrow will represent an in on one side and an out on the other, or vice versa.

A long example is always best followed with a second example to let the concept sink in, so let’s look at one more to make sure that we understand.

More traversing graph edges

Our first example involved the graph edges between two person records. This next example is simpler, but uses different record types instead of just one.

We will create a user named Aeon who has two cats that must be fed. We will relate them via a graph edge called feeds. Using RELATE is starting to get easier for us by now:

CREATE user:aeon SET name = "Aeon"; CREATE cat:one SET name = "Mr. Meow"; CREATE cat:two SET name = "Mrs. Meow"; RELATE user:aeon->feeds->[cat:one, cat:two];

Interestingly, having two record types (user related to cat) instead of one (person related to person) makes the queries a bit easier to understand. That’s because the FROM user (or FROM cat) part at the end shows which table we are starting from.

As you can see, the feeds graph edge always has user records for in and always cat records for out.

SELECT in, out FROM feeds;
Response
[ { in: user:aeon, out: cat:two }, { in: user:aeon, out: cat:one } ]

We’ll start from a simple SELECT and build up from there.

Who are the users?

SELECT name FROM user;
Response
[ { name: 'Aeon' } ]

Looks like it’s just Aeon. Now what are the names of the cats that Aeon is feeding?

SELECT name, ->feeds->cat.name FROM user;
Response
[ { "->feeds": { "->cat": { name: [ "Mrs. Meow", "Mr. Meow" ] } }, name: "Aeon" } ]

Now let’s look at it from the other angle.

Who are the cats?

SELECT name FROM cat;
Response
[ { name: "Mr. Meow" }, { name: "Mrs. Meow" } ]

Okay, it’s Mr. and Mrs. Meow. What are the names of the users that they are being fed by?

SELECT name, <-feeds<-user.name FROM cat;
Response
[ { "<-feeds": { "<-user": { name: [ 'Aeon' ] } }, name: 'Mr. Meow' }, { "<-feeds": { "<-user": { name: [ 'Aeon' ] } }, name: 'Mrs. Meow' } ]

This shows again that queries with arrows moving in the same direction next to a graph name are generally the ones you want. In our case, ->feeds-> and <-feeds<- are most useful.

It is much rarer that you will want to use the directions <-graph_name-> or ->graph_name<-, because the first accesses both out fields and the second accesses both in fields. But there is still some use to these two possible directions. Let’s give them a try as we think about what exactly a query of that form would mean.

This first query starts at the out field of feeds and moves on to the same out field, both of which are cats.

SELECT <-feeds->cat FROM cat;
Mr. Meow and Mrs. Meow are each fed once
[ { "<-feeds": { "->cat": [ cat:one ] } }, { "<-feeds": { "->cat": [ cat:two ] } } ]

So that gave us the names of cats that are each fed by…something (not sure what).

This next query starts at the in field of feeds and moves on to the same in field, both of which are users.

SELECT ->feeds<-user.name FROM user;
Aeon is feeding something twice
[ { "->feeds": { "<-user": [ user:aeon, user:aeon ] } } ]

So that gave us the name of a user that feeds…something (not sure what).

In other words, a <-relation_name-> can show how passively involved a record is in some relation, while ->relation_name<- can show how actively involved a record is in some relation.

Destructuring inside relation queries

The destructuring syntax that we first saw way back in Chapter 1 is particularly helpful when making a graph query, since there can be a lot of typing involved before you reach the end of a query. This syntax was added in SurrealDB 2.0. Before version 2.0, you used to have to use the arrow syntax each time you wanted to access a field at the end of a graph query:

SELECT name, ->feeds->cat.name, ->feeds->cat.id FROM user;

But now this only takes a single line!

SELECT name, ->feeds->cat.{ id, name } FROM user;
Response
[ { "->feeds": { "->cat": [ { id: cat:two, name: 'Mrs. Meow' }, { id: cat:one, name: 'Mr. Meow' } ] }, name: 'Aeon' } ]

There is a lot more you can do in SurrealDB with graph edges, because graph queries can continue as far as your imagination takes you. For example, what if you wanted to see the cats fed by a person…who is the parent of another person? Or in more human language: “Who are the cats fed by so-and-so’s parents?”

A graph traversal can do all of that too! But we have learned enough for this chapter and will save those fancy queries for the next one. See you in the next chapter!

Next: Chapter 7…

Hold on a second, can you hear that sound?

Aeon’s day isn’t over yet. Let’s tune in to see what’s going on.

Later that day

The same rolling landscape as seen in the beginning of the chapter, but zoomed out a bit. A group of riders armed with spears can be seen on the top of the nearest hill, and there is now a tension in the air.

You hear the sound of horses’ hooves in the distance, and make your way to the tunnel entrance. It sounds like a dozen or more of them. Why so many?

All you can see at first is a cloud of dust, but soon the riders reach the top of the nearest hill and are close enough to see. At their head is Landevin, one of the young members of the town council and famous for being a brilliant speaker…and for his fancy name. Who gives their child a name like Landevin, anyway?

Landevin stops his horse and looks down towards you.

“Aeon, we’ve come from the town to have a serious talk with you about your project.”

Practice time
1. Your database contains characters in a game and their items. Should you use RELATE to join each character to its items?
Answer

This will depend on the context and how you want to structure your data, but it’s possible that simple record links could be the solution. It also feels right to give record links a bit of attention once again after a full chapter on nothing but graph relations.

You could create items that are accessed via a field on a character table:

CREATE character:rondo SET name = "Rondo the Magnificent"; CREATE item:1 SET type = "weapon", damage = 10; CREATE item:2 SET type = "book", content = "Hail traveller and welcome to the..."; UPDATE character:rondo SET items += item:1; UPDATE character:rondo SET items += item:2; // Then access the items using FETCH or .*.* SELECT * FROM character:rondo FETCH items; SELECT *, items.*.* FROM character:rondo;

Or you could also assign a single owner to each item and query them inside a subquery.

CREATE character:rondo SET name = "Rondo the Magnificent"; CREATE item:1 SET type = "weapon", damage = 10, owner = character:rondo; CREATE item:2 SET type = "book", content = "Hail traveller and welcome to the...", owner = character:rondo; SELECT *, (SELECT * FROM item WHERE owner = character:rondo) AS items FROM character:rondo;

Answer

As the example above shows, you can always style a query by adding a subquery if you want to emulate a bidirectional relation. So wanting to query both a character and an item at the same time doesn’t absolutely mean turning to a record link.

SELECT *, (SELECT * FROM item WHERE owner = character:rondo) AS items FROM character:rondo;

However, a record link does become the right choice if you want to add information about the relationship itself. This example shows a relation that includes information on when and where an item was picked up, which is very easy to do using a relation.

CREATE character:rondo SET name = "Rondo the Magnificent"; CREATE item:1 SET type = "weapon", damage = 10; LET $current_location = { x: 50, y: 30 }; RELATE character:rondo->owns->item:1 SET since = time::now(), picked_up_at = $current_location; SELECT <-owns.* AS owner_details FROM ONLY item:1;
Response
{ owner_details: [ { id: owns:gex77koyxdnb5vbuoo2m, in: character:rondo, out: item:1, picked_up_at: { x: 50, y: 30 }, since: d'2024-07-29T02:05:40.716Z' } ] }

3. How would you create a relation called defeated that keeps track of historical battles?
Answer

The data below showing a few Roman victories and losses shows how you can do it. Since we are using the RELATE statement to create relations, we can also keep track of these battles such as the date when they happened.

CREATE country:rome SET name = "Rome"; CREATE country:etruria SET name = "Etruria"; CREATE country:senones SET name = "Senones"; RELATE country:rome->defeated->country:etruria SET defeated_at = <datetime>"-0264-01-01"; RELATE country:senones->defeated->country:rome SET defeated_at = <datetime>"-0387-07-18"; RELATE country:rome->defeated->country:carthage SET defeated_at = <datetime>"-0146-01-01";

4. How would you query that data on historical battles?
Answer

The sky is the limit when it comes to these types of queries, but here are two to get you started. You can query starting from country or from the defeated table itself.

SELECT name, ->defeated->country AS won_against, <-defeated<-country AS lost_to FROM country;
Response
[ { lost_to: [ country:rome ], name: 'Carthage', won_against: [] }, { lost_to: [ country:rome ], name: 'Etruria', won_against: [] }, { lost_to: [ country:senones ], name: 'Rome', won_against: [ country:etruria, country:carthage ] }, { lost_to: [], name: 'Senones', won_against: [ country:rome ] } ]
SELECT defeated_at, in.name AS winner, out.name AS loser FROM defeated;
Response
[ { defeated_at: '-0264-01-01T00:00:00Z', loser: 'Etruria', winner: 'Rome' }, { defeated_at: '-0146-01-01T00:00:00Z', loser: 'Carthage', winner: 'Rome' }, { defeated_at: '-0387-07-18T00:00:00Z', loser: 'Rome', winner: 'Senones' } ]

5. How could you find all the countries that have won at least 50% of their battles?
Answer

One way to do this is to use the count() function to compare the number of a country’s victories versus the number of its defeats. As ->defeated will return an array of the victories and <-defeated will return the opposite, these can be fed into the count() function.

SELECT * FROM country WHERE count(->defeated) >= count(<-defeated);
Response
[ { id: country:rome, name: 'Rome' }, { id: country:senones, name: 'Senones' } ]