SurrealDB University is LIVE! Master the future of data

Chapter 8: A nice surprise Chapter 8: A nice surprise
Back to Courses

Chapter 8: A nice surprise

Time elapsed: 2w6d

A few days have passed since the incident with Landevin. You are starting to get the hang of graph querying and are excited at the possibilities it might hold. It feels almost organic to have records joined to each other by relations. Just like people in real life! Technology isn’t so scary if you remember that it is meant to enhance life, not replace it.

Later in the day you receive a message that there is a visitor at the door. It’s Landevin! And he’s alone this time.

You leave the tunnel to speak with him, and immediately feel ashamed as you notice the dozen guards standing between the two of you. You quickly wave the guards away and apologize.

Surprisingly, he smiles. “Think nothing of it, Aeon. We only wanted to raise a few issues, and do not oppose what you are doing here. After returning to Toria, I saw that my speeches had caused too much fear in the people, and felt ashamed. If I had been even more outspoken, they might have taken matters into their own hands and destroyed all the good work you’ve done. I’ve convinced them that what you are doing is good, and that we just need to be careful. Also…you see…”

“Yes?”

“Well, I can’t stop thinking about the knowledge inside the library and what it could do for the world. In fact, I’d like to visit every once in a while and learn the arts of the database. Can you teach me?”

Longer relational queries

There is a little bit left in the story of the enchanted knights that will involve some even more complex relationships, and here is the current dataset as of the end of the last chapter in case you need it.

To get ready for the extra data from the rest of the story, let’s learn how to use even longer graph traversal queries than the ones we used in the last chapter.

The next query is twice as long as the graph queries we have done so far. Instead of following the ->parent-of-person path, it follows the same path again! What do you think it will return?

SELECT ->parent_of->person ->parent_of->person FROM person;

Let’s think it through as in the previous chapters.

  • FROM person: starts with person records,
  • ->parent_of: looks for the person at in of parent_of. That’s the parent.
  • ->person: of a person record that is out. That’s person who “is parented” by another person…
  • ->parent_of: who is also a parent_of…
  • ->person: another person!

In other words, it’s looking for a grandparent: a person who is a parent_of someone else who is a parent_of. SurrealDB will go through each of our person records to see if there is a match, and there isn’t a single one.

Response
[ { "->parent_of": { "->person": { "->parent_of": { "->person": [] } } } }, { "->parent_of": { "->person": { "->parent_of": { "->person": [] } } } }, // ... and so on ... ]

We can prove that this query is actually doing what we expect it to do by creating a person who is a grandparent. There is no mention of the Nobleman’s mother anywhere in the story of the enchanted knights, but he certainly did have a mother. So let’s create and then RELATE her as the parent_of the Nobleman.

CREATE person:grandmother SET name = "Mother of the Nobleman", feeling = "Disappointed in son"; RELATE person:grandmother->parent_of->person:the_nobleman;

While are at it, let’s add an alias to this graph traversal to call it what it is: grandchildren.

SELECT name, ->parent_of->person ->parent_of->person AS grandchildren FROM person;

The query now returns some grandchildren! You should see them inside the only person record that is a grandparent, and an empty array for the rest.

Response
[ { grandchildren: [ person:bertha, person:adelaide, person:wulfield ], name: 'Mother of the Nobleman' }, { grandchildren: [], name: 'The Nobleman' }, { grandchildren: [], name: 'Wulfield' } ... and so on ... ]

So that means that a query to return both children and grandchildren would look like this.

SELECT name, ->parent_of->person AS children, ->parent_of->person->parent_of->person AS grandchildren FROM person;

This is starting to get easy, so let’s try something harder: a combination of parent_of and married. What do you think this query will return?

SELECT name, ->parent_of->person<->married<->person FROM person;

Again, let’s look at it one step at a time.

  • FROM person: starting from person records…
  • ->parent_of: who are parents…
  • ->person: of a person
  • <->married: who is married (i.e. either an in or out in a married graph)
  • <->person: to a person (i.e. either an in or out in a married graph)

So that would be…a person’s married children and their spouses. Let’s give this the alias kids_and_spouses for readability.

SELECT name, ->parent_of->person<->married<->person AS kids_and_spouses FROM person;
Response
[ { kids_and_spouses: [], name: 'Adelaide' }, { kids_and_spouses: [], name: 'Bear knight' }, { kids_and_spouses: [], name: 'Bertha' }, { kids_and_spouses: [], name: 'Eagle knight' }, { kids_and_spouses: [], name: 'Fish knight' }, { kids_and_spouses: [ person:the_nobleman, person:the_noblewoman ], name: 'Mother of the Nobleman' }, { kids_and_spouses: [ person:bertha, person:fish_knight, person:eagle_knight, person:adelaide, person:wulfield, person:bear_knight ], name: 'The Nobleman' }, { kids_and_spouses: [ person:eagle_knight, person:adelaide, person:bertha, person:fish_knight, person:wulfield, person:bear_knight ], name: 'The Noblewoman' }, { kids_and_spouses: [], name: 'Wulfield' } ]

But we’re not done yet! If there are children that are married, how about those children too? We’ll stick that in and call it grandkids_and_spouses.

SELECT name, ->parent_of->person<->married<->person AS kids_and_spouses, ->parent_of->person->parent_of->person<->married<->person AS grandkids_and_spouses FROM person;

The query result shows all three generations. The mother of the Nobleman has results inside both grandkids_and_spouses and kids_and_spouses, the Nobleman and his wife inside kids_and_spouses, while the rest of the characters don’t have any children.

Response
[ { grandkids_and_spouses: [], kids_and_spouses: [], name: 'Adelaide' }, { grandkids_and_spouses: [], kids_and_spouses: [], name: 'Bear knight' }, { grandkids_and_spouses: [], kids_and_spouses: [], name: 'Bertha' }, { grandkids_and_spouses: [], kids_and_spouses: [], name: 'Eagle knight' }, { grandkids_and_spouses: [], kids_and_spouses: [], name: 'Fish knight' }, { grandkids_and_spouses: [ person:bertha, person:fish_knight, person:eagle_knight, person:adelaide, person:wulfield, person:bear_knight ], kids_and_spouses: [ person:the_nobleman, person:the_noblewoman ], name: 'Mother of the Nobleman' }, { grandkids_and_spouses: [], kids_and_spouses: [ person:bertha, person:fish_knight, person:eagle_knight, person:adelaide, person:wulfield, person:bear_knight ], name: 'The Nobleman' }, { grandkids_and_spouses: [], kids_and_spouses: [ person:eagle_knight, person:adelaide, person:bertha, person:fish_knight, person:wulfield, person:bear_knight ], name: 'The Noblewoman' }, { grandkids_and_spouses: [], kids_and_spouses: [], name: 'Wulfield' } ]

With that practice under our belt, let’s get back to the story to see how it ends.

The end of the story and some more RELATE practice

Finally there was some good news for the Nobleman and his wife, as they had another child: a son named Reginald. Reginald grew up to become a knight and set out to find his sisters.

Reginald discovered that an evil sorcerer had cursed their three husbands, who by now had children with their wives. The curse is why they transformed into angry beasts every once in a while.

Every time they were about to transform, they had to leave the castle and lock the doors so that they couldn’t get back in and hurt their family. They remembered who they were as beasts, but couldn’t stop themselves from trying to find and hurt their own family. What a sad life they led…

The rest of the story is too long to tell, but here is the short version.

Reginald eventually found the sorcerer’s castle and discovered that the sorcerer had already died in battle a few years before. Inside the castle was an object that held the curse of the three knights, which Reginald destroyed. The knights were freed, and everybody lived happily ever after.

Wonderful! This is a relief, as many traditional German tales have some pretty tragic endings. But not this one.

To finish up our practice on RELATE in this chapter, here are some more things to know.

RELATE works on multiple records, but each relationship is individual

Multiple records can be related to each other in a single RELATE statement. That means that we can set parent_of or enemy_of for all of the person records in our story fairly quickly. This next example creates Reginald, along with the Sorcerer named Zornebock, who is everybody’s enemy.

CREATE person:reginald SET name = "Reginald"; RELATE [person:the_nobleman, person:the_noblewoman]->parent_of->person:reginald; CREATE person:sorcerer SET name = "Zornebock"; RELATE [ person:reginald, person:bear_knight, person:eagle_knight, person:fish_knight ]->enemy_of->person:sorcerer;

The query output shows this as well. Note the output for the enemy_of relation that we created with a single statement, in which out is always the same: person:sorcerer. This makes sense, as he is everybody’s enemy.

Response
[ { id: enemy_of:6lhupa0wf7b04lg76tf6, in: person:reginald, out: person:sorcerer }, { id: enemy_of:eqh0nq377cchsedacm6v, in: person:bear_knight, out: person:sorcerer }, { id: enemy_of:w7ec71o12wpysygg1l89, in: person:eagle_knight, out: person:sorcerer }, { id: enemy_of:ftlsang4teidppytx8rk, in: person:fish_knight, out: person:sorcerer } ]

Multiplication of arrays in RELATE and records that don’t exist yet

Let’s imagine that there are a few more knights in the next town, but they have two sorcerer enemies instead of one. For simplicity’s sake, we’ll move away from the person record (because it has a defined schema) and just call them knight one and two, and sorcerer one and two.

What do you think the output of this RELATE statement will be?

RELATE [knight:one, knight:two]->enemy_of->[sorcerer:one, sorcerer:two];

The output is: four enemy_of tables!

Response
[ { id: enemy_of:ygldlu0xtj7diywntab4, in: knight:one, out: sorcerer:one }, { id: enemy_of:7w8zti2nzmv4a2xr3i5x, in: knight:one, out: sorcerer:two }, { id: enemy_of:9b78wjxh6o832yb2qtmw, in: knight:two, out: sorcerer:one }, { id: enemy_of:rzpn40icvhofyltw3fq6, in: knight:two, out: sorcerer:two } ]

The first interesting part about this output is that a RELATE statement will return a number of records equal to the length of one side multiplied by the other. In this case, that means four records.

But the more interesting part might be that it worked in the first place. After all, we don’t even have any knight or sorcerer records in the database. But no matter: a graph table is independent of the existence of records at the in and out position.

At this point, a query to see the enemies of the knights will return nothing because there are no knight records in the database.

SELECT ->enemy_of->sorcerer AS enemies FROM knight;
Response
[]

Now let’s see what the output is after we create the two knights.

CREATE knight:one, knight:two;
SELECT ->enemy_of->sorcerer from knight;

This time SurrealDB picks up that they are related to enemy_of and looks for a sorcerer record, but doesn’t see anything at the out part of the graph.

Response
[ { "->enemy_of": { "->sorcerer": [] } }, { "->enemy_of": { "->sorcerer": [] } } ]

Finally, if we CREATE sorcerer:one, sorcerer:two, the query will now return them.

CREATE sorcerer:one, sorcerer:two;
SELECT ->enemy_of->sorcerer FROM knight;
Response
[ { "->enemy_of": { "->sorcerer": [ sorcerer:two, sorcerer:one ] } }, { "->enemy_of": { "->sorcerer": [ sorcerer:one, sorcerer:two ] } } ]

One more challenge

For the last challenge in this chapter, let’s try to come up with a mind-bendingly long query just for the sake of practice. This next query will return a result in some cases, and not in others. What do you think it will return?

SELECT name, <-parent_of<-person <-parent_of<-person ->parent_of->person ->parent_of->person <->enemy_of<->person FROM person;

Let’s parse this one step at a time as we always do.

  • FROM person: starting from a person record
  • <-parent_of<-person: who is a child of a person (from child to parent)
  • <-parent_of<-person: who is a child of yet another person (from child to parent again)
  • ->parent_of->person: who is the parent of a person (from parent to child)
  • ->parent_of->person: who is a parent of yet another person (from parent to child again)
  • <->enemy_of<->person: Finally, who are the people in an enemy relationship from that person.

The result is all the records somehow involved in an enemy relation that are grandchildren of your grandparents - possibly including you.

Let’s give this incredibly large line an alias called grandparent_to_grandchild_enemies to make it readable. Here is the output:

Response
[ { grandparent_to_grandchild_enemies: [ person:reginald, person:sorcerer ], name: 'Adelaide' }, { grandparent_to_grandchild_enemies: [], name: 'Bear knight' }, { grandparent_to_grandchild_enemies: [ person:reginald, person:sorcerer ], name: 'Bertha' }, { grandparent_to_grandchild_enemies: [], name: 'Eagle knight' }, { grandparent_to_grandchild_enemies: [], name: 'Fish knight' }, { grandparent_to_grandchild_enemies: [], name: 'Mother of the Nobleman' }, { grandparent_to_grandchild_enemies: [ person:reginald, person:sorcerer ], name: 'Reginald' }, // ... and so on ... ]

We see a response for Adelaide, because she has grandparents who have grandchildren who have an enemy. The same goes for the other daughters as well as Reginald. The knights do not show anything because, although they are enemies of the sorcerer, they are not the grandchildren of anyone in particular in our database.

SCHEMAFULL graph tables and indexes

We learned how to create a schema for our record types a few chapters ago, and graph tables can be made SCHEMAFULL in the same way too. And without a schema, you can do anything you like.

That means that not only can we marry the Nobleman to more than individual, his new spouses don’t even have to be people!

CREATE cat:mr_meow, cat:mrs_meow; RELATE [cat:mr_meow, cat:mrs_meow]->married->person:the_nobleman; SELECT in, out FROM married WHERE in = person:the_nobleman OR out = person:the_nobleman;
Response
[ { in: cat:mrs_meow, out: person:the_nobleman }, { in: person:the_nobleman, out: person:the_noblewoman }, { in: cat:mr_meow, out: person:the_nobleman } ]

Another interesting behaviour that we’d like to prevent is the fact that relations can be created in the same way as many times as we like, even for the same records:

CREATE dog:mr_bark; CREATE food:biscuit; RELATE dog:mr_bark->likes->food:biscuit; RELATE dog:mr_bark->likes->food:biscuit; RELATE dog:mr_bark->likes->food:biscuit; SELECT ->likes->food FROM dog;

When you query for Mr. Bark’s favourite food, the answer should be “biscuits”, not “biscuits biscuits biscuits”. That is exactly what the output below shows: three results when there should only be one.

Response
[ { "->likes": { "->food": [ food:biscuit, food:biscuit, food:biscuit ] } } ]

We can define such tables to ensure that in and out must be unique, thus only having a single record ID in them.

You might think that DEFINE FIELD is the way to do it using a statement like the following:

DEFINE FIELD in ON TABLE married UNIQUE;

But it’s actually done through a statement called DEFINE INDEX! This is because uniqueness for databases is most easily done through the creation of an index, as opposed to a full table scan which would be much less efficient. Indexes can be created for other reasons such as improved performance and don’t need to be set as UNIQUE, but at the moment we are focusing on creating a unique constraint so we will leave the discussion of indexes as a whole to another day.

In any case, the syntax starts in this way:

DEFINE INDEX index_name ON TABLE table_name FIELDS [field names]

We have the choice to add the UNIQUE keyword at this point, which is what we want. You will probably find yourself using UNIQUE frequently for fields related to emails, addresses, IDs, names, and anything else that only one record should have.

All together, our DEFINE INDEX statement looks like this.

DEFINE INDEX only_unique ON TABLE likes FIELDS in, out UNIQUE;

When we execute this statement, SurrealDB returns an error - which is a good thing!

Response
'Database index `only_unique` already contains [dog:mr_bark, food:biscuit], with record `likes:4v7y0y9424nysk5paagp`'

That is because the likes table already contains more than one instance of dog:mr_bark as in and food_biscuit at out. We can SELECT * FROM likes LIMIT 2 to find two of the existing record IDs and give that to a DELETE statement to get rid of them.

DELETE (SELECT * FROM likes LIMIT 2) RETURN BEFORE;

Now that the extra likes tables are deleted, we can define the index.

DEFINE INDEX only_unique ON TABLE likes FIELDS in, out UNIQUE;

Having done so, it is now impossible for us to specify that Mr. Bark likes biscuits more than once.

RELATE dog:mr_bark->likes->food:biscuit;
Response
'Database index `only_unique` already contains [dog:mr_bark, food:biscuit], with record `likes:4u0yky84lrspxwv35pre`'

Using a new field as a unique constraint

Now what if we want to define a unique index on a relation of two equals, like in a friends_with or married_to table? In those relationships, it doesn’t matter who is in or out, because friendship isn’t about “befriending A” from one side and “being befriended by B” on the other.

We can start by defining a unique index on the in and out fields as before:

DEFINE INDEX only_unique ON TABLE friends_with FIELDS in, out UNIQUE;

However, because we are never certain who is in or out in a friendship, that means that we can befriend the same two records twice through friends_with by switching the order around! It won’t be until the third RELATE statement that it will violate the unique index that we defined.

RELATE cat:one->friends_with->cat:two; RELATE cat:two->friends_with->cat:one; RELATE cat:one->friends_with->cat:two;
Response
-------- Query -------- [ { id: friends_with:nsjdgitzyg5tw6df1f9q, in: cat:one, out: cat:two } ] -------- Query -------- [ { id: friends_with:xmk8xkngz3mt9vpa87wd, in: cat:two, out: cat:one } ] -------- Query -------- 'Database index `only_unique` already contains [cat:one, cat:two], with record `friends_with:nsjdgitzyg5tw6df1f9q`'

Fortunately, there is a fairly straightforward way to solve this problem. Since graph tables can contain their own fields, we can add a field (we’ll call it key) that holds both in and out, but sorted through the array::sort() method. Since we are only using this field to check for uniqueness, we could also cast into a string which is easier to read than an array. Here is a quick demonstration of the output from this method.

RETURN [<string>[cat:one, cat:two].sort(), <string>[cat:two, cat:one].sort()];
Response
[ '[cat:one, cat:two]', '[cat:one, cat:two]' ]

So all we have to do is define the field, and then add an index ta ensures that this field must be unique. Note that there is no need to specify the type because it is generated automatically, but you could add TYPE option<string> if you like.

DEFINE FIELD key ON TABLE friends_with VALUE <string>[in, out].sort();
DEFINE INDEX only_unique ON TABLE friends_with FIELDS key UNIQUE;

With this unique key set up, we are now unable to create weird double friendships when a friendship has already been formed.

RELATE cat:one->friends_with->cat:two;
RELATE cat:two->friends_with->cat:one;
Response
-------- Query -------- [ { id: friends_with:g85n08h0kpwrnf6n3sui, in: cat:one, key: '[cat:one, cat:two]', out: cat:two } ] -------- Query -------- "Database index `only_unique` already contains '[cat:one, cat:two]', with record `friends_with:g85n08h0kpwrnf6n3sui`"

Well done! Here is the full dataset for your reference.

This concludes our first mini-project. Hopefully SurrealDB’s -> and <- syntax has become familiar over the last four chapters. If not, don’t worry: we still haven’t even approached the halfway point of the book and there are many similar queries to come.

Practice time
1. How would you make sure that an email can only be used by one user?
Answer

An index with UNIQUE will do the trick. You’ll probably also want to make sure that the field is a string.

DEFINE FIELD email ON TABLE user TYPE string;
DEFINE INDEX only_unique_emails ON TABLE user FIELDS email UNIQUE;

2. But how would you make sure that each user’s email is valid?

Hint: See if you can find a function that can make this work.

Answer

The function to make this assertion is found inside SurrealDB’s string functions. Using one of them called string::is::email() will do the trick:

DEFINE FIELD email ON TABLE user TYPE string ASSERT $value.is_email();

3. How would you find all the moons that orbit all planets that orbit a star called The Sun?

Here’s some sample data to get you started.

CREATE star SET name = "The Sun"; CREATE planet SET name = "Saturn"; CREATE moon SET name = "Titan"; CREATE moon SET name = "Mimas"; RELATE (SELECT * FROM planet WHERE name = "Saturn")->orbits->(SELECT * FROM star WHERE name = "The Sun"); RELATE (SELECT * FROM moon WHERE name IN ["Titan", "Mimas"])->orbits->(SELECT * FROM planet WHERE name = "Saturn"); CREATE star SET name = "Proxima Centauri"; CREATE planet SET name = "Proxima Centauri b"; CREATE moon SET name = "Distant moon"; RELATE (SELECT * FROM planet WHERE name = "Proxima Centauri b")->orbits->(SELECT * FROM star WHERE name = "Proxima Centauri"); RELATE (SELECT * FROM moon WHERE name = "Distant moon")->orbits->(SELECT * FROM planet WHERE name = "Proxima Centauri b");
Answer

One way to do it is to start at the moon records, follow the path to see what planets they orbit, and then follow the path once more to see what stars the planets orbit. You’ll now find yourself at the star records, after which you can filter by name.

SELECT * FROM moon WHERE "The Sun" IN ->orbits->planet->orbits->star.name;
SELECT * FROM moon WHERE "Proxima Centauri" IN ->orbits->planet->orbits->star.name;

4. How would you find all the planets with two or more moons that orbit the Sun?
Answer

The count() function and two graph queries will do the trick here. One uses <- to see what the planet is being orbited by, and the other uses -> to see what it is orbiting.

SELECT * FROM planet WHERE count(<-orbits) > 1 AND "The Sun" in ->orbits->star.name;

5. How would you return all the planets and moons around each star?
Answer

You could do it with the following query that goes down one level, and then goes down two levels.

SELECT *, <-orbits<-planet.name, <-orbits<-planet<-orbits<-moon.name FROM star;

Note that you could use an alias here, but not using an alias will combine the structure of the two graph queries and make it clear which moons are orbiting which planets.

SELECT *, <-orbits<-planet.name AS planet_names, <-orbits<-planet<-orbits<-moon.name AS moon_names FROM star;