Time elapsed: 2w6d
A few days have passed since the incident with Landevin. You are starting to get the hang of graph querying and are excited at the possibilities it might hold. It feels almost organic to have records joined to each other by relations. Just like people in real life! Technology isn’t so scary if you remember that it is meant to enhance life, not replace it.
Later in the day you receive a message that there is a visitor at the door. It’s Landevin! And he’s alone this time.
You leave the tunnel to speak with him, and immediately feel ashamed as you notice the dozen guards standing between the two of you. You quickly wave the guards away and apologize.
Surprisingly, he smiles. “Think nothing of it, Aeon. We only wanted to raise a few issues, and do not oppose what you are doing here. After returning to Toria, I saw that my speeches had caused too much fear in the people, and felt ashamed. If I had been even more outspoken, they might have taken matters into their own hands and destroyed all the good work you’ve done. I’ve convinced them that what you are doing is good, and that we just need to be careful. Also…you see…”
“Yes?”
“Well, I can’t stop thinking about the knowledge inside the library and what it could do for the world. In fact, I’d like to visit every once in a while and learn the arts of the database. Can you teach me?”
There is a little bit left in the story of the enchanted knights that will involve some even more complex relationships, and here is the current dataset as of the end of the last chapter in case you need it.
To get ready for the extra data from the rest of the story, let’s learn how to use even longer graph traversal queries than the ones we used in the last chapter.
The next query is twice as long as the graph queries we have done so far. Instead of following the ->parent-of-person
path, it follows the same path again! What do you think it will return?
SELECT ->parent_of->person ->parent_of->person FROM person;
Let’s think it through as in the previous chapters.
person
records,in
of parent_of
. That’s the parent.person
record that is out
. That’s person
who “is parented” by another person…In other words, it’s looking for a grandparent: a person who is a parent_of
someone else who is a parent_of
. SurrealDB will go through each of our person
records to see if there is a match, and there isn’t a single one.
Response[ { "->parent_of": { "->person": { "->parent_of": { "->person": [] } } } }, { "->parent_of": { "->person": { "->parent_of": { "->person": [] } } } }, // ... and so on ... ]
We can prove that this query is actually doing what we expect it to do by creating a person
who is a grandparent. There is no mention of the Nobleman’s mother anywhere in the story of the enchanted knights, but he certainly did have a mother. So let’s create and then RELATE
her as the parent_of
the Nobleman.
CREATE person:grandmother SET name = "Mother of the Nobleman", feeling = "Disappointed in son"; RELATE person:grandmother->parent_of->person:the_nobleman;
While are at it, let’s add an alias to this graph traversal to call it what it is: grandchildren.
SELECT name, ->parent_of->person ->parent_of->person AS grandchildren FROM person;
The query now returns some grandchildren! You should see them inside the only person
record that is a grandparent, and an empty array for the rest.
Response[ { grandchildren: [ person:bertha, person:adelaide, person:wulfield ], name: 'Mother of the Nobleman' }, { grandchildren: [], name: 'The Nobleman' }, { grandchildren: [], name: 'Wulfield' } ... and so on ... ]
So that means that a query to return both children and grandchildren would look like this.
SELECT name, ->parent_of->person AS children, ->parent_of->person->parent_of->person AS grandchildren FROM person;
This is starting to get easy, so let’s try something harder: a combination of parent_of
and married
. What do you think this query will return?
SELECT name, ->parent_of->person<->married<->person FROM person;
Again, let’s look at it one step at a time.
->parent_of
: who are parents…->person
: of a person
<->married
: who is married (i.e. either an in
or out
in a married
graph)<->person
: to a person (i.e. either an in
or out
in a married
graph)So that would be…a person’s married children and their spouses. Let’s give this the alias kids_and_spouses
for readability.
SELECT name, ->parent_of->person<->married<->person AS kids_and_spouses FROM person;
Response[ { kids_and_spouses: [], name: 'Adelaide' }, { kids_and_spouses: [], name: 'Bear knight' }, { kids_and_spouses: [], name: 'Bertha' }, { kids_and_spouses: [], name: 'Eagle knight' }, { kids_and_spouses: [], name: 'Fish knight' }, { kids_and_spouses: [ person:the_nobleman, person:the_noblewoman ], name: 'Mother of the Nobleman' }, { kids_and_spouses: [ person:bertha, person:fish_knight, person:eagle_knight, person:adelaide, person:wulfield, person:bear_knight ], name: 'The Nobleman' }, { kids_and_spouses: [ person:eagle_knight, person:adelaide, person:bertha, person:fish_knight, person:wulfield, person:bear_knight ], name: 'The Noblewoman' }, { kids_and_spouses: [], name: 'Wulfield' } ]
But we’re not done yet! If there are children that are married, how about those children too? We’ll stick that in and call it grandkids_and_spouses
.
SELECT name, ->parent_of->person<->married<->person AS kids_and_spouses, ->parent_of->person->parent_of->person<->married<->person AS grandkids_and_spouses FROM person;
The query result shows all three generations. The mother of the Nobleman has results inside both grandkids_and_spouses
and kids_and_spouses
, the Nobleman and his wife inside kids_and_spouses
, while the rest of the characters don’t have any children.
Response[ { grandkids_and_spouses: [], kids_and_spouses: [], name: 'Adelaide' }, { grandkids_and_spouses: [], kids_and_spouses: [], name: 'Bear knight' }, { grandkids_and_spouses: [], kids_and_spouses: [], name: 'Bertha' }, { grandkids_and_spouses: [], kids_and_spouses: [], name: 'Eagle knight' }, { grandkids_and_spouses: [], kids_and_spouses: [], name: 'Fish knight' }, { grandkids_and_spouses: [ person:bertha, person:fish_knight, person:eagle_knight, person:adelaide, person:wulfield, person:bear_knight ], kids_and_spouses: [ person:the_nobleman, person:the_noblewoman ], name: 'Mother of the Nobleman' }, { grandkids_and_spouses: [], kids_and_spouses: [ person:bertha, person:fish_knight, person:eagle_knight, person:adelaide, person:wulfield, person:bear_knight ], name: 'The Nobleman' }, { grandkids_and_spouses: [], kids_and_spouses: [ person:eagle_knight, person:adelaide, person:bertha, person:fish_knight, person:wulfield, person:bear_knight ], name: 'The Noblewoman' }, { grandkids_and_spouses: [], kids_and_spouses: [], name: 'Wulfield' } ]
With that practice under our belt, let’s get back to the story to see how it ends.
Finally there was some good news for the Nobleman and his wife, as they had another child: a son named Reginald. Reginald grew up to become a knight and set out to find his sisters.
Reginald discovered that an evil sorcerer had cursed their three husbands, who by now had children with their wives. The curse is why they transformed into angry beasts every once in a while.
Every time they were about to transform, they had to leave the castle and lock the doors so that they couldn’t get back in and hurt their family. They remembered who they were as beasts, but couldn’t stop themselves from trying to find and hurt their own family. What a sad life they led…
The rest of the story is too long to tell, but here is the short version.
Reginald eventually found the sorcerer’s castle and discovered that the sorcerer had already died in battle a few years before. Inside the castle was an object that held the curse of the three knights, which Reginald destroyed. The knights were freed, and everybody lived happily ever after.
Wonderful! This is a relief, as many traditional German tales have some pretty tragic endings. But not this one.
To finish up our practice on RELATE
in this chapter, here are some more things to know.
Multiple records can be related to each other in a single RELATE
statement. That means that we can set parent_of
or enemy_of
for all of the person
records in our story fairly quickly. This next example creates Reginald, along with the Sorcerer named Zornebock, who is everybody’s enemy.
CREATE person:reginald SET name = "Reginald"; RELATE [person:the_nobleman, person:the_noblewoman]->parent_of->person:reginald; CREATE person:sorcerer SET name = "Zornebock"; RELATE [ person:reginald, person:bear_knight, person:eagle_knight, person:fish_knight ]->enemy_of->person:sorcerer;
The query output shows this as well. Note the output for the enemy_of
relation that we created with a single statement, in which out
is always the same: person:sorcerer
. This makes sense, as he is everybody’s enemy.
Response[ { id: enemy_of:6lhupa0wf7b04lg76tf6, in: person:reginald, out: person:sorcerer }, { id: enemy_of:eqh0nq377cchsedacm6v, in: person:bear_knight, out: person:sorcerer }, { id: enemy_of:w7ec71o12wpysygg1l89, in: person:eagle_knight, out: person:sorcerer }, { id: enemy_of:ftlsang4teidppytx8rk, in: person:fish_knight, out: person:sorcerer } ]
Let’s imagine that there are a few more knights in the next town, but they have two sorcerer enemies instead of one. For simplicity’s sake, we’ll move away from the person
record (because it has a defined schema) and just call them knight
one and two, and sorcerer
one and two.
What do you think the output of this RELATE
statement will be?
RELATE [knight:one, knight:two]->enemy_of->[sorcerer:one, sorcerer:two];
The output is: four enemy_of
tables!
Response[ { id: enemy_of:ygldlu0xtj7diywntab4, in: knight:one, out: sorcerer:one }, { id: enemy_of:7w8zti2nzmv4a2xr3i5x, in: knight:one, out: sorcerer:two }, { id: enemy_of:9b78wjxh6o832yb2qtmw, in: knight:two, out: sorcerer:one }, { id: enemy_of:rzpn40icvhofyltw3fq6, in: knight:two, out: sorcerer:two } ]
The first interesting part about this output is that a RELATE
statement will return a number of records equal to the length of one side multiplied by the other. In this case, that means four records.
But the more interesting part might be that it worked in the first place. After all, we don’t even have any knight
or sorcerer
records in the database. But no matter: a graph table is independent of the existence of records at the in
and out
position.
At this point, a query to see the enemies of the knights will return nothing because there are no knight
records in the database.
SELECT ->enemy_of->sorcerer AS enemies FROM knight;
Response[]
Now let’s see what the output is after we create the two knights.
CREATE knight:one, knight:two; SELECT ->enemy_of->sorcerer from knight;
This time SurrealDB picks up that they are related to enemy_of
and looks for a sorcerer
record, but doesn’t see anything at the out
part of the graph.
Response[ { "->enemy_of": { "->sorcerer": [] } }, { "->enemy_of": { "->sorcerer": [] } } ]
Finally, if we CREATE sorcerer:one, sorcerer:two
, the query will now return them.
CREATE sorcerer:one, sorcerer:two; SELECT ->enemy_of->sorcerer FROM knight;
Response[ { "->enemy_of": { "->sorcerer": [ sorcerer:two, sorcerer:one ] } }, { "->enemy_of": { "->sorcerer": [ sorcerer:one, sorcerer:two ] } } ]
For the last challenge in this chapter, let’s try to come up with a mind-bendingly long query just for the sake of practice. This next query will return a result in some cases, and not in others. What do you think it will return?
SELECT name, <-parent_of<-person <-parent_of<-person ->parent_of->person ->parent_of->person <->enemy_of<->person FROM person;
Let’s parse this one step at a time as we always do.
person
record<-parent_of<-person
: who is a child of a person (from child to parent)<-parent_of<-person
: who is a child of yet another person (from child to parent again)->parent_of->person
: who is the parent of a person (from parent to child)->parent_of->person
: who is a parent of yet another person (from parent to child again)<->enemy_of<->person
: Finally, who are the people in an enemy relationship from that person.The result is all the records somehow involved in an enemy relation that are grandchildren of your grandparents - possibly including you.
Let’s give this incredibly large line an alias called grandparent_to_grandchild_enemies
to make it readable. Here is the output:
Response[ { grandparent_to_grandchild_enemies: [ person:reginald, person:sorcerer ], name: 'Adelaide' }, { grandparent_to_grandchild_enemies: [], name: 'Bear knight' }, { grandparent_to_grandchild_enemies: [ person:reginald, person:sorcerer ], name: 'Bertha' }, { grandparent_to_grandchild_enemies: [], name: 'Eagle knight' }, { grandparent_to_grandchild_enemies: [], name: 'Fish knight' }, { grandparent_to_grandchild_enemies: [], name: 'Mother of the Nobleman' }, { grandparent_to_grandchild_enemies: [ person:reginald, person:sorcerer ], name: 'Reginald' }, // ... and so on ... ]
We see a response for Adelaide, because she has grandparents who have grandchildren who have an enemy. The same goes for the other daughters as well as Reginald. The knights do not show anything because, although they are enemies of the sorcerer, they are not the grandchildren of anyone in particular in our database.
We learned how to create a schema for our record types a few chapters ago, and graph tables can be made SCHEMAFULL in the same way too. And without a schema, you can do anything you like.
That means that not only can we marry the Nobleman to more than individual, his new spouses don’t even have to be people!
CREATE cat:mr_meow, cat:mrs_meow; RELATE [cat:mr_meow, cat:mrs_meow]->married->person:the_nobleman; SELECT in, out FROM married WHERE in = person:the_nobleman OR out = person:the_nobleman;
Response[ { in: cat:mrs_meow, out: person:the_nobleman }, { in: person:the_nobleman, out: person:the_noblewoman }, { in: cat:mr_meow, out: person:the_nobleman } ]
Another interesting behaviour that we’d like to prevent is the fact that relations can be created in the same way as many times as we like, even for the same records:
CREATE dog:mr_bark; CREATE food:biscuit; RELATE dog:mr_bark->likes->food:biscuit; RELATE dog:mr_bark->likes->food:biscuit; RELATE dog:mr_bark->likes->food:biscuit; SELECT ->likes->food FROM dog;
When you query for Mr. Bark’s favourite food, the answer should be “biscuits”, not “biscuits biscuits biscuits”. That is exactly what the output below shows: three results when there should only be one.
Response[ { "->likes": { "->food": [ food:biscuit, food:biscuit, food:biscuit ] } } ]
We can define such tables to ensure that in
and out
must be unique, thus only having a single record ID in them.
You might think that DEFINE FIELD
is the way to do it using a statement like the following:
DEFINE FIELD in ON TABLE married UNIQUE;
But it’s actually done through a statement called DEFINE INDEX
! This is because uniqueness for databases is most easily done through the creation of an index, as opposed to a full table scan which would be much less efficient. Indexes can be created for other reasons such as improved performance and don’t need to be set as UNIQUE
, but at the moment we are focusing on creating a unique constraint so we will leave the discussion of indexes as a whole to another day.
In any case, the syntax starts in this way:
DEFINE INDEX index_name ON TABLE table_name FIELDS [field names]
We have the choice to add the UNIQUE
keyword at this point, which is what we want. You will probably find yourself using UNIQUE
frequently for fields related to emails, addresses, IDs, names, and anything else that only one record should have.
All together, our DEFINE INDEX
statement looks like this.
DEFINE INDEX only_unique ON TABLE likes FIELDS in, out UNIQUE;
When we execute this statement, SurrealDB returns an error - which is a good thing!
Response'Database index `only_unique` already contains [dog:mr_bark, food:biscuit], with record `likes:4v7y0y9424nysk5paagp`'
That is because the likes
table already contains more than one instance of dog:mr_bark
as in
and food_biscuit
at out
. We can SELECT * FROM likes LIMIT 2
to find two of the existing record IDs and give that to a DELETE
statement to get rid of them.
DELETE (SELECT * FROM likes LIMIT 2) RETURN BEFORE;
Now that the extra likes
tables are deleted, we can define the index.
DEFINE INDEX only_unique ON TABLE likes FIELDS in, out UNIQUE;
Having done so, it is now impossible for us to specify that Mr. Bark likes biscuits more than once.
RELATE dog:mr_bark->likes->food:biscuit;
Response'Database index `only_unique` already contains [dog:mr_bark, food:biscuit], with record `likes:4u0yky84lrspxwv35pre`'
Now what if we want to define a unique index on a relation of two equals, like in a friends_with
or married_to
table? In those relationships, it doesn’t matter who is in
or out
, because friendship isn’t about “befriending A” from one side and “being befriended by B” on the other.
We can start by defining a unique index on the in
and out
fields as before:
DEFINE INDEX only_unique ON TABLE friends_with FIELDS in, out UNIQUE;
However, because we are never certain who is in
or out
in a friendship, that means that we can befriend the same two records twice through friends_with
by switching the order around! It won’t be until the third RELATE
statement that it will violate the unique index that we defined.
RELATE cat:one->friends_with->cat:two; RELATE cat:two->friends_with->cat:one; RELATE cat:one->friends_with->cat:two;
Response-------- Query -------- [ { id: friends_with:nsjdgitzyg5tw6df1f9q, in: cat:one, out: cat:two } ] -------- Query -------- [ { id: friends_with:xmk8xkngz3mt9vpa87wd, in: cat:two, out: cat:one } ] -------- Query -------- 'Database index `only_unique` already contains [cat:one, cat:two], with record `friends_with:nsjdgitzyg5tw6df1f9q`'
Fortunately, there is a fairly straightforward way to solve this problem. Since graph tables can contain their own fields, we can add a field (we’ll call it key
) that holds both in
and out
, but sorted through the array::sort()
method. Since we are only using this field to check for uniqueness, we could also cast into a string which is easier to read than an array. Here is a quick demonstration of the output from this method.
RETURN [<string>[cat:one, cat:two].sort(), <string>[cat:two, cat:one].sort()];
Response[ '[cat:one, cat:two]', '[cat:one, cat:two]' ]
So all we have to do is define the field, and then add an index ta ensures that this field must be unique. Note that there is no need to specify the type because it is generated automatically, but you could add TYPE option<string>
if you like.
DEFINE FIELD key ON TABLE friends_with VALUE <string>[in, out].sort(); DEFINE INDEX only_unique ON TABLE friends_with FIELDS key UNIQUE;
With this unique key set up, we are now unable to create weird double friendships when a friendship has already been formed.
RELATE cat:one->friends_with->cat:two; RELATE cat:two->friends_with->cat:one;
Response-------- Query -------- [ { id: friends_with:g85n08h0kpwrnf6n3sui, in: cat:one, key: '[cat:one, cat:two]', out: cat:two } ] -------- Query -------- "Database index `only_unique` already contains '[cat:one, cat:two]', with record `friends_with:g85n08h0kpwrnf6n3sui`"
Well done! Here is the full dataset for your reference.
This concludes our first mini-project. Hopefully SurrealDB’s ->
and <-
syntax has become familiar over the last four chapters. If not, don’t worry: we still haven’t even approached the halfway point of the book and there are many similar queries to come.
user
?An index with UNIQUE
will do the trick. You’ll probably also want to make sure that the field is a string.
DEFINE FIELD email ON TABLE user TYPE string; DEFINE INDEX only_unique_emails ON TABLE user FIELDS email UNIQUE;
Hint: See if you can find a function that can make this work.
The function to make this assertion is found inside SurrealDB’s string functions. Using one of them called string::is::email()
will do the trick:
DEFINE FIELD email ON TABLE user TYPE string ASSERT $value.is_email();
Here’s some sample data to get you started.
CREATE star SET name = "The Sun"; CREATE planet SET name = "Saturn"; CREATE moon SET name = "Titan"; CREATE moon SET name = "Mimas"; RELATE (SELECT * FROM planet WHERE name = "Saturn")->orbits->(SELECT * FROM star WHERE name = "The Sun"); RELATE (SELECT * FROM moon WHERE name IN ["Titan", "Mimas"])->orbits->(SELECT * FROM planet WHERE name = "Saturn"); CREATE star SET name = "Proxima Centauri"; CREATE planet SET name = "Proxima Centauri b"; CREATE moon SET name = "Distant moon"; RELATE (SELECT * FROM planet WHERE name = "Proxima Centauri b")->orbits->(SELECT * FROM star WHERE name = "Proxima Centauri"); RELATE (SELECT * FROM moon WHERE name = "Distant moon")->orbits->(SELECT * FROM planet WHERE name = "Proxima Centauri b");
One way to do it is to start at the moon
records, follow the path to see what planets they orbit, and then follow the path once more to see what stars the planets orbit. You’ll now find yourself at the star
records, after which you can filter by name.
SELECT * FROM moon WHERE "The Sun" IN ->orbits->planet->orbits->star.name; SELECT * FROM moon WHERE "Proxima Centauri" IN ->orbits->planet->orbits->star.name;
The count()
function and two graph queries will do the trick here. One uses <-
to see what the planet is being orbited by, and the other uses ->
to see what it is orbiting.
SELECT * FROM planet WHERE count(<-orbits) > 1 AND "The Sun" in ->orbits->star.name;
You could do it with the following query that goes down one level, and then goes down two levels.
SELECT *, <-orbits<-planet.name, <-orbits<-planet<-orbits<-moon.name FROM star;
Note that you could use an alias here, but not using an alias will combine the structure of the two graph queries and make it clear which moons are orbiting which planets.
SELECT *, <-orbits<-planet.name AS planet_names, <-orbits<-planet<-orbits<-moon.name AS moon_names FROM star;