SurrealDB University is LIVE! Master the future of data

Chapter 5: A long road to travel Chapter 5: A long road to travel
Back to Courses

Chapter 5: A long road to travel

Time elapsed: 2w

You are feeling a little frustrated today.

Whenever you learn a new skill, the first stage is all newness and excitement. But the stage that follows isn’t so exciting. It’s when you realize just how much you don’t know, and how gigantic the task is ahead of you.

You still can’t see how your new skills will let you rebuild the world as it once was. All you can do is work with records. And each field can hold any type of data at all. Won’t software that relies on predictable data not know what to do with it?

And what about records that are related to each other? How do you query records that should be connected in some way? Shouldn’t you be able to query a town and find all its people too without needing a separate query?

It feels like you are missing out on something important.

Looking up, you see one of the pictures from one of the books that has given you such comfort during this time.

The people of the old world were so good at combining precision with beauty, at blending technology and art. How did they do it?

You take a sip of tea, put your feet up on the desk, and close your eyes. Hmm…

A few minutes pass. Suddenly your stubbornness kicks in, and you begin to feel better. There must be a way! The databases made by the ancients must have had easy ways to join related records together. There’s no way that they built such miracles without technology that was well designed. All you have to do is keep learning, and you’ll find the answer.

And in the meantime, why not make the process a little more fun? You open up one of the books and start reading the story. Perhaps the characters inside will give you some ideas for how to build something real this time.

Schema and relations

With our introduction to SurrealDB done, we are going to move on to a small project that takes place over the next four chapters. We are going to follow along with Aeon who is reading through a story called The Enchanted Knights in the hopes of learning how to work with data that is more predictable and more closely linked.

The story will provide Aeon (and us) with a framework in which we have some real data with some complexity to represent.

Let’s start reading the story to see what ideas it gives us. It begins with a single character:

There was once a rich nobleman who lived like a king. He spent his days celebrating and enjoying his vast fortune, and all who came to his parties were given the best food and entertainment.

Let’s stop right there, as we already have some decisions to make. How should we represent this character? A person table sounds good, but should we choose our own ID, like this?

CREATE person:the_nobleman;

Generally, we would not. Custom IDs in SurrealDB are most useful when we want to iterate over or use them somehow, such as when using record range queries that we saw once and will learn more about in Chapter 10:

CREATE |event:1..10| SET ...
SELECT * FROM event:1..5;

However, we are going to learn a lot of new syntax in these chapters, and using human-readable IDs will allow us to remove WHERE clauses and concentrate on the new syntax. So we will go with record IDs like person:the_nobleman.

Having decided that, let’s give this character a few fields to start:

CREATE person:the_nobleman SET name = "The Nobleman", class = "Count", money = 100000; -- Just a made-up number

So far so good. Now let’s read the next sentence in the story.

In doing so, he wasted his money, and eventually lost his fortune…

Oh no! Just look at him now.

Updating his money is fairly easy: all we need to do is use the UPDATE keyword, followed by SET for the money field.

CREATE person:the_nobleman SET name = "The Nobleman", class = "Count", money = 100000;
UPDATE person:the_nobleman SET money = 50;
Response to UPDATE statement
[ { class: 'Count', id: person:the_nobleman, money: 50, name: 'The Nobleman' } ]

This works well, but remember Aeon’s concern about predictable data? At the moment SurrealDB lets us set fields on the person table to anything at all! We could have done this instead.

CREATE person:the_nobleman SET name = "The Nobleman", class = "Count", money = 100000; UPDATE person:the_nobleman -- Whoops, updated with weather data SET money = { temperature: 15.5, location: "Toria" };
Response to UPDATE statement
[ { class: 'Count', id: person:the_nobleman, money: { location: 'Toria', temperature: 15.5f }, name: 'The Nobleman' } ]

And now we’ll get an error if we try to update the Nobleman with a bit of extra money later on.

CREATE person:the_nobleman SET name = "The Nobleman", class = "Count", money = 100000; UPDATE person:the_nobleman SET money = { temperature: 15.5, location: "Toria" }; UPDATE person:the_nobleman SET money = money + 10;
Response to UPDATE statement
"Cannot perform addition with '{ location: 'Toria', temperature: 15.5f }' and '10'"

This is exactly what Aeon was worried about. We already have an idea of what a person table’s fields should and shouldn’t be, so let’s put a schema together to ensure that abnormal behaviour doesn’t happen.

Schema basics

Do you remember the INFO FOR DB statement we learned in the first chapter? Inside the output for this statement is a list of tables. One was automatically created for us after the CREATE person statement above, showing that it has no defined schema:

tables: { person: 'DEFINE TABLE person TYPE ANY SCHEMALESS PERMISSIONS NONE', }

At this point, there are two ways to make our schema more strict.

  • Defining a table as SCHEMAFULL. This is the most strict method. After a table is made SCHEMAFULL, fields can only be added through DEFINE FIELD statements.
  • Only using DEFINE FIELD statements. This defines the fields of a table one at a time, but leaves the table itself schemaless.

We’ll choose the first method so that we can understand how SurrealDB works at its strictest point.

Making the person table schemafull is easy: all we need is a simple DEFINE TABLE statement.

Let’s try copying and pasting the statement we saw in the INFO FOR DB statement but changing SCHEMALESS TO SCHEMAFULL;

CREATE person;
DEFINE TABLE person TYPE ANY SCHEMAFULL PERMISSIONS NONE;

Close, but not quite! The table already exists in the database, so SurrealDB won’t let us use DEFINE on its own to define it.

"The table 'person' already exists"

However, we can add the word OVERWRITE to force SurrealDB to accept this as the new definition.

CREATE person;
DEFINE TABLE OVERWRITE person TYPE ANY SCHEMAFULL PERMISSIONS NONE;

And now person is schemafull! But we haven’t specified any fields, making it incredibly strict. A person table won’t be able to have any fields at all, besides id.

So if we try to create a person with a name property, will the query work or fail?

DEFINE TABLE OVERWRITE person TYPE ANY SCHEMAFULL PERMISSIONS NONE;
CREATE person SET name = "Random person";

Interestingly, it works. SurrealDB will simply look over the fields we try to SET, not find them in the schema, and return a record with the only field that any and all records have: an id.

Response
[ { id: person:zndj3pw8kqkoth7sa09y } ]

Now that our DEFINE TABLE statement is done, we will need to use some DEFINE FIELD statements so that we can add fields to our person records. DEFINE FIELD is followed by ON, the table name, and then — if you want — TYPE with the type a field must be. We’ll give this a try:

CREATE person; DEFINE TABLE OVERWRITE person TYPE ANY SCHEMAFULL PERMISSIONS NONE; DEFINE FIELD name ON table person TYPE string; DEFINE FIELD class ON table person TYPE string; DEFINE FIELD money ON table person TYPE int;

And now we’ll try to create our random person again:

CREATE person SET name = "Random person";
Response
'Found NONE for field `class`, with record `person:dqd4znq4rlcqyximul1i`, but expected a string'

The new type strictness is pretty nice, but the fields class and money could be a little more lenient. It’s likely that we will start adding person records that don’t have this information, so let’s set the schema so that class must be a string if it exists, and money must be an int if it exists. To do this, we can turn TYPE string and TYPE int into TYPE option<string> and TYPE option<int>.

DEFINE FIELD OVERWRITE class ON TABLE person TYPE option<string>;
DEFINE FIELD OVERWRITE money ON TABLE person TYPE option<int>;

The keywords DEFAULT and VALUE are two other possibilities that you might want to think about when defining a field. DEFAULT lets you set a default value, in which case you wouldn’t need a type to be an option, while VALUE simply declares what the value will be.

We wouldn’t want to use VALUE 0 for the money field, because then everybody’s money would always be set to 0.

DEFINE FIELD OVERWRITE money ON TABLE person TYPE int VALUE 0;

But DEFAULT 0 could work, which means that everyone gets 0 unless specified otherwise.

DEFINE FIELD OVERWRITE money ON TABLE person TYPE int DEFAULT 0;

However, option makes the most sense for us for both of these fields, because as we simply might not know a person’s class or amount of money and don’t want to force a person record to have a value in that case.

With the optional fields in place, we can now create a person who has a name and nothing else.

DEFINE FIELD name ON table person TYPE string; DEFINE FIELD OVERWRITE class ON TABLE person TYPE option<string>; DEFINE FIELD OVERWRITE money ON TABLE person TYPE option<int>; CREATE person SET name = "Random person";
Response
[ { id: person:hnufh8xe74g94y8u0zyy, name: 'Random person' } ]

Schema assertions

We can make our schema even more strict if we like by adding assertions. Assertions let us check to see if data is the way it should be, while erring if it doesn’t match our expectations.

For example, you might want to assert that a number is greater than zero, or that a name is less than 30 characters in length.

Assertions can be created by adding ASSERT after defining a field. Inside a DEFINE FIELD statement you have access to the value of the field through a parameter called $value. And if you need to involve the table that the field is defined on, you can access it using the parameter $this.

What sort of assertions would we like to use?

Well, a person’s name probably shouldn’t be something like ”😊“. There is a function called string::is::ascii() that could work, but names can be accented so let’s not limit them to ASCII. Fortunately, there is another function called string::is::alpha() that will tell us if a string is alphabetic or not.

Now, there is one problem with using this function. Can you see it?

RETURN "Adrian".is_alpha();
RETURN "Adrian Fahrenheit Ţepeş".is_alpha();
Response
true false

A blank space is not an alphabetic character, so the second query returned false. To solve this, we have another function called string::replace() that can help us. This function takes a string to work on, followed by a string of a pattern to look for, and a third string to replace it with. Let’s use it twice to remove spaces and hyphens.

RETURN "Adrian Fahrenheit Ţepeş".replace(" ", "").replace("-", "");
Response
'AdrianFahrenheitŢepeş'

And now, all we have to do is follow this output with the original string::is::alpha() function! Now names with spaces and hyphens will be recognized as valid input.

"Adrian Fahrenheit Ţepeş" .replace(" ", "") .replace("-", "") .is_alpha(); "Karl-Theodor Maria Nikolaus Johann Jacob Philipp Franz Joseph Sylvester Buhl-Freiherr von und zu Guttenberg" .replace(" ", "") .replace("-", "") .is_alpha();

Finally, we can move this assertion to the DEFINE FIELD statement for the person table:

DEFINE FIELD OVERWRITE name ON TABLE person TYPE string ASSERT $value.replace(" ", "").replace("-", "").is_alpha();

Now we aren’t permitted to create person records with numbers, emojis and anything else that isn’t alphabetic:

DEFINE FIELD OVERWRITE name ON TABLE person TYPE string ASSERT $value.replace(" ", "").replace("-", "").is_alpha(); CREATE person SET name = '😊'; CREATE person SET name = '11'; CREATE person SET name = '***';

All of these return the same error:

"Found '😊' for field `name`, with record `person:epi0ihio6tg788b7fp15`, but field must conform to: $value.replace(' ', '').replace('-', '').is_alpha()"

Let’s make one more assertion for practice: that money must be at least zero.

DEFINE FIELD OVERWRITE money ON TABLE person TYPE option<int> ASSERT $value >= 0;

Fortunately, SurrealDB is smart enough to know that NONE must always be an acceptable value for an option. Thanks to that, the assertion only needs to be $value >= 0 and not $value IS NONE OR $value >= 0. This extra smartness was added in SurrealDB 2.0, so you may still see ASSERT $value IS NONE OR... in some .surql code written for previous versions.

With the above assertion we can now make a person with unspecified money, or a person with money, but not a person with negative money.

DEFINE FIELD OVERWRITE money ON TABLE person TYPE option<int> ASSERT $value >= 0; CREATE person SET name = "Brother of Nobleman"; UPDATE person SET money = -1 WHERE name = "Brother of Nobleman";
Response
[ { id: person:sgmyqajk13q191qo50bv, name: 'Brother of Nobleman' } ] 'Found -1 for field `money`, with record `person:rtw1gp8cxac8b5l9dutl`, but field must conform to: $value >= 0 OR $value = NONE'

Relations

We learned quite a bit from reading just a single line of the story! It’s finally time to get to the next line:

All he had left was an old castle, his wife, and three daughters: Wulfield, Adelaide, and Bertha.

‎ The nobleman became unhappy and spent his days inside his empty castle in a bad mood.

This brings us to one of SurrealDB’s greatest strengths: relations. Let’s first think about the relations that the Nobleman has.

  • He owns one castle, which is a property.
  • He is connected to four other people: his wife and three daughters.

His wife is also connected to him, and his wife is connected to the castle and daughters in the same way. Visually, the connection between all of these records would look something like this.

We’ll start with the castle. A castle is one type of building, so maybe we could create a building table that can be either a castle, or a house, or something else. We don’t know much about the Nobleman’s castle, but we can make it a building that has a name and a kind:

CREATE building:old_castle SET name = "Old castle", kind = "castle";
Response
[ { id: building:old_castle, kind: 'castle', name: 'Old castle' } ]

Since we know how to define a table and its fields, let’s do that here. First we’ll define the table as SCHEMAFULL and define its name field:

DEFINE TABLE OVERWRITE building SCHEMAFULL;
DEFINE FIELD name ON TABLE building TYPE string;

Next is the kind field. It would be nice to ensure that this kind can only be a certain value, like a house, castle, and so on. We wouldn’t like to have a building that has "cat" or '2024-06-24T05:31:17.715Z' as its building kind.

We can do that by using SurrealQL’s built-in CONTAINS or IN keywords, which are mirror images of each other.

RETURN ["cat", "dog"] CONTAINS "dog";
RETURN "dog" IN ["cat", "dog"];
Response
true true

Knowing this keyword, we can define our kind field:

DEFINE FIELD kind ON TABLE building TYPE string ASSERT ["house", "castle"] CONTAINS $value;

As planned, we are now unable to create a building that is a cat!

CREATE building SET name = "Mr. Meow", kind = "Cat";
Response
"Found 'Cat' for field `kind`, with record `building:wwpx5cuxszrrx3pa6te7`, but field must conform to: ['house', 'castle'] CONTAINS $value"

Now for the interesting part. How do we link the Nobleman (a person) to the castle (a building)?

In SurrealDB, there are two main ways to do this. The first method is to use record links.

A record link in SurrealDB is pretty simple: it’s a field that contains one or more record IDs. If you have a record with such a field, you can directly access the records they link to and the data inside.

Let’s give this a try with two schemaless tables first, a cat and its two types of catfood.

First we’ll create the two types of cat food:

CREATE catfood:dry SET name = 'Dry food';
CREATE catfood:wet SET name = 'Wet food';

And then a cat that is linked to them. The linking couldn’t be easier!

CREATE cat SET name = "Mr. Meow", foods = [catfood:dry, catfood:wet];
Response
[ { foods: [ catfood:dry, catfood:wet ], id: cat:gskuz7cevga8cq30hhp2, name: 'Mr. Meow' } ]

The foods will now show up in a query in the same way that another field will:

SELECT name, foods FROM cat;
Response
[ { foods: [ catfood:dry, catfood:wet ], name: 'Mr. Meow' } ]

We can also use the . operator to walk these links and access its fields too. As you can see, adding record links is almost like turning multiple records into a single record.

SELECT name, foods.name FROM cat;
Response
[ { foods: { name: [ 'Dry food', 'Wet food' ] }, name: 'Mr. Meow' } ]

Note that foods.name has maintained the path we took to reach that information: first through “foods”, then into “name”, which finally leads to an array of strings. If you want to collapse this structure, you can use the AS keyword to create an alias.

SELECT name, foods.name AS foods FROM cat;

And now both foods and name are on the same level.

Response
[ { foods: [ 'Dry food', 'Wet food' ], name: 'Mr. Meow' } ]

So that was how to create record links when we don’t have a schema. But with our SCHEMAFULL person record, we will need to define this field ahead of time. What type should we give it? Let’s think:

  • Not everybody owns property, so it should be an option.
  • People can have more than one property, so we want an array.
  • And the type name for record IDs is record<table_name>, so this array will hold a number of record<building>.

All together, that means an option<array<record<building>>>. The type name is a bit long, but just means “multiple building IDs that might or might not exist”. The definition looks like this:

DEFINE FIELD properties ON TABLE person TYPE option<array<record<building>>>;

With that done, we can now link the Nobleman to his one lonely castle. All we have to do is put the castle ID inside an array and we are done!

UPDATE person:the_nobleman SET properties = [building:old_castle];

Now let’s do a query on the Nobleman to show the pitiful state that he is in.

SELECT name, money, properties.name AS castles FROM person;
Response
[ { castles: [ 'Old castle' ], money: 50, name: 'The Nobleman' } ]

Poor nobleman!

If we want to see every field of the Nobleman’s properties, we can use the star operator in a different way.

If the properties field on person were a single record link, we could change properties.name to properties.* to show all of the fields of the building record.

SELECT name, money, properties.* FROM person;
Response
[ { money: 50, name: 'The Nobleman', properties: [ building:old_castle ] } ]

However, because the properties field is an array of record links, we need to add another .* to the end. The first .* means to access each item in the array, and the second means to access every field for each of those items.

SELECT name, money, properties.*.* FROM person;

SurrealDB also includes a keyword FETCH inside SELECT statements which is used to fetch and replace these fields and nested fields.

SELECT name, money, properties FROM person FETCH properties;

Both of these queries return the following output.

[ { money: 50, name: 'The Nobleman', properties: [ { id: building:old_castle, kind: 'castle', name: 'Old castle' } ] } ]

You’ll find the FETCH keyword particularly convenient when you are using the * operator to access all the fields of a record and want to include all of the fields of its record links instead of just their IDs.

SELECT * FROM person FETCH properties;

This brings us to an interesting discussion, because record IDs only go in one direction. You can go from person to building, but not from building to person because building doesn’t have an owner field unless we add it.

That’s fine for person and building relationship, but what about person to person? If the Nobleman is a person who might have record IDs to his daughters (perhaps under a field named daughters), what about the records for his daughters? Would the person table now also need a parents field?

Fortunately, SurrealDB has a second method to link records together called a graph table, and that allows us to create tables based on the relation between two tables. And it’s this relation that allows bidirectional querying.

That will be a longer discussion, so we will get into it in the next chapter.

INFO FOR TABLE and properties inside arrays

We’ll end the chapter wih a quick hint about the INFO statement.

In addition to the INFO FOR DB statement that we have used a number of times already, there is also INFO FOR table that lets us see all of the statements used to define our tables.

Right now, our tables only have a few defined fields. INFO FOR TABLE building shows the kind and name fields that we have defined:

fields: { kind: "DEFINE FIELD kind ON building TYPE string ASSERT ['house', 'castle'] CONTAINS $value PERMISSIONS FULL", name: "DEFINE FIELD name ON building TYPE string PERMISSIONS FULL" }

Nothing surprising there.

INFO FOR TABLE person is a little more interesting, however. Do you notice anything unexpected here? Take a look at the last two statements:

fields: { class: 'DEFINE FIELD class ON person TYPE option<string> PERMISSIONS FULL', money: 'DEFINE FIELD money ON person TYPE option<int> ASSERT $value >= 0 OR $value = NONE PERMISSIONS FULL', name: "DEFINE FIELD name ON person TYPE string ASSERT $value.replace(' ', '').replace('-', '').is_alpha() PERMISSIONS FULL", properties: 'DEFINE FIELD properties ON person TYPE option<array<record<building>>> PERMISSIONS FULL', "properties[*]": 'DEFINE FIELD properties[*] ON person TYPE record<building> PERMISSIONS FULL' }

The properties field at the very end contains two statements instead of one! This is expected behaviour, however: here SurrealDB is simply defining a field called properties that is an array, along with a properties[*] field to define the type inside the array.

We can experiment with this a little by quickly creating a SCHEMAFULL table that contains a single field that is an array of an array of an array of an array<int>.

DEFINE TABLE sometable SCHEMAFULL;
DEFINE FIELD lots_of_arrays ON sometable TYPE array<array<array<array<int>>>>;

Now let’s check the output from INFO FOR TABLE sometable. Our single DEFINE FIELD has been turned into five DEFINE statements! And the statement at the very end is the one that defines the type that the arrays actually contain, a simple int.

DEFINE TABLE sometable SCHEMAFULL; DEFINE FIELD lots_of_arrays ON sometable TYPE array<array<array<array<int>>>>; INFO FOR TABLE sometable;
{ events: {}, fields: { lots_of_arrays: 'DEFINE FIELD lots_of_arrays ON sometable TYPE array<array<array<array<int>>>> PERMISSIONS FULL', "lots_of_arrays[*]": 'DEFINE FIELD lots_of_arrays[*] ON sometable TYPE array<array<array<int>>> PERMISSIONS FULL', "lots_of_arrays[*][*]": 'DEFINE FIELD lots_of_arrays[*][*] ON sometable TYPE array<array<int>> PERMISSIONS FULL', "lots_of_arrays[*][*][*]": 'DEFINE FIELD lots_of_arrays[*][*][*] ON sometable TYPE array<int> PERMISSIONS FULL', "lots_of_arrays[*][*][*][*]": 'DEFINE FIELD lots_of_arrays[*][*][*][*] ON sometable TYPE int PERMISSIONS FULL' }, indexes: {}, lives: {}, tables: {} }

These fields marked with [*] are automatically generated by SurrealDB, so expect to see them in an INFO FOR TABLE statement if you have an array. You don’t ever have to type them out yourself.

Practice time
1. How would you make sure that a field on a table is a string composed of only numbers?
Hint

Check SurrealDB’s string functions to see if you can find a function to do the job.


Answer

The function string::is::numeric() will do the job. Calling it on $value will have it check the value of the field itself to ensure that the input is all numeric.

DEFINE FIELD numstring ON TABLE user ASSERT $value.is_numeric();

Answer

You could do this by creating a user, then a comment record and then updating it. You could use the += syntax which will be recognized as an array, or outright create an array by putting the comment inside [].

CREATE user:one SET name = "One"; CREATE comment:one SET content = "Go local sports team!"; UPDATE user:one SET comments += comment:one; // Or: UPDATE user:one SET comments = [comment:one]; SELECT name, comments.content AS comments FROM user;

3. For some reason you wanted a table that can only hold a simple ID, but then noticed that it can still hold complex data.
DEFINE TABLE no_data SCHEMAFULL; CREATE no_data:{ actually: "Has", a_lot_of: "data" }; CREATE no_data:["lots", "of", "daaaataaaa"];

What can you do?

Answer

Since complex record IDs are either objects or arrays, we can assert that an ID is not either of these types. Note the ! in front of the function, which means “not”.

The function record::id() gives us access to the id part of the record ID, while the functions type::is::object() and type::is::array() let us ensure that this id isn’t either of these two types.

DEFINE FIELD id ON no_data ASSERT !$value.id().is_object() AND !$value.id().is_array();

4. If you had a person table with the fields first_name and last_name, how could you automatically generate a full_name using them?
Answer

Since $this gives access to the parent table, we can use that.

DEFINE FIELD first_name ON TABLE person TYPE string; DEFINE FIELD last_name ON TABLE person TYPE string; DEFINE FIELD full_name ON TABLE person VALUE $this.first_name + ' ' + $this.last_name;

Answer

Nope! As long as there is data at the end of the path, there is no limit to the number of . used to reach it.

For a fun demonstration, see the code below that creates ten people and gives each of them two friends. The function rand::enum() is used here to make a random pick from the elements of an array.

CREATE |person:1..5|; FOR $person IN (SELECT id FROM person) { LET $all_people = (SELECT VALUE id FROM person); UPDATE $person SET friends += rand::enum($all_people); UPDATE $person SET friends += rand::enum($all_people); };

Once they are created, you can see their friends, their friends’ friends, and friends’ friends’ friends, and so on.

SELECT id, friends, friends.friends AS second_degree, friends.friends.friends AS third_degree FROM person;

This code was meant to be as easy as possible to read in this early chapter. For a challenge, see if you can find out how to only allow people to be friends with one person once, and to disallow people being friends with themselves.