SurrealDB University is LIVE! Master the future of data

Chapter 3: Gone in an instant Chapter 3: Gone in an instant
Back to Courses

Chapter 3: Gone in an instant

Time elapsed: 1d

You learned quite a bit about SurrealDB yesterday, shut off the computer and went to bed feeling quite satisfied. This technology has so much potential!

But the next day welcomed you with a most unpleasant surprise: all the data you created is now gone!

It was all just experimental data, but still…

Though you have been using computers for a full year now, the idea of losing that much information in a single moment is still a shock. Computers can both create and destroy data at such an incredible rate.

As a child, you once heard about a library in the mainland across from Toria that went up in flames. Though they managed to save some of the books, the rest were reduced to ashes. The thought of losing so much knowledge in an instant still shocks you. And with computers, one false keystroke and everything is just gone!

You calm yourself and begin looking through the documentation again. You soon find out why your data disappeared: the database you started yesterday with surreal start --user root --pass root only kept the data in memory! That means that the information was never saved to a file. It looks like SurrealDB has a number of solutions to storing data over the long term…

SurrealDB architecture

SurrealDB is divided into two layers: a query layer (or a compute layer), and a storage layer. These are kept separate from one another, because SurrealDB has a large number of options for the storage layer, but always the same behaviour when writing queries. This lets the database feel exactly the same no matter what storage layer is being used.

Thus far we have used either the Sandbox inside Surrealist, or SurrealDB locally with commands like the following.

surreal start --user root --pass root

This command has always created an in memory database for us to use, which disappears every time we disconnect from the database. We could have also typed the following to have the same effect:

surreal start --user root --pass root memory

But memory is the default option, so SurrealDB will use in-memory storage even if you don’t specify it.

As we noted in Chapter 1, SurrealDB will let us know which storage layer is being used when it starts up. Here is the startup output again:

2024-06-14T02:21:20.178755Z INFO surreal::env: Running 2.0.3 for windows on x86_64 2024-06-14T02:21:20.178923Z INFO surrealdb_core::kvs::ds: Starting kvs store in memory 2024-06-14T02:21:20.179000Z INFO surrealdb_core::kvs::ds: Started kvs store in memory 2024-06-14T02:21:20.179599Z INFO surrealdb_core::kvs::ds: Credentials were provided, and no root users were found. The root user 'root' will be created 2024-06-14T02:21:20.217989Z INFO surrealdb::net: Started web server on 127.0.0.1:8000

SurrealDB offers a large number of options for storage, but their use cases are pretty clear so making a decision between them is not usually that hard.

Some of the options are:

  • RocksDB: Simple, performant storage to disk.
    • SurrealKV: A low-level, persistent, embedded key-value database implemented by SurrealDB in Rust. Developed recently by SurrealDB itself, it offers some interesting features like versioning (being able to see a table’s data at a certain date).
  • IndexedDB: Storage inside the user’s web browser.
  • TiKV: Storage when using distributed mode (storage over multiple computers or a network of computers).
  • FoundationDB: Another distributed option with similar use cases as TiKV.

If you want to simply keep your data on your computer even after disconnecting, then SurrealKV or RocksDB are the easiest ways to do that.

To go with this choice, we can replace memory with some other path, depending on our choice of storage. For SurrealKV, you can use surrealkv: followed by any file name we would like to give our database, like surrealkv:surrealbook.db. Let’s try that out!

surreal start --user root --pass root surrealkv:surrealbook.db

You’ll see that the Starting kvs store in memory has changed to Starting kvs store at surrealkv://surrealbook.db. Now let’s open up Surrealist, connect to 127.0.0.1:8000 (or localhost:8000), and do a very simple query:

CREATE person:aeon;

Now use ctrl-c to disconnect from the database and reconnect again. This time you’ll see some interesting in the output:

2024-02-27T06:12:36.502417Z WARN surrealdb_core::kvs::ds: Credentials were provided, but existing root users were found. The root user 'root' will not be created 2024-02-27T06:12:36.502591Z WARN surrealdb_core::kvs::ds: Consider removing the --user and --pass arguments from the server start command

That was pretty nice of SurrealDB to tell us. At this point we already had an existing database with persistent storage and a user named root, so we didn’t need to specify --user root --pass root. Good to know!

And that means that our person record should be inside as well. Let’s give it a try.

SELECT * FROM person;
Response
[ { id: person:aeon } ]

It worked!

Now that we know the basics of storage, let’s get back to some queries. No chapter in the rest of the book will assume that you are using persistent storage, so feel free to go back to the Surrealist sandbox or SurrealDB in memory if you prefer.

The AS keyword

The AS keyword lets you give an alias (an alternate name) to a field. This keyword is used a lot in SurrealDB, for a good reason.

First let’s look at what life would be like without AS. The town of The Naimo (close to Toria, where Aeon lives) currently has a population of 7490. Let’s insert it:

CREATE town:the_naimo SET name = "The Naimo", population = 7490, data = { location: "Northwest of Toria", geography: "Coastal town" };
Response
[ { data: { geography: 'Coastal town', location: 'Northwest of Toria' }, id: town:the_naimo, name: 'The Naimo', population: 7490 } ]

Easy enough!

Now let’s add some weather data to a query and return that, along with some other random fields to see what they look like without an alias. The weather is cloudy with a temperature of 10.5 degrees. We’ll also take the town’s name and add “Weather for ” in front to make “Weather for The Naimo”.

SELECT population, "Weather for " + name, 10.5, "Cloudy" FROM town:the_naimo;

But the output is pretty ugly!

Response
[ { "'Weather for ' + name": 'Weather for The Naimo', "10.5f": 10.5f, Cloudy: 'Cloudy', population: 7490 } ]

That’s because SurrealDB will create the name of a field on its own from the operation used to make it. And since “Weather for The Naimo” was created by adding one string to another, the field name is just as long as its value! Even 10.5 has a somewhat ugly field name that adds an f to mark it as a float.

Fixing this is easy: just append AS and choose a name for each field.

SELECT population, "Weather for " + name AS name, 10.5 AS temperature, "Cloudy" AS conditions FROM town:the_naimo;

Much nicer!

Response
[ { conditions: 'Cloudy', name: 'Weather for The Naimo', population: 7490, temperature: 10.5f } ]

One thing to note when using an alias: make sure that your alias doesn’t overwrite another field that you would like to display.

SELECT *, "Name: " + name + ". Population: " + <string>population AS data FROM town:the_naimo;

The query works, but since we’ve outright told SurrealDB how it should output the field data in this query, it has no way to show the original data object that we inserted into the town:the_naimo record. It’s not gone, it just won’t show up in this query.

Response
[ { data: 'Name: The Naimo. Population: 7490', id: town:the_naimo, name: 'The Naimo', population: 7490 } ]

The easy solution is to choose a different name. But we could also give the original data field its own alias too. Let’s give that a try.

SELECT *, "Name: " + name + ". Population: " + <string>population AS data, data AS other_data FROM town:the_naimo;
Response
[ { data: 'Name: The Naimo. Population: 7490', id: town:the_naimo, name: 'The Naimo', other_data: { geography: 'Coastal town', location: 'Northwest of Toria' }, population: 7490 } ]

Updating records

Updating records is pretty easy if you already know how to use CREATE, because the syntax is almost the same. You can update a single record by specifying its record ID, or all the records of a table by specifying just the table name.

CREATE town:sukh SET name = "Sukh"; // Whoops, forgot to add the population... UPDATE town:sukh SET population = 2955; // This would set the population of both Sukh // and The Naimo to 2955. Probably don't want to do that UPDATE town SET population = 2955;

An UPDATE statement will show the records after they have been updated.

Response
-------- Query -------- [ { id: town:sukh, name: 'Sukh' } ] -------- Query -------- [ { id: town:sukh, name: 'Sukh', population: 2955 } ]

The largest major difference between the syntax of CREATE and UPDATE is that UPDATE uses a WHERE clause a lot, because:

  • You usually don’t want to update each and every record of a table.
  • Much of the time you won’t know the ID of the records you want to update, or the records of a table have IDs that are too complex to manually type.

So if we had created Sukh in this way…

CREATE town SET name = "Sukh";
Response
[ { id: town:9aljqiup3oj78ut5aum4, name: 'Sukh' } ]

…we would definitely want to update it with a WHERE name = "Sukh" instead of typing something like UPDATE town:9aljqiup3oj78ut5aum4.

CREATE town SET name = "Sukh";
UPDATE town SET population = 2955 WHERE name = "Sukh";
Response
[ { id: town:49nsodnvgfwmzbwf3grz, name: 'Sukh' } ]

UPSERT to either create or update records

An UPDATE statement on a record ID won’t create a record if it did not exist. Let’s give this a try with an airplane, which no longer exist in Aeon’s time and will certainly not be in the database at the moment.

UPDATE airplane SET name = "Aeon's plane";
UPDATE airplane:aeons_plane SET name = "Aeon's plane";

Each of these queries returns nothing, as expected. It would be nice though if you could treat this airplane:aeons_plane record ID as a pointer to data, to which you update if the record exists, and create if it does not.

This is what the UPSERT keyword is for! The name is a combination of UPDATE and INSERT, namely to UPDATE if a record exists, and to INSERT if it does not. SurrealDB uses the INSERT keyword as well, by the way, and we will encounter it for the first time in Chapter 7.

Let’s give the former two queries a try again with the new UPSERT keyword.

UPSERT airplane:aeons_plane SET name = "Aeon's plane";
UPSERT airplane:aeons_plane SET seats = 20;

Now an airplane record is guaranteed to exist in either case.

Response
-------- Query -------- [ { id: airplane:aeons_plane, name: "Aeon's plane" } ] -------- Query -------- [ { id: airplane:aeons_plane, name: "Aeon's plane", seats: 20 } ]

Perhaps Aeon will actually have an airplane like this one day.

Complex record IDs

Record IDs in SurrealDB can be quite complex. Besides numbers, strings, and randomly generated IDs, they can also be arrays and even objects!

Take the following query for example, which returns a town record with a random id and two fields.

CREATE town SET name = "Sukh";
Response
[ { id: town:5rhoivu4iks9dx7npv06, name: 'Sukh' } ]

What if you wanted the same information as an object inside the id instead? You can do this by putting the object after the : in the same way that we have done with all other record IDs:

CREATE town:{ name: 'Sukh' };
Response
[ { id: town:{ name: 'Sukh' } } ]

Note the difference between these two records. They contain the same information, but the latter object holds the information in the record ID itself. So a query for the fields name and id will show all two for the first record, but only id for the second.

But any fields inside a complex ID are accessible by using the . dot operator, as this example shows.

CREATE town:{ name: 'Sukh' };
SELECT id.name FROM town;
Response
[ { id: { name: 'Sukh' } } ]

One common reason to use complex record IDs is that SurrealDB’s range operator works on record IDs. This operator is written using .. for an exclusive range (up to but not including the end of the range), and ..= for an inclusive range (up to and including the end of the range).

Here is a quick example of this in practice. The following example adds two more towns with complex record IDs, and then uses a few SELECT statements with all of the possible ways to use the range operator. Note that a range query even works without telling SurrealDB the end of the range, in which case it will be an open-ended range query.

CREATE town:{ name: 'Toria' }, town:{ name: 'Black Bay' }; // Everything starting from 'S' SELECT * FROM town:{ name: 'S' }..; // Everything up to but not including 'Toria' SELECT * FROM town:..{ name: 'Toria' }; // Everything up to and including 'Toria' SELECT * FROM town:..={ name: 'Toria' }; // From 'Black Bay' and up to but not including 'Toria' SELECT * FROM town:{ name: 'Black Bay' }..{ name: 'Toria' }; // From 'Black Bay' and up to and including 'Toria' SELECT * FROM town:{ name: 'Black Bay' }..={ name: 'Toria' };
Response
-------- Query -------- [ { id: town:{ name: 'Toria' } } ] -------- Query -------- [ { id: town:{ name: 'Black Bay' } } ] -------- Query -------- [ { id: town:{ name: 'Black Bay' } }, { id: town:{ name: 'Toria' } } ] -------- Query -------- [ { id: town:{ name: 'Black Bay' } } ] -------- Query -------- [ { id: town:{ name: 'Black Bay' } }, { id: town:{ name: 'Toria' } } ]

How record ranges work internally is a bit of a long discussion that we will get to in Chapter 10.

SurrealDB’s range syntax also works to construct a range of integers, which can be quite convenient. Here is a quick example of a range used to check a value.

// Two bool checks RETURN 5 > 0 AND 5 < 10; // Or just use a range RETURN 5 IN 0..10;

Both queries return true.

A range can also be a nice way to quickly construct an array. To do so, just use <array> to perform the cast.

-- An array from 0 to 3, not including 4 RETURN <array>0..4; -- Get the item at index 30 from this array RETURN (<array>50..1000)[30];
Response
-------- Query -------- [ 0, 1, 2, 3 ] -------- Query -------- 80

The NULL and NONE types

In the previous section we created one town with a random ID, as well as three others with complex record IDs. What happens if we try to select the name field of a record that doesn’t have these fields? Will SurrealDB just return an error?

It will not. Instead, the records that don’t have a name field will show the output NONE in its place.

CREATE town SET name = 'Sukh'; CREATE town:{ name: 'Black Bay' }, town:{ name: 'Sukh' }, town:{ name: 'Toria' }; SELECT name, id FROM town;
Response
[ { id: town:vp8ua3oxu5kq9ayckwjy, name: 'Sukh' }, { id: town:{ name: 'Black Bay' }, name: NONE }, { id: town:{ name: 'Sukh' }, name: NONE }, { id: town:{ name: 'Toria' }, name: NONE } ]

If you have used SQL before, you might be wondering why the output is NONE and not NULL. The answer is that SurrealDB does indeed have the type NULL to represent data that doesn’t exist, but the two types represent a lack of data in a different way.

The behaviour of NULL is also different from NULL in SQL. For example, a NULL in SQL is not equal to itself, but is in SurrealQL:

RETURN null = null;
Response
true

Rows containing NULL in SQL cannot be compared to each other, but can be in SurrealQL:

RETURN [null, 1] = [null, 1];
Response
true

And some databases return NULL when dividing by zero in order to represent an error, which is not the case in SurrealDB.

RETURN 1000 / 0;
Response
NaN

So what is the exact definition of null in SurrealDB?

  • NULL is a field that exists (a field that is “set”), but has no value.
  • NONE is for a field that does not exist at all. In other words, setting a field to NONE removes the field from the record.

This difference is pretty easy to show. Let’s start by creating another town close to Toria called The Hill:

CREATE town:the_hill SET name = "The Hill";
Response
[ { id: town:the_hill, name: 'The Hill' } ]

And then give it a population:

CREATE town:the_hill SET name = "The Hill";
UPDATE town:the_hill SET population = 1125;
Response
[ { id: town:the_hill, name: 'The Hill', population: 1125 } ]

So far so good! Now let’s see what happens when we set this new field to null.

CREATE town:the_hill SET name = "The Hill"; UPDATE town:the_hill SET population = 1125; UPDATE town:the_hill SET population = null;
Response
[ { id: town:the_hill, name: 'The Hill', population: NULL } ]

Did you notice that the population field is still showing up? That’s because it was set to NULL, so it is still set, but has no value.

We can completely remove it by setting it to NONE:

CREATE town:the_hill SET name = "The Hill"; UPDATE town:the_hill SET population = 1125; UPDATE town:the_hill SET population = NONE;
Response
[ { id: town:the_hill, name: 'The Hill' } ]

And if we do a query to show all of the fields for our single town record, we can see that the town field does not show up, because it doesn’t exist anymore.

CREATE town:the_hill SET name = "The Hill"; UPDATE town:the_hill SET population = 1125; UPDATE town:the_hill SET population = NONE; SELECT * FROM town:the_hill;
Response
[ { id: town:the_hill, name: 'The Hill' } ]

SurrealDB also has an UNSET keyword that is the same as setting a field to NONE. This keyword is not too frequently used but can be convenient when you need to remove a lot of fields at the same time.

CREATE town:the_hill SET name = "The Hill", population = 1125;
UPDATE town UNSET name, population;
Response
[ { id: town:the_hill } ]

We learned a lot in this chapter! It’s almost time for us to learn the DELETE keyword, but Aeon is still too afraid to try it out. Looks like we will have to wait until the next chapter to do so.

Practice time
1. The output of the following SELECT statement is a bit ugly. How can you improve it?
SELECT name, surname, name + ' ' + surname FROM { name: "Bob", surname: "Bobson" };
Response
[ { name: 'Bob', "name + ' ' + surname": 'Bob Bobson', surname: 'Bobson' } ]
Answer

As the name + '' + surname part is clearly constructing a full name, we can add AS full_name to give it an alias.

SELECT name, surname, name + ' ' + surname AS full_name FROM { name: "Bob", surname: "Bobson" };
Response
[ { full_name: 'Bob Bobson', name: 'Bob', surname: 'Bobson' } ]

Just be sure not to give it an alias like name, as that will cause the original name field to not show up.

SELECT name, surname, name + ' ' + surname AS name FROM { name: "Bob", surname: "Bobson" };
Response
[ { name: 'Bob Bobson', surname: 'Bobson' } ]

2. What keyword would you use if you want to CREATE a record that you think might already exist?
Answer

You can use an UPSERT statement in this case.

CREATE person:aeon SET name = "Aeon", has_airplane = false;
UPSERT person:aeon SET name = "Aeon", money = 10;

Note that UPSERT doesn’t give an option to modify the already existing record (the has_airplane field for example), but there is another keyword coming up in Chapter 7 that does.


3. How would you find every one of these towns with a population between 2000 and 9000?
CREATE town SET name = "Toria", population = 20685; CREATE town SET name = "Sukh", population = 2955; CREATE town SET name = "Black Bay", population = 4137;
Answer

A WHERE clause will do the trick. You can use the keyword AND to add a second condition.

SELECT * FROM town WHERE population > 2000 AND population < 9000;

Or you can create a range to do the same check.

SELECT * FROM town WHERE population IN 2000..9000;

4. What if you wanted to have the name and population properties of those towns as part of the record ID itself? How would you create them?
Answer

You can put them inside an object that you can use : to join to the table name.

CREATE town:{ name: "Toria", population: 20685 }; CREATE town:{ name: "Sukh", population: 2955 }; CREATE town:{ name: "Black Bay", population: 4137 };

5. How would you then return all of these towns if their name is greater than the value “M”?
Answer

You could use a WHERE clause on id.name, or add .. and an object with the name “M” for an open range record range query.

SELECT * FROM town WHERE id.name > "M";
SELECT * FROM town:{ name: "M" }..;

Note that the range operator syntax is more efficient than WHEN, which uses a whole table scan.