SurrealDB University is LIVE! Master the future of data

Gradient
SurrealDB University

Authentication Required

This course requires authentication.
Please sign in to continue

Background Gradient
Next
Back to Courses

Reading data

Now that we’ve created some data, it’s time to explore it.

In this lesson, we’ll cover how to use

  • The SELECT statement for both simple and advanced read operations, and how that compares to the RETURN statement.
  • The LET statement for creating and using parameters in our statements.
  • The LIVE SELECT statement for reading data in real-time as changes are made to the underlying table.

Starting with the SQL select basics

SELECT * FROM product; SELECT name FROM product; SELECT name AS product_name FROM product;

Starting with the foundations of a normal SQL SELECT statement:

  • We can select everything from the product table with SELECT * FROM product
  • We can also just select specific fields by naming them, such as SELECT name FROM product
  • Finally, we can use AS to rename fields, such as changing name to product_name

You’ll also find familiar SQL clauses such as WHERE, GROUP BY, ORDER BY and LIMIT.

Where

SELECT * FROM product
WHERE "Black Pink" IN colours;

We use the WHERE clause to filter for specific records, such as products which have the black pink colour.

Group by

SELECT product_name, count() AS number_of_orders, math::sum(price * quantity) AS sum_sales FROM order GROUP BY product_name;

GROUP BY is usually used alongside aggregate functions to aggregate data. Such as in this example where we are finding the number of orders and sales amount for each product.

SELECT count() FROM order;

It’s important to be a aware of that when counting the number of records in a table, most relational databases don’t require you to use a GROUP BY since the data will never be nested. Therefore the query planner can just assume you meant to use the entire table as one group.

SELECT count() FROM order
GROUP ALL;

In SurrealQL however, there is a lot more flexibility to model nested data. Therefore, we can’t make the same assumptions as a relational database. This means functions like count() always go record by record unless you specify GROUP ALL which will use the entire table as one group.

Order by

SELECT product_name, count() AS number_of_orders FROM order GROUP BY product_name ORDER BY number_of_orders DESC;

ORDER BY is used to sort the records, by default it sorts data in ascending order, but you can also specify DESC for descending order.

Limit

SELECT product_name, count() AS number_of_orders FROM order GROUP BY product_name ORDER BY number_of_orders DESC LIMIT 10;

Finally, the LIMIT clause is used to limit the number of records we get back from our query.

Moving onto more advanced features

Now that we’ve covered the foundations that most SQL-dialects have, let’s look into things specific to SurrealQL.

Selecting a single record or a range of records

SELECT name, email FROM person WHERE id = person:01FTP9H7BG8VDANQPN8J3Y857R; SELECT name, email FROM person WHERE id >= person:01FTP9H7BG8VDANQPN8J3Y857R AND id < person:01HG9EFC0R8DA8F87VNYP0CD8A;

In most SQL dialects, you’d need to use the WHERE clause to filter by IDs or fields such as time.

This is however anti-pattern in SurrealQL since it will do a table scan.

SELECT name, email FROM person:01FTP9H7BG8VDANQPN8J3Y857R; SELECT name, email FROM person:01FTP9H7BG8VDANQPN8J3Y857R..01HG9EFC0R8DA8F87VNYP0CD8A;

The best practice for selecting a single record or a range of records is using record IDs. This is because when using record IDs, we directly fetch the records from the key-value store, without doing a table scan. Making it orders of magnitude more performant than using the WHERE clause.

-- Using omit SELECT * OMIT time FROM person:01FTP9H7BG8VDANQPN8J3Y857R; -- Not using omit SELECT id, first_name, last_name, name, email, phone, address, address_history, payment_details FROM person:01FTP9H7BG8VDANQPN8J3Y857R;

SurrealQL also makes it easy select what you need from a record, such as in cases where you want to select everything, but omit certain fields from a record.

Working with objects and arrays

The SELECT statement in SurrealQL is extremely flexible, with many advanced features. You can find a more comprehensive list of these features in our documentation, but for now, let’s look at a few examples of how to work with objects and arrays.

-- Select the first colour in the colours array SELECT colours[0] FROM product:01FSXKCPVR8G1TVYFT4JFJS5WB; -- Select updated_at from the time object SELECT time.updated_at FROM product:01FSXKCPVR8G1TVYFT4JFJS5WB; -- Select the entire array of objects SELECT images FROM product:01FSXKCPVR8G1TVYFT4JFJS5WB; -- Select all the URLs in the images array of objects SELECT images.*.url FROM product:01FSXKCPVR8G1TVYFT4JFJS5WB; -- Select the first URL in the images array of objects SELECT images[0].url FROM product:01FSXKCPVR8G1TVYFT4JFJS5WB;

For selecting and traversing objects and arrays and arrays of objects, we can use the dot and bracket notation.

  • We can select the first colour in the colours array using colours[0]
  • We select updated_at from the time object using time.updated_at
  • For selecting the first URL in the images array of objects, we use a combination of both, images[0].url

object and array functions

-- Returns the unique items in an array SELECT array::distinct(sub_category) AS unique_sub_cat FROM product GROUP ALL; -- Flattens and returns the unique items in an array SELECT array::group(details) AS unique_details FROM product GROUP ALL;

Object and array functions are useful for many things such as deduplicating results similar to SELECT DISTINCT in most SQL-dialects. There are two ways of doing this in SurrealQL:

  • Using array::distinct, if the fields are not nested.
  • Using array::group, if you’re selecting from nested arrays and need to flatten them.
-- Select the product name and the number of colours it has SELECT name, array::len(colours) AS number_of_colours FROM product; -- Select the person name and the number of address fields SELECT name, object::len(address) AS number_of_address_fields FROM person;

Importantly, there is way to count things without using aggregate functions, if what you’re counting is either an array or object. As then we can use array::len or object::len and get both aggregated data and non-aggregated data in one query.

You can find many more useful functions in our documentation.

Tempfiles

SELECT count() AS number_of_orders, time::format(time.created_at, "%Y-%m") AS month, math::sum(price * quantity) AS sum_sales, currency FROM order GROUP BY month, currency ORDER BY month DESC TEMPFILES;

There might be times where you want to run large analytical that have the potential to cause an out-of-memory error (OOM). That is where the TEMPFILES clause comes in, allowing you to process the query in temporary files on disk rather than in memory. This significantly reduces memory usage, though it’s likely to also result in slower performance.

Using LET parameters and subqueries

-- Find the name of the product where the price is higher than the avg price SELECT name from product WHERE [price] > ( SELECT math::mean(price) AS avg_price FROM product GROUP ALL ).avg_price;

Subqueries function in similar ways as you’d expect from SQL-dialects, such as using them in the SELECT, FROM or WHERE clauses. We’ll cover them in more detail in part 2 on relational style joins.

-- Using the let statement to store the query result LET $avg_price = ( SELECT math::mean(price) AS avg_price FROM product GROUP ALL ).avg_price; -- Find the name of the product where the price is higher than the avg price SELECT name from product WHERE [price] > $avg_price;

An alternative to using subqueries is often through using common table expressions (CTEs).

SurrealQL does not use typical CTEs, but we can use the LET statement to cover those use cases.

LET $field_name = "name"; LET $table = "product"; LET $id = "01FSXKCPVR8G1TVYFT4JFJS5WB"; RETURN $table; SELECT type::field($field_name) FROM type::thing($table, $id)

We can also use the LET statement to parameterise our queries, either directly as static values, or dynamic values in combination with the type functions.

The LET statement can do a lot more than what we’ve covered here, as you can use it in almost every statement in SurrealQL and you can see more examples of that in our documentation.

How SELECT compares to RETURN

-- Return a number RETURN 1337; SELECT * FROM ONLY 1337;

Aside from the SELECT statement, you can also use the RETURN statement for reading data.

With it, you can RETURN any value, from numbers and strings to entire code blocks and query results.

In many ways, it’s similar to the SELECT statement but can offer a more ergonomic way of returning values. Such as in our example where we can return the value directly instead of having to use SELECT * FROM ONLY to return ONLY the value.

-- Return a select statement
RETURN (SELECT name FROM person);

When it comes to records, you can RETURN the result of any SELECT statement.

-- Return a record RETURN person:01FTP9H7BG8VDANQPN8J3Y857R.*; SELECT * FROM ONLY person:01FTP9H7BG8VDANQPN8J3Y857R; -- Return a field value inside the record RETURN person:01FTP9H7BG8VDANQPN8J3Y857R.name; SELECT VALUE name FROM ONLY person:01FTP9H7BG8VDANQPN8J3Y857R;

However, on its own, RETURN only allows for selecting a single record or just a single value in that record. This can also be accomplished with a combination of FROM ONLY and SELECT VALUE using the SELECT statement. I’d encourage you to experiment with the queries, removing the ONLY and or VALUE clause to see how the result changes.

How they are different

SELECT product_name, count() AS number_of_orders FROM order GROUP BY product_name ORDER BY number_of_orders DESC LIMIT 10;

The SELECT statement, as we’ve seen, has many ways of working data that is not just about returning values like the RETURN statement.

-- Start transaction BEGIN; -- Setup accounts CREATE account:one SET balance = 135605.16; CREATE account:two SET balance = 91031.31; -- Move money UPDATE account:one SET balance += 300.00; UPDATE account:two SET balance -= 300.00; -- Return new account balances RETURN { account_one_balance: account:one.balance, account_two_balance: account:two.balance }; COMMIT; -- Finish transaction

Unlike the SELECT statement, the RETURN statement has a special purpose in code blocks, functions and transactions for returning the final results.

You can see more examples of how to use the RETURN statement in code blocks, functions and transactions in our documentation.

Going real-time with LIVE SELECT

LIVE SELECT allows us to unlock streaming data magic, through what we call Live Queries.

Live Queries read data in real-time, as changes are made to the underlying table.

-- Start a Live Query LIVE SELECT * from product; -- Stop a Live Query by specifying its uuid KILL u"57f4964c-006f-463b-a965-19a3cec330b9"; -- Start a live query using JSON patch format LIVE SELECT DIFF from product;

When you run the LIVE SELECT it will return a UUID, which is the Live Query Unique ID. The UUID is used to keep track of the various Live Queries you have running and to stop them using the KILL statement.

Once you made changed to the table the Live Query is listening to, in our case the product table, you can see the live results updating.

You can either receive the changes in our normal format or use the JSON PATCH format by specifying LIVE SELECT DIFF FROM product.

You can see more details about Live Queries in our documentation.

Summary

The last query we’re running is SELECT summary FROM lesson to summarise what we’ve learned

  • The SELECT statement

    • Starts with the foundation of SQL, using familiar clauses such as WHERE, GROUP BY, ORDER BY and LIMIT.
    • Then has more advanced features, such as selecting directly from a record, or a range of records without doing a table scan. As well as working with objects and arrays.
    • It uses the TEMPFILES clause for processing the query in temporary files on disk rather than in memory. Typically used for very large queries that might otherwise give an out-of-memory error (OOM).
    • It uses the LET statement to create parameters, which also covers the use cases for common table expressions (CTEs).
  • The RETURN statement

    • Can RETURN any value, from strings and numbers to entire code blocks and query results.
    • It can be used instead of the SELECT statement for some use cases, such as a more ergonomic way of returning values, replacing the SELECT VALUE and/or FROM ONLY from the SELECT statement.
    • Unlike the SELECT statement, the RETURN statement has a special purpose in code blocks, functions and transactions for returning the final results.
  • The LIVE SELECT statement

    • Reads data in real-time, as changes are made to the underlying table.
    • When you run the LIVE SELECT it will return a UUID, which is the Live Query Unique ID. The UUID is used to keep track of the various Live Queries you have running and to stop them using the KILL statement.
    • You can either receive the changes in our normal format or use the JSON PATCH format by specifying LIVE SELECT DIFF.

Now we can SELECT end FROM lesson and I’ll see you in the next one.