Modelling time series data in SurrealDB
Let’s explore a practical example using IoT sensor data.
Modelling events
The normal way of modelling data in SurrealDB would be as fields in a record, such as coordinates or location.
Complex record IDs
A more optimised way of working with the sensor_readings table in a time series context would however be using an array-based record ID, otherwise known as a complex record ID.
Storing the same information as a record or as part of the ID might look similar at first glance, but under the hood, it’s optimised for efficient range selection, through the magic of range queries on record IDs. As all values in SurrealDB can be compared and sorted, putting the most crucial information into the record ID itself allows you to query only a range of record IDs instead of a whole table.
This effectively means fewer worries about selecting the right indexes or partitions since the ID field already does that naturally in your data as you scale with the performance of a key-value lookup regardless of size!
This is however not the only way of doing it, you can have the metadata in the ID and sensor data in the record itself, like in the example below. Now the record ID only contains the fields useful in a range query, while the remaining fields are kept in the record itself.
Ids inside ids
The last thing to note here is that we’ve actually been using record IDs inside our complex record IDs! This is known as a record link, which allows us to reduce the fields in our main time series table to only the necessary ones by offloading most of the metadata to connected tables.
In our case we have:
location:Longleat_House, which refers to theLongleat_HouseID in thelocationtable. There we put all the metadata about the location itself such as geo coordinates.sensor:ARF8394AAA, which refers to theARF8394AAAID on thesensortable. There we could put all the metadata about the sensor such as location, firmware, when it was bought and when it needs maintenance.
It’s very easy and performant to get the connected data, since you don’t have to do any table scans for that either since it links directly to a specific record on a specific table! In the example below, with time for the sensor readings set exactly to the hour (2024-08-13T05:00:00), the record ID is easy to predict.
For aggregations, materialised views, and retention-style patterns built on top of these events, see Aggregation queries.