RELATE
statementThe RELATE
statement can be used to generate graph edges between two records in the database. This allows you to traverse related records efficiently without needing to pull data from multiple tables and merging that data together using SQL JOINs.
Edges created using the RELATE statement are nearly identical to tables created using other statements, and can contain data. The key differences are that:
in
and out
, which specify the directions of the relationships. These cannot be modified in schema declarations except to specify that they must be of a certain record type or to add assertions.Otherwise, edge tables behave like normal tables in terms of updating, defining a schema or indexes.
Another option for connecting data is using record links. Record links consist of a field with record IDs that serve as uni-directional links. The key differences are that graph relations have the following benefits over record links:
SurrealQL SyntaxRELATE [ ONLY ] @from_record -> @table -> @to_record [ CONTENT @value | SET @field = @value ... ] [ RETURN NONE | RETURN BEFORE | RETURN AFTER | RETURN DIFF | RETURN @statement_param, ... | RETURN VALUE @statement_param ] [ TIMEOUT @duration ] [ PARALLEL ] ;
Note
RELATE
will create a relation regardless of whether the records to relate to exist or not. As such, it is advisable to create the records you want to relate to before usingRELATE
, or to at least ensure that they exist before making a query on the relation. If the records to relate to don’t exist, a query on the relation will still work but will return an empty array. To override this behaviour and return an error if no records exist to relate, you can use aDEFINE TABLE
statement that includes theENFORCED
keyword.
The following query shows the basic structure of the RELATE
statement, which creates a relationship between a record in the person
table and a record in the article
table.
CREATE person:aristotle, article:on_sleep_and_sleeplessness; RELATE person:aristotle->wrote->article:on_sleep_and_sleeplessness;
Response[ { id: wrote:bpbrj5kd7smu3ahlf55r, in: person:aristotle, out: article:on_sleep_and_sleeplessness } ]
There is no relationship information stored in either the person
or article
table.
SELECT * FROM person, article;
Response[ { id: person:aristotle }, { id: article:on_sleep_and_sleeplessness } ]
Instead, an edge table (in this case a table called wrote
) stores the relationship information.
SELECT * FROM wrote;
The structure in -> id -> out
mirrors the record IDs from the RELATE
statement, with the addition of the automatically generated ID for the wrote
edge table.
Response[ { id: wrote:bpbrj5kd7smu3ahlf55r, in: person:aristotle, out: article:on_sleep_and_sleeplessness } ]
The same structure can be used in a SELECT
query, as well as directly from a record ID.
-- Aristotle's id and the articles he wrote SELECT id, ->wrote->article FROM person:aristotle; -- Every `person`'s id and written articles -- Same output as above as the database has a single `person` record SELECT id, ->wrote->article FROM person; -- Directly follow the path from Aristotle to his written articles RETURN person:aristotle->wrote->article;
Response-------- Query -------- [ { "->wrote": { "->article": [ article:on_sleep_and_sleeplessness ] }, id: person:aristotle } ] -------- Query -------- [ article:on_sleep_and_sleeplessness ]
By default, the edge table gets created as a schemaless table when you execute the RELATE
statement. You can make the table schemafull by defining a schema.
A common use case is to make sure only unique relationships get created. You can do that by defining an index.
DEFINE INDEX unique_relationships ON TABLE wrote COLUMNS in, out UNIQUE;
As edge tables are bi-directional by default, there is nothing stopping a query like the following in which an article writes a person instead of the other way around.
RELATE article:on_sleep_and_sleeplessness->wrote->person:aristotle;
To enforce unidirectional relationships, you can restrict the type definition using a DEFINE FIELD
definition.
DEFINE FIELD in ON TABLE wrote TYPE record<person>; DEFINE FIELD out ON TABLE wrote TYPE record<article>;
A RELATE
statement that involves an array instead of a single id will not fail. Instead, it will create a graph edge for each record in the array.
INSERT INTO cat (id) VALUES ("mr_meow"), ("mrs_meow"), ("kitten"); RELATE [cat:mr_meow, cat:mrs_meow]->parent_of->cat:kitten;
Response[ { id: parent_of:uahudi4qr68k640fcjbg, in: cat:mr_meow, out: cat:kitten }, { id: parent_of:hi79yfazjppv8b3kyi36, in: cat:mrs_meow, out: cat:kitten } ]
Similarly, a RELATE
statement that involves two arrays will return a number of graph edges equal to their product (2 * 2 in this case):
CREATE cat:kitten2; RELATE [cat:mr_meow, cat:mrs_meow]->parent_of->[cat:kitten, cat:kitten2];
Response[ { id: parent_of:ysbab20nv5568ogba6ns, in: cat:mr_meow, out: cat:kitten }, { id: parent_of:0ltm6xr94pkblyxf0m6c, in: cat:mr_meow, out: cat:kitten2 }, { id: parent_of:71cfl0nvj5frve0r1npv, in: cat:mrs_meow, out: cat:kitten }, { id: parent_of:4gbid7nzo6cwr1t8k090, in: cat:mrs_meow, out: cat:kitten2 } ]
SET
and CONTENT
Graph edges are standalone tables that can hold other fields besides the default in
, out
, and id
. These can be added during a RELATE
statement or during an UPDATE
in the same manner as any other SurrealDB table.
Let’s look at the two ways you can add record data in the RELATE
statement. Both of these queries will produce the same result.
RELATE person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm SET metadata.time_written = time::now(), metadata.location = "Tallinn"; RELATE person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm CONTENT { metadata: { time_written: time::now(), location: "Tallinn" } };
Response[ { id: wrote:rva8hentypdu8lcgwjmf, in: person:l19zjikkw1p1h9o6ixrg, metadata: { location: 'Tallinn', time_written: d'2024-11-26T01:52:01.169Z' }, out: article:8nkk6uj4yprt49z7y3zm } ]
Here is an example of the graph edge being updated in the same way as any other SurrealDB record:
-- Add a small synopsis composed of the table name and article ID UPDATE wrote SET metadata.description = meta::tb(out) + ' written by ' + <string>in;
Response[ { id: wrote:k9d8ynbfxgb8jqjv2ob5, in: person:l19zjikkw1p1h9o6ixrg, metadata: { description: 'article written by person:l19zjikkw1p1h9o6ixrg', location: 'Tallinn', time_written: d'2024-11-26T01:53:51.350Z' }, out: article:8nkk6uj4yprt49z7y3zm } ]
CONTENT
and SET
Available since: v1.5.0
You can also pass variables in the CONTENT
block. This is useful when you want to pass a variable that is not a record ID.
LET $time = time::now(); RELATE person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm CONTENT { time: { written: $time } };
Response{ "id": "wrote:ctwsll49k37a7rmqz9rr", "in": "person:l19zjikkw1p1h9o6ixrg", "out": "article:8nkk6uj4yprt49z7y3zm", "time": { "written": "2021-09-29T14:00:00Z" } }
Below is an example of how you can pass a variable in the SET
block:
LET $time = time::now(); RELATE person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm SET time.written = $time;
Response{ "id": "wrote:ctwsll49k37a7rmqz9rr", "in": "person:l19zjikkw1p1h9o6ixrg", "out": "article:8nkk6uj4yprt49z7y3zm", "time": { "written": "2021-09-29T14:00:00Z" } }
ONLY
keywordUsing the ONLY keyword, just an object for the relation in question will be returned. This, instead of an array with a single object.
RELATE ONLY person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm;
Response{ id: wrote:k9f1rqn3oikolr1560u3, in: person:l19zjikkw1p1h9o6ixrg, out: article:8nkk6uj4yprt49z7y3zm }
LET
parameters in RELATE statementsYou can also use parameters to specify the record IDs.
-- These two statements store the result of the subquery in a parameter -- The subquery returns an array of IDs LET $person = (SELECT VALUE id FROM person); LET $article = (SELECT VALUE id FROM article); -- This statement creates a relationship record for every combination of Record IDs -- Such that if we have 10 records each in the person and article table -- We get 100 records in the wrote edge table (10*10 = 100) -- In this case it would mean that each article would have 10 authors RELATE $person->wrote->$article SET time.written = time::now();
RETURN
clauseBy default, the relate statement returns the record value once the changes have been made. To change the return value of each record, specify a RETURN clause, specifying either NONE
, BEFORE
, AFTER
, DIFF
, or a comma-separated list of specific fields to return.
-- Don't return any result RELATE person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm SET time.written = time::now() RETURN NONE; -- Return the changeset diff RELATE person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm SET time.written = time::now() RETURN DIFF; -- Return the record before changes were applied RELATE person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm SET time.written = time::now() RETURN BEFORE; -- Return the record after changes were applied (the default) RELATE person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm SET time.written = time::now() RETURN AFTER; -- Return a specific field only from the updated records RELATE person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm SET time.written = time::now() RETURN time; -- Return only the value of a specific field without the field name RELATE person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm SET time.written = time::now() RETURN VALUE time;
TIMEOUT
clauseAdding the TIMEOUT
keyword to specify a timeout duration for the statement can be useful when processing a large result set with many interconnected records. If the statement continues beyond this duration, then the transaction will fail, and the statement will return an error.
-- Cancel this conditional filtering based on graph edge properties -- if not finished within 5 seconds SELECT * FROM person WHERE ->knows->person->(knows WHERE influencer = true) TIMEOUT 5s;
Using a TIMEOUT
is particularly useful when experimenting with complex queries with an extent that is difficult to imagine, especially if the query is recursive.
PARALLEL
clauseWhen processing a large result set with many interconnected records, it may be desirable to use the PARALLEL
keyword to signify that edges and remote records should be fetched and processed concurrently, rather than sequentially. This can be useful when you want to process a large result set in a short amount of time.
-- Fetch and process the person, purchased and product targets in parallel -- Select every product that was purchased by a person that purchased a product that person tobie also purchased SELECT ->purchased->product<-purchased<-person->purchased->product FROM person:tobie PARALLEL;
You can also delete graph edges between two records in the database by using the DELETE statement.
For example the graph edge below:
RELATE person:tobie->bought->product:iphone;
Response[ { id: bought:ctwsll49k37a7rmqz9rr, in: person:tobie, out: product:iphone } ]
Can be deleted by:
DELETE person:tobie->bought WHERE out=product:iphone RETURN BEFORE;
As mentioned above, a graph edge will also automatically be deleted if it is no longer connected to a record at both in
and out
.
-- Create three people CREATE person:one, person:two, person:three; -- And a love triangle involving them all RELATE person:one ->likes->person:two; RELATE person:two ->likes->person:three; RELATE person:three->likes->person:one; -- Person two moves to Venus permanently, so delete DELETE person:two; -- Only one `likes` relationship is left SELECT * FROM likes;
Output[ { id: likes:55szjin5yfqwl4sbmy1f, in: person:three, out: person:one } ]
As mentioned at the top of the page, RELATE
can be used for records that do not yet exist. While this behaviour can be overridden by using the ENFORCED
keyword, it can be useful in certain situations.
For example, the VALUE
clause inside a DEFINE FIELD
statement is calculated every time a record is altered (that is, every time it is created or updated). If this value depends on a graph edge, creating the record first will cause VALUE
to calculate it based on a nonexistent path.
In the following example, a house
table has a field called has_road_access
that depends on whether any ->has_road
paths return an output that is not empty. Meanwhile, the city has a new road under construction but no houses are present and their details have not been set yet.
-- Returns true if $this->has_road path is not empty DEFINE FIELD has_road_access ON TABLE house VALUE !!$this->has_road->road; CREATE road SET name = "Dalhurst Way", length = 10.5;
As the addresses of the upcoming houses have been decided, the ->has_road
path can be set ahead of time by giving the house
records an ID based on their exact address.
LET $road = SELECT * FROM ONLY road WHERE name = "Dalhurst Way" LIMIT 1; RELATE [ house:[218, "Dalhurst Way"], house:[222, "Dalhurst Way"], house:[226, "Dalhurst Way"], ]->has_road->$road;
Later on, two new houses are completed in the city and registered in the database. As the path to house:[218, "Dalhurst Way"]
has already been set up, the has_road_access
field will evaluate to true
, while the other house in the middle of nowhere will evaluate to false
.
CREATE house:[218, "Dalhurst Way"] SET floors = 2, bedrooms = 5; CREATE house:[0, "Middle of nowhere"] SET floors = 4, bedrooms = 12;
-------- Query -------- [ { bedrooms: 5, floors: 2, has_road_access: true, id: house:[ 218, 'Dalhurst Way' ] } ] -------- Query -------- [ { bedrooms: 12, floors: 4, has_road_access: false, id: house:[ 0, 'Middle of nowhere' ] } ]
For the questions below, each of the queries will give you largely the same answer. Note that whether ->
and <-
are parsed as in
or out
depends on their direction in relation to the graph edge wrote
. An arrow pointing towards wrote
corresponds to in
, and vice versa.
The following examples show how to make similar queries in a number of different ways, in the context of a database with one person who wrote two articles.
CREATE person:aristotle, article:on_sleep_and_sleeplessness, article:on_dreams; RELATE person:aristotle->wrote->[ article:on_sleep_and_sleeplessness, article:on_dreams ] // Written sometime around the year 330 BC SET time_written = d"-0330-01-01";
Who wrote the articles?
-- All queries lead to `person:artistotle` twice, -- via different paths and thus different field names -- and/or structure -- Directly from the `wrote` table SELECT in FROM wrote; -- From a single `person` record SELECT ->wrote.in FROM person; SELECT ->wrote<-person FROM person; -- From two `article` records SELECT <-wrote.in FROM article; SELECT <-wrote<-person FROM article;
Which articles did the person write?
SELECT out FROM wrote; SELECT ->wrote.out FROM person; SELECT ->wrote->article FROM person; SELECT <-wrote.out FROM article; SELECT <-wrote->article FROM article;
When was the article written?
SELECT time_written FROM wrote; SELECT ->wrote.time_written as time_written FROM person; SELECT <-wrote.time_written as time_written FROM article;
For a more complicated query like the one below you can use a simple rule of thumb: Place the subject in front of the graph selection, then read it backward.
-- This query SELECT ->purchased->product<-purchased<-person->purchased->product FROM person:tobie -- Then becomes person:tobie->purchased->product<-purchased<-person->purchased->product SELECT
Reading this backwards then makes more sense:
Select every product that was purchased by a person who purchased a product that was also purchased by person Tobie.
Alternatively, you can break it down into steps over multiple lines.
-- Starting with Tobie person:tobie -- move on to his purchased products ->purchased->product -- that were also purchased by persons... <-purchased<-person -- what are all of those persons' purchased products? ->purchased->product
Putting it all together it would be: based on all the products Tobie purchased, which person also purchased those products and what did they purchase? This sort of query could be used on a social network site to recommend to the user person:tobie
a list of people that have similar interests.
Parentheses can be added at any step of a graph query to refine the logic, such as filtering relations based on specific conditions using the WHERE
clause.
For example, suppose we want to limit the query to only take recent purchases into account. We can filter purchased
graph edge to only include purchases made in last 3 weeks:
-- Select products purchased by people in the last 3 weeks who have purchased the same products that tobie purchased SELECT ->purchased->product <-purchased<-person->(purchased WHERE created_at > time::now() - 3w) ->purchased->product FROM person:tobie;
If the purchased
graph table can lead to both a product
or a subscription
, they can both be added to the query.
SELECT ->purchased->(product, subscription) <-purchased<-person ->purchased->(product, subscription) FROM person:tobie;
The ?
wildcard operator can also be used to search for any and all linked records. The following query will allow purchased product
, subscription
, insurance
, or any other linked records to show up.
SELECT ->purchased->(?) <-purchased<-person ->purchased->(?) FROM person:tobie;
The ?
operator on its own can thus be used to see all of the relations that a record has.
CREATE person:hermann_hesse, person:abigail, city:calw, book:demian; RELATE person:hermann_hesse->wrote->book:demian SET written_in = d'1919-01-01'; RELATE person:hermann_hesse->born_in->city:calw; RELATE person:abigail->likes->person:hermann_hesse; SELECT -- all tables in which the record is at `in` ->(?).* AS what_hesse_did, -- all tables in which the record is at `out` <-(?).* AS what_others_did_to_hesse FROM person:hermann_hesse;
Output[ { what_hesse_did: [ { id: born_in:k3adylof24a2r5kio8l5, in: person:hermann_hesse, out: city:calw }, { id: wrote:ncbo9w0d8t3xd7lvl4dx, in: person:hermann_hesse, out: book:demian, written_in: d'1919-01-01T00:00:00Z' } ], what_others_did_to_hesse: [ { id: likes:6gubmldm14gzasoyypay, in: person:abigail, out: person:hermann_hesse } ] } ]
Parentheses can be used at each point of a graph query. The example below includes person
records (authors) connected to book
records by the wrote
table. As both the person
and book
tables have fields that can be useful when filtering, they can be isolated with parentheses at this point of the graph query in order to filter using the WHERE
clause.
CREATE person:j_r_r_tolkien SET name = "J.R.R. Tolkien", born = d'1891-01-03'; -- Very approximate date of birth CREATE person:plato SET name = "Plato", born = "-0428-06-01"; CREATE book:fotr SET name = "The Fellowship of the Ring"; CREATE book:republic SET name = "The Republic", original_name = "Πολιτεία"; RELATE person:j_r_r_tolkien->wrote->book:fotr SET written_at = "North Oxford"; RELATE person:plato->wrote->book:republic SET written_at = "Athens"; SELECT name, -- Isolate 'wrote' to use WHERE ->(wrote WHERE written_at = "Athens")->book.* AS books_written_in_athens FROM person; SELECT name, -- Isolate 'book' to use WHERE ->wrote->(book WHERE "Ring" IN name).* AS books_about_rings FROM person;
Output-------- Query -------- [ { books_written_in_athens: [], name: 'J.R.R. Tolkien' }, { books_written_in_athens: [ { id: book:republic, name: 'The Republic', original_name: 'Πολιτεία' } ], name: 'Plato' } ] -------- Query -------- [ { books_about_rings: [ { id: book:fotr, name: 'The Fellowship of the Ring' } ], name: 'J.R.R. Tolkien' }, { books_about_rings: [], name: 'Plato' } ]
As of SurrealDB 2.0, destructuring can also be used to pick and choose which fields to access inside a graph query. The following query will return the same output as above, except that original_name: 'Πολιτεία'
will no longer show up.
SELECT name, ->(wrote WHERE written_at = "Athens")->book.{ name, id } AS books_written_in_athens FROM person;
All of the queries up to now have been clear about what sort of record is found at the in
and out
fields: in
is the record that is doing something, while out
is the record that has something done to it:
person
who writes an article
: the person writes, the article is written.person
who purchases a product
: the person purchases, the product is purchased.However, sometimes a relation is such that it is impossible to determine which record is located at the in
part of a graph table, and which is located at the out
part. This is the case when a relationship is truly bidirectional and equal, such as a friendship, marriage, or sister cities:
CREATE city:calgary, city:daejeon; RELATE city:calgary->sister_of->city:daejeon;
This relation could just as well have been established with the statement RELATE city:daejeon->sister_of->city:calgary
.
In such a case, a query on the relationship makes it appear as if one city has a twin city but the other does not.
SELECT id, ->sister_of->city AS sister_cities FROM city;
Response[ { id: city:calgary, sister_cities: [ city:daejeon ] }, { id: city:daejeon, sister_cities: [] } ]
To solve this, we can use the <->
operator instead of ->
. Using <->
will access both the in
and out
fields, instead of just one.
SELECT id, <->sister_of<->city AS sister_cities FROM city;
This brings up another issue in which a city now appears to be a sister city of itself.
[ { id: city:calgary, sister_cities: [ city:calgary, city:daejeon ] }, { id: city:daejeon, sister_cities: [ city:calgary, city:daejeon ] } ]
Here we can use the array::complement
function to return only items from one array that are not present in another array.
SELECT id, array::complement(<->sister_of<->city, [id]) AS sister_cities FROM city;
Response[ { id: city:calgary, sister_cities: [ city:daejeon ] }, { id: city:daejeon, sister_cities: [ city:calgary ] } ]
Adding a unique key is a good practice for this sort of relation, as it will prevent it from being created twice. This can be done by defining a field as a unique key based on the ordered record IDs involved, followed by a DEFINE INDEX
statement.
DEFINE FIELD key ON TABLE sister_of VALUE <string>array::sort([in, out]); DEFINE INDEX only_one_sister_city ON TABLE sister_of FIELDS key UNIQUE;
With the index in place, a relation set from one record to the other now cannot be created a second time.
RELATE city:calgary->sister_of->city:daejeon; -- OK RELATE city:daejeon->sister_of->city:calgary; -- "Database index `only_one_sister_city` already contains '[city:calgary, city:daejeon]', with record `sister_of:npab0uoxogmrvpwsvfoa`"
in
and out
fields of a relationAs mentioned above, the in
and out
fields of a graph table are mandatory but can be modified to specify their record type or make assertions.
Thus, the following field declarations will work:
DEFINE FIELD in ON TABLE wrote TYPE record<author>; DEFINE FIELD out ON TABLE wrote TYPE record<book>;
But any attempt to outright redefine the in
or out
fields as a different type will be ignored.
DEFINE FIELD in ON TABLE wrote TYPE string; DEFINE FIELD out ON TABLE wrote TYPE int;
An example of an assertion on one of the fields of a record table for a library which is not yet ready to handle non-English books:
DEFINE FIELD out ON TABLE wrote TYPE record<book> ASSERT $value.language = "English"; CREATE book:demian SET title = "Demian. Die Geschichte von Emil Sinclairs Jugend", language = "German"; CREATE author:hesse SET name = "Hermann Hesse"; RELATE author:hesse->wrote->book:demian;
Output"Found book:demian for field `out`, with record `wrote:le3ntnyv2lb943km9gxk`, but field must conform to: $value.language = 'English'"
Using an alias is a common practice in both regular and relation queries in SurrealDB to make output more readable and collapse nested structures. You can create an alias using the AS
clause.
CREATE cat:one, cat:two, cat:three; RELATE cat:one->friends_with->cat:two; RELATE cat:two->friends_with->cat:three; SELECT ->friends_with->cat->friends_with->cat FROM cat:one; -- create an alias for the result using the `AS` clause. SELECT ->friends_with->cat->friends_with->cat AS friends_of_friends FROM cat:one;
// Output without alias { "->friends_with": { "->cat": { "->friends_with": { "->cat": [ cat:three ] } } } } // Output with alias { friends_of_friends: [ cat:three ] }
However, an alias might not be preferred in a case where you have multiple graph queries that resolve to the fields of a large nested structure. Take the following data for example:
CREATE country:usa SET name = "USA"; CREATE state:pennsylvania SET population = 12970000; CREATE state:michigan SET population = 10030000; CREATE city:philadelphia, city:pittsburgh, city:detroit, city:grand_rapids; RELATE country:usa->contains->[state:pennsylvania, state:michigan]; RELATE state:pennsylvania->contains->[city:philadelphia, city:pittsburgh]; RELATE state:michigan->contains->[city:detroit, city:grand_rapids];
A query on the states and cities of these records using aliases would return the data in a structure remade to fit the aliases declared in the query.
SELECT name, ->contains->state AS states, ->contains->state->contains->city AS cities FROM country:usa;
Output[ { cities: [ city:philadelphia, city:pittsburgh, city:grand_rapids, city:detroit ], name: 'USA', states: [ state:pennsylvania, state:michigan ] } ]
However, opting to not use an alias will return the original graph structure which makes the levels of depth of the query clearer. In addition, the population
field is clearly the population for the states.
SELECT id, ->contains->state.id, ->contains->state.population, ->contains->state->contains->city.id FROM country:usa;
If using SurrealDB versions 2.0 and above, destructuring syntax can be used to reduce some typing. Here is the same query as the last using destructuring syntax instead of one line for each field.
SELECT id, -- access id and population on a single line ->contains->state.{id, population}, ->contains->state->contains->city.id FROM country:usa;
Output[ { "->contains": { "->state": { "->contains": { "->city": { id: [ city:philadelphia, city:pittsburgh, city:grand_rapids, city:detroit ] } }, id: [ state:pennsylvania, state:michigan ], population: [ 12970000, 10030000 ] } }, id: country:usa } ]
As the query that uses aliases does not maintain the original graph structure, adding population
would require clever renaming such as ->contains->state.population AS state_populations
to make it clear that the numbers represent state and not city populations.
Being able to set fields on graph tables opens up a large variety of custom query methods, one of which is explored here.
Imagine a database that holds detailed information on the relations between NPCs in a game that are made to be as realistic as possible. Two of the characters have a rocky past but finally end up married. During this period, we might have tracked their relationship by adding and removing graph edges between the two of them as they move from a stage of being friends, to dating, to hating each other, to finally ending up married.
CREATE person:one, person:two; -- These three relations would end up deleted RELATE person:one->friends_with->person:two; RELATE person:one->dating->person:two; RELATE person:one->hates->person:two; -- Finally this would be the graph edge connecting the two RELATE person:one->married->person:two;
This works well to track the current state of the relationship, but creating a more general table such as knows
along with a number of fields can be a better method to track the changing relationship over time. The following shows the relationship between the two person
records, along with a third record called person:three
who went to the same school and once dated person:one
.
CREATE person:one, person:two, person:three; RELATE person:one->knows->person:two SET has_been_friends = true, has_dated = true, has_hated = true, married_to = true; RELATE person:one->knows->person:three SET same_high_school = true, has_dated = true;
With these fields in place, it is possible to use a WHERE
clause to do refined searches on relationships of a certain type.
SELECT ->knows->person AS knows, ->knows[WHERE has_dated]->person AS has_dated, ->knows[WHERE same_high_school AND has_dated]->person AS dated_and_same_school FROM person:one;
Response[ { dated_and_same_school: [ person:three ], has_dated: [ person:two, person:three ], knows: [ person:two, person:three ] } ]
Because the WHERE
clause simply checks for truthiness (whether a value is present and not empty), these fields do not necessarily need to be booleans and can even be complex objects.
RELATE person:one->knows->person:two SET same_high_school = false, has_been_friends = true, has_dated = { from: d'2020-12-25', to: d'2023-12-25' }, has_hated = { from: d'2023-12-25', to: d'2024-03-01' }, married_to = { since: d'2024-03-01' }; RELATE person:one->knows->person:three SET same_high_school = true, has_dated = { from: d'2019-09-10', to: d'2020-12-31' };
With these objects, a jealous person:two
could do a check on person:one
to see how many relationships with has_dated
have an end time that overlaps with the has_dated
period of person:one
and person:two
.
SELECT id, ->knows[WHERE same_high_school AND has_dated.to > d'2020-12-25']->person FROM person:one;
Available since: v2.1.0
Graph edges can also be queried recursively. For a full explanation of this syntax, see the page on recursive paths.
Take the following example which creates five cities, each of which is connected to the next by some type of road of random length.
CREATE |city:1..5| SET name = <string>id.id() + 'ville'; FOR $pair IN (<array>(1..=5)).windows(2) { LET $city1 = type::thing("city", $pair[0]); LET $city2 = type::thing("city", $pair[1]); RELATE $city1->to->$city2 SET type = rand::enum(["train", "road", "bike path"]), distance = <int>(rand::float() * 100).ceil() };
While it is possible to manually move three levels down this road network, it involves a good deal of manual typing.
SELECT ->to->city->to->city->to->city AS fourth_city FROM city:1;
Response[ { fourth_city: [ city:4 ] } ]
This can be replaced by a @
to refer to the current record, followed by .{3}
to represent three levels down the to
graph edge. A level between 1 and 256 can be specified here.
SELECT @.{3}->to->city AS fourth_city FROM city:1;
A traditional query to show the final road info from city:1
to the city three stops away would look like this.
SELECT ->to->city->to->city->to.* AS third_journey FROM city:1;
Response[ { fourth_city: [ [ { distance: 80, id: to:sw2pery99jomfhibzfrh, in: city:3, out: city:4, type: 'train' } ] ] } ]
To use the same query recursively, wrap the part that must be repeated (->to->city
) inside parentheses. This will ensure that the .{2}
part of the query only repeats ->to->city
twice, and not the final ->to.*
portion.
SELECT @.{2}(->to->city)->to.* AS third_journey FROM city:1;
A range can be added inside the {}
braces. The following query that uses a range of 1 to 20 will follow the ->to->city
path up to 20 times, but will stop at the 5th and final depth because the next level returns an empty array.
city:1.{1..20}->to->city;
Response[ city:5 ]
Ranges can be followed with the destructuring operator to collect fields on each depth, returning them in a single response. The following query goes five depths down the to
graph table, returning each city and road along the way.
SELECT @.{1..5}.{ id, next_roads: ->to.*, next_cities: ->to->city } FROM city;
Response[ { id: city:1, next_cities: [ city:2 ], next_roads: [ { distance: 33, id: to:bl6i9djau0pg24pqrwd9, in: city:1, out: city:2, type: 'road' } ] }, { id: city:2, next_cities: [ city:3 ], next_roads: [ { distance: 45, id: to:ybugfnlzv6kcrkaj49ig, in: city:2, out: city:3, type: 'road' } ] }, { id: city:3, next_cities: [ city:4 ], next_roads: [ { distance: 80, id: to:sw2pery99jomfhibzfrh, in: city:3, out: city:4, type: 'train' } ] }, { id: city:4, next_cities: [ city:5 ], next_roads: [ { distance: 29, id: to:42hlspf4z5lpqceyv68p, in: city:4, out: city:5, type: 'train' } ] }, { id: city:5, next_cities: [], next_roads: [] } ]
As noted above, a TIMEOUT
can be set for queries that may be computationally expensive. This is particularly useful when experimenting with recursive queries, which, if care is not taken, can run all the way to the maximum possible depth of 256.
Take the following example with two person
records that like each other. Following the likes
edge will run until the query recurses 256 times and gives up.
CREATE person:one, person:two; RELATE person:one->likes->person:two; RELATE person:two->likes->person:one; -- Open-ended range person:one.{..}->likes->person;
Response'Exceeded the idiom recursion limit of 256.'
Take the following example in which three person
records of created, each of which likes the other two person
records. A query on the ->likes->person
path shows that the number of records doubles each time.
CREATE |person:1..3|; FOR $person IN (SELECT * FROM person) { LET $others = (SELECT * FROM person WHERE id != $person.id); FOR $other IN $others { RELATE $person->likes->$other; } }; RETURN [ person:1.{2}->likes->person, person:1.{3}->likes->person, person:1.{4}->likes->person ];
Response[ [ person:1, person:2, person:1, person:3 ], [ person:3, person:2, person:1, person:3, person:3, person:2, person:1, person:2 ], [ person:1, person:2, person:1, person:3, person:3, person:2, person:1, person:2, person:1, person:2, person:1, person:3, person:3, person:2, person:1, person:3 ] ]
Since an open-ended range can be specified in a recursive query, this would result in a full 256 attempts to recurse, multiplying the number of results by two each time for a total of 115792089237316195423570985008687907853269984665640564039457584007913129639936 records by the end.
When experimenting with recursive queries, especially open-ended ranges, it is thus recommended to use a timeout.
SELECT @.{..}.{ id, likes: ->likes->person.@ } FROM person TIMEOUT 1s;