April 25, 2023

Clustered SurrealDB for 1.0.0-beta9

by Hugh Kaznowski, 7 min read

In this post, I will show you how to set up a cluster of SurrealDB nodes backed by a shared, distributed TiKV cluster. This architecture lets you scale reads and writes and keep operating through node failures.

Introduction and architecture overview

Users of SurrealDB can pick which key-value storage engine they want to use. For a single-node deployment, you can use RocksDB or in-memory storage; for distributed storage, you can use TiKV or FoundationDB.

We will deploy a TiKV cluster that includes three TiKV instances (the KV engine) and three PD instances (placement driver, a resource tracking service). In addition to the above configuration, we will deploy three nodes of SurrealDB that will point to their respective KV engines. Typically, you would not want the SurrealDB instances tied to individual TiKV instances, but avoiding that would require a load balancer - something beyond the scope of this article.


Setting up the environment

Because we need access to 6 machines, we will simplify this setup using LXC - a lightweight Linux container system that makes nodes seem like fully-fledged computers.

An important note: LXC does not play nicely with Docker. There are ways around that, but I removed Docker from my host machine for this example. The host is a VM anyway; you wouldn't normally host this environment this way.

Let's start by running and configuring our first LXC container for usage.

hugh@hugh-VirtualBox:~$ lxc launch ubuntu: lxc-node-tikv-1
Creating lxc-node-tikv-1
Starting lxc-node-tikv-1
hugh@hugh-VirtualBox:~$ lxc exec lxc-node-tikv-1 bash
root@lxc-node-tikv-1:~# apt install openssh-server
root@lxc-node-tikv-1:~# curl --proto '=https' --tlsv1.2 -sSf | sh
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 7088k  100 7088k    0     0  2571k      0  0:00:02  0:00:02 --:--:-- 2570k
WARN: adding root certificate via internet:
You can revoke this by remove /root/.tiup/bin/7b8e153f2e2d0928.root.json
Successfully set mirror to
Detected shell:
Shell profile:  /root/.profile
/root/.profile has been modified to add tiup to PATH
open a new terminal or source /root/.profile to use it
Installed path: /root/.tiup/bin/tiup
Have a try:     tiup playground
root@lxc-node-tikv-1:~# sudo useradd -m hugh
root@lxc-node-tikv-1:~# sudo adduser hugh sudo
Adding user `hugh' to group `sudo' ...
Adding user hugh to group sudo
root@lxc-node-tikv-1:~# sudo passwd hugh
New password:
Retype new password:
passwd: password updated successfully
root@lxc-node-tikv-1:~# sudo visudo # Here I replaced the sudo entry line with "%sudo ALL=(ALL) NOPASSWD:ALL", i.e. I added NOPASSWD
root@lxc-node-tikv-1:~# vim /etc/ssh/sshd_config
# Change the following lines
# PasswordAuthentication yes
root@lxc-node-tikv-1:~# source .profile
root@lxc-node-tikv-1:~# tiup cluster
tiup is checking updates for component cluster ...timeout(2s)!
The component `cluster` version  is not installed; downloading from repository.
download 8.44 MiB / 8.44 MiB 100.00% 396.61 MiB/s
Starting component `cluster`: /root/.tiup/components/cluster/v1.11.3/tiup-cluster
Deploy a TiDB cluster for production
root@lxc-node-tikv-1:~# tiup update --self && tiup update cluster
download 6.92 MiB / 6.92 MiB 100.00% 171.31 MiB/s
Updated successfully!
component cluster version v1.11.3 is already installed
Updated successfully!
root@lxc-node-tikv-1:~# tiup cluster template > topology.yaml
tiup is checking updates for component cluster ...
Starting component `cluster`: /root/.tiup/components/cluster/v1.11.3/tiup-cluster template
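
The interactive vim edit of sshd_config in the listing above can also be scripted, which helps if you rebuild the base container often. The following is only a minimal sketch and was not part of my original setup: the function name enable_password_auth is made up for illustration, and it takes the file path as an argument so that you can try it on a copy first.

```shell
# Hypothetical helper: force "PasswordAuthentication yes" in an sshd_config-style file.
# Usage: enable_password_auth /path/to/sshd_config
enable_password_auth() {
  config_file="$1"
  tmp_file="$(mktemp)" || return 1
  if grep -Eq '^[#[:space:]]*PasswordAuthentication[[:space:]]' "$config_file"; then
    # Rewrite every existing (possibly commented-out) directive.
    sed -E 's/^[#[:space:]]*PasswordAuthentication[[:space:]].*$/PasswordAuthentication yes/' \
      "$config_file" > "$tmp_file"
  else
    # No directive present: keep the content and append one.
    cat "$config_file" > "$tmp_file"
    printf 'PasswordAuthentication yes\n' >> "$tmp_file"
  fi
  # Write back through cat so the original file keeps its owner and mode.
  cat "$tmp_file" > "$config_file"
  rm -f "$tmp_file"
}
```

Inside the container you would run it as root against /etc/ssh/sshd_config. sshd has to be restarted (or the container rebooted) before the change takes effect.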

Great! We now have our initial node ready. We need to modify our topology file to reflect the actual topology we will have. Here is what I ended up with after editing the topology.yaml we just exported.

# # Global variables are applied to all deployments and used as the default value of
# # the deployments if a specific deployment value is missing.
global:
  # # The user who runs the tidb cluster.
  user: "hugh"
  # # group is used to specify the group name the user belong to if it's not the same as user.
  # group: "tidb"
  # # SSH port of servers in the managed cluster.
  ssh_port: 22
  # # Storage directory for cluster deployment files, startup scripts, and configuration files.
  deploy_dir: "/tidb-deploy"
  # # TiDB Cluster data storage directory
  data_dir: "/tidb-data"
  arch: "amd64"

# # Monitored variables are applied to all the machines.
monitored:
  # # The communication port for reporting system information of each node in the TiDB cluster.
  node_exporter_port: 9100
  # # Blackbox_exporter communication port, used for TiDB cluster port monitoring.
  blackbox_exporter_port: 9115

# # Server configs are used to specify the configuration of PD Servers.
pd_servers:
  # # The ip address of the PD Server.
  - host: lxc-node-pd-1
  - host: lxc-node-pd-2
  - host: lxc-node-pd-3

# # Server configs are used to specify the configuration of TiKV Servers.
tikv_servers:
  # # The ip address of the TiKV Server.
  - host: lxc-node-tikv-1
  - host: lxc-node-tikv-2
  - host: lxc-node-tikv-3

That is actually my entire topology.yaml file. I removed TiDB and all the monitoring - we aren't using that for this example.
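
If you want to try a different number of nodes, the same file can be generated rather than edited by hand. This is a rough sketch under some assumptions: make_topology is a hypothetical helper name, and it assumes the standard top-level section keys of the TiUP template (global, monitored, pd_servers, tikv_servers). The values are the ones from my file above.

```shell
# Hypothetical helper: print a minimal TiUP topology for the given PD and TiKV hosts.
# Usage: make_topology "pd1 pd2 pd3" "tikv1 tikv2 tikv3"
make_topology() {
  pd_hosts="$1"
  tikv_hosts="$2"
  # Fixed part of the file: global and monitoring settings.
  cat <<'EOF'
global:
  user: "hugh"
  ssh_port: 22
  deploy_dir: "/tidb-deploy"
  data_dir: "/tidb-data"
  arch: "amd64"

monitored:
  node_exporter_port: 9100
  blackbox_exporter_port: 9115

EOF
  # Variable part: one list entry per host.
  echo "pd_servers:"
  for host in $pd_hosts; do
    echo "  - host: $host"
  done
  echo
  echo "tikv_servers:"
  for host in $tikv_hosts; do
    echo "  - host: $host"
  done
}

# Print the topology used in this post.
make_topology \
  "lxc-node-pd-1 lxc-node-pd-2 lxc-node-pd-3" \
  "lxc-node-tikv-1 lxc-node-tikv-2 lxc-node-tikv-3"
```

Redirecting the output to topology.yaml gives you a file you can pass to tiup cluster deploy.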

To simplify the rest of the setup, we will snapshot this container and publish the snapshot as an image. Every instance we launch from that image will already have SSH, the hugh account with a known password, and passwordless sudo for the sudo group. Don't do this in production - this is a highly insecure setup for many reasons.

root@lxc-node-tikv-1:~# shutdown -r 0
hugh@hugh-VirtualBox:~$ lxc snapshot lxc-node-tikv-1 base-installation-tikv
hugh@hugh-VirtualBox:~$ lxc publish lxc-node-tikv-1/base-installation-tikv --alias base-installation-tikv
Instance published with fingerprint: b8841a679a59f98f3c23ba6c8795c84942f19170b4a8c41eb102130467c4cca6
hugh@hugh-VirtualBox:~$ printf "lxc-node-tikv-2\nlxc-node-tikv-3\nlxc-node-pd-1\nlxc-node-pd-2\nlxc-node-pd-3" | xargs -I % lxc launch base-installation-tikv %
Creating lxc-node-tikv-2
Starting lxc-node-tikv-2
Creating lxc-node-tikv-3
Starting lxc-node-tikv-3
Creating lxc-node-pd-1
Starting lxc-node-pd-1
Creating lxc-node-pd-2
Starting lxc-node-pd-2
Creating lxc-node-pd-3
Starting lxc-node-pd-3
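
The launch step above can also be written as a loop that derives the container names from role and index. This is a sketch rather than what I actually ran: node_names, launch_nodes and the DRY_RUN variable are hypothetical names, and by default the function only prints the lxc commands instead of executing them.

```shell
# Print every container name except lxc-node-tikv-1, which already exists.
node_names() {
  for role in tikv pd; do
    for index in 1 2 3; do
      name="lxc-node-${role}-${index}"
      if [ "$name" != "lxc-node-tikv-1" ]; then
        printf '%s\n' "$name"
      fi
    done
  done
}

# Launch one container per name from the published image.
# DRY_RUN defaults to 1, so the commands are only printed unless DRY_RUN=0 is set.
launch_nodes() {
  node_names | while read -r name; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "lxc launch base-installation-tikv $name"
    else
      # Redirect stdin so lxc cannot consume the list of names the loop is reading.
      lxc launch base-installation-tikv "$name" < /dev/null
    fi
  done
}

launch_nodes
```

Running it with DRY_RUN=0 on the host performs the same five launches as the printf and xargs pipeline.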

We can now start our cluster from the first node we configured. TiUP will connect to all the other nodes via SSH with password authentication and install the services that way.

hugh@hugh-VirtualBox:~$ lxc exec lxc-node-tikv-1 bash
root@lxc-node-tikv-1:~# source .profile
root@lxc-node-tikv-1:~# tiup cluster deploy tikv-test v6.6.0 ./topology.yaml --user hugh -p
tiup is checking updates for component cluster ...
Starting component `cluster`: /root/.tiup/components/cluster/v1.11.3/tiup-cluster deploy tikv-test v6.6.0 ./topology.yaml --user hugh -p
Input SSH password:

+ Detect CPU Arch Name
+ Detect CPU Arch Name
  - Detecting node lxc-node-pd-1 Arch info ... Done
  - Detecting node lxc-node-pd-2 Arch info ... Done
  - Detecting node lxc-node-pd-3 Arch info ... Done
  - Detecting node lxc-node-tikv-1 Arch info ... Done
  - Detecting node lxc-node-tikv-2 Arch info ... Done
  - Detecting node lxc-node-tikv-3 Arch info ... Done

+ Detect CPU OS Name
+ Detect CPU OS Name
  - Detecting node lxc-node-pd-1 OS info ... Done
  - Detecting node lxc-node-pd-2 OS info ... Done
  - Detecting node lxc-node-pd-3 OS info ... Done
  - Detecting node lxc-node-tikv-1 OS info ... Done
  - Detecting node lxc-node-tikv-2 OS info ... Done
  - Detecting node lxc-node-tikv-3 OS info ... Done
Please confirm your topology:
Cluster type:    tidb
Cluster name:    tikv-test
Cluster version: v6.6.0
Role  Host             Ports        OS/Arch       Directories
----  ----             -----        -------       -----------
pd    lxc-node-pd-1    2379/2380    linux/x86_64  /tidb-deploy/pd-2379,/tidb-data/pd-2379
pd    lxc-node-pd-2    2379/2380    linux/x86_64  /tidb-deploy/pd-2379,/tidb-data/pd-2379
pd    lxc-node-pd-3    2379/2380    linux/x86_64  /tidb-deploy/pd-2379,/tidb-data/pd-2379
tikv  lxc-node-tikv-1  20160/20180  linux/x86_64  /tidb-deploy/tikv-20160,/tidb-data/tikv-20160
tikv  lxc-node-tikv-2  20160/20180  linux/x86_64  /tidb-deploy/tikv-20160,/tidb-data/tikv-20160
tikv  lxc-node-tikv-3  20160/20180  linux/x86_64  /tidb-deploy/tikv-20160,/tidb-data/tikv-20160
    1. If the topology is not what you expected, check your yaml file.
    2. Please confirm there is no port/directory conflicts in same host.
Do you want to continue? [y/N]: (default=N) y
Cluster `tikv-test` deployed successfully, you can start it with command: `tiup cluster start tikv-test --init`
root@lxc-node-tikv-1:~# tiup cluster start tikv-test --init
tiup is checking updates for component cluster ...
Starting component `cluster`: /root/.tiup/components/cluster/v1.11.3/tiup-cluster start tikv-test --init
Starting cluster tikv-test...
+ [ Serial ] - SSHKeySet: privateKey=/root/.tiup/storage/cluster/clusters/tikv-test/ssh/id_rsa, publicKey=/root/.tiup/storage/cluster/clusters/tikv-test/ssh/
+ [Parallel] - UserSSH: user=hugh, host=lxc-node-tikv-2
+ [Parallel] - UserSSH: user=hugh, host=lxc-node-pd-3
+ [Parallel] - UserSSH: user=hugh, host=lxc-node-pd-2
+ [Parallel] - UserSSH: user=hugh, host=lxc-node-tikv-3
+ [Parallel] - UserSSH: user=hugh, host=lxc-node-tikv-1
+ [Parallel] - UserSSH: user=hugh, host=lxc-node-pd-1
+ [ Serial ] - StartCluster
Started cluster `tikv-test` successfully
The root password of TiDB database has been changed.
The new password is: 'JuEzYp59+8@$20T_3K'.
Copy and record it to somewhere safe, it is only displayed once, and will not be stored.
The generated password can NOT be get and shown again.
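
Before moving on, it can be worth confirming that the cluster really is up. tiup cluster display tikv-test lists the status of every node; as an alternative, the sketch below polls PD over HTTP. It assumes PD's health endpoint at /pd/api/v1/health on the client port 2379 shown in the topology above, and pd_health_url and any_pd_healthy are hypothetical helper names.

```shell
# Build the PD health URL for one host (PD client port 2379, as in the topology above).
pd_health_url() {
  printf 'http://%s:2379/pd/api/v1/health\n' "$1"
}

# Return success as soon as any of the given PD hosts answers its health endpoint.
any_pd_healthy() {
  for host in "$@"; do
    if curl -sf --max-time 5 "$(pd_health_url "$host")" > /dev/null; then
      echo "PD reachable via $host"
      return 0
    fi
  done
  echo "no PD endpoint answered" >&2
  return 1
}
```

From any container in the cluster you could then call any_pd_healthy lxc-node-pd-1 lxc-node-pd-2 lxc-node-pd-3.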

At this point, you should have a running TiKV cluster. All that remains is to put SurrealDB instances on the PD nodes. I will demonstrate this only for a single PD node, as the rest are identical.

root@lxc-node-pd-1:~# curl --proto '=https' --tlsv1.2 -sSf | sh -s -- --nightly

 .d8888b.                                             888 8888888b.  888888b.
d88P  Y88b                                            888 888  'Y88b 888  '88b
Y88b.                                                 888 888    888 888  .88P
 'Y888b.   888  888 888d888 888d888  .d88b.   8888b.  888 888    888 8888888K.
    'Y88b. 888  888 888P'   888P'   d8P  Y8b     '88b 888 888    888 888  'Y88b
      '888 888  888 888     888     88888888 .d888888 888 888    888 888    888
Y88b  d88P Y88b 888 888     888     Y8b.     888  888 888 888  .d88P 888   d88P
 'Y8888P'   'Y88888 888     888      'Y8888  'Y888888 888 8888888P'  8888888P'

Fetching the latest database version...
Fetching the host system architecture...
Installing surreal-nightly for linux-amd64...

SurrealDB successfully installed in:

To ensure that surreal is in your $PATH run:
Or to move the binary to --nightly run:
  sudo mv /root/.surrealdb/surreal --nightly

To see the command-line options run:
  surreal help
To start an in-memory database server run:
  surreal start --log debug --user root --pass root memory
For help with getting started visit:
root@lxc-node-pd-1:~# PATH=/root/.surrealdb:$PATH
root@lxc-node-pd-1:~# surreal sql --ns testns --db testdb -u root -p root --conn tikv://lxc-node-pd-1:2379
testns/testdb> create person:hugh content {name:'test'}
[{ "id": "person:hugh", "name": "test" }]
testns/testdb> select * from person
[{ "id": "person:hugh", "name": "test" }]
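
The session above talks to TiKV directly from the surreal sql shell. To have a long-running SurrealDB server on each PD node instead, something like the following sketch could be used. Treat it as an assumption rather than something I ran here: it presumes that surreal start accepts a tikv:// path in the position where the installer output shows memory, the --bind address and port 8000 are my own choice, and surreal_start_command is a hypothetical helper that only prints the command.

```shell
# Print the command that would start a SurrealDB server on one PD node,
# backed by the PD endpoint on that same node.
surreal_start_command() {
  host="$1"
  printf 'surreal start --log debug --user root --pass root --bind 0.0.0.0:8000 tikv://%s:2379\n' "$host"
}

# One command per PD node; run each printed command on the matching container.
for host in lxc-node-pd-1 lxc-node-pd-2 lxc-node-pd-3; do
  surreal_start_command "$host"
done
```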

In a real scenario, you would run the SurrealDB nodes separately from the PD nodes, and connections would be load balanced across the entire PD node pool.
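
To make the idea of spreading connections over a pool concrete, here is a toy round-robin picker. It is an illustration only and not a substitute for a real load balancer: the endpoint list and its port 8000 are hypothetical, as is the pick_endpoint helper, and there is no health checking or failover.

```shell
# Hypothetical pool of endpoints to spread connections over.
ENDPOINTS="lxc-node-pd-1:8000 lxc-node-pd-2:8000 lxc-node-pd-3:8000"

# Print the endpoint for request number N (0-based), cycling through the pool.
pick_endpoint() {
  request_number="$1"
  # Load the pool into the function's positional parameters.
  set -- $ENDPOINTS
  # Skip as many entries as the request number modulo the pool size.
  skip=$(( request_number % $# ))
  shift "$skip"
  printf '%s\n' "$1"
}
```

Calling pick_endpoint with an increasing counter walks through the three endpoints in order and then wraps around to the first one.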


As you can see, it is possible to set up SurrealDB in a cluster so that writes and reads can scale. The failure of a single node would cause minimal disruption to the rest of the work while keeping your data intact. Backups can be performed against the TiKV cluster to ensure you can recover in the event of serious failures.

Hopefully, you found this guide helpful, and I look forward to hearing what you get up to with it!
