Apr 25th, 2023
by Hugh Kaznowski, 7 min read
In this post, I will show you how to set up a distributed SurrealDB cluster backed by a shared, distributed TiKV cluster. This architecture lets you scale reads and writes and keep operating seamlessly when nodes fail.
Users of SurrealDB can pick which key-value storage engine they want to use. For a single-node deployment, you can use RocksDB or in-memory storage; for distributed storage, you can use TiKV or FoundationDB.
We will deploy a TiKV cluster made up of 3 TiKV instances (the KV engine) and 3 PD instances (placement driver, a resource tracking service). On top of that, we will deploy three SurrealDB nodes, each pointing at its own node in the TiKV cluster. Typically you would not want SurrealDB instances tied to individual nodes, but avoiding that requires a load balancer, which is beyond the scope of this article.
Because we need access to 6 machines, we will simplify this setup using LXC - a lightweight Linux container system that makes nodes seem like fully-fledged computers.
An important note: LXC does not play nicely with Docker. There are ways around that, but I removed Docker from my host machine for this example. The host is a VM, and you usually wouldn't host this environment this way anyway.
Let’s start by running and configuring our first LXC container for usage.
hugh@hugh-VirtualBox:~$ lxc launch ubuntu: lxc-node-tikv-1
Creating lxc-node-tikv-1
Starting lxc-node-tikv-1
hugh@hugh-VirtualBox:~$ lxc exec lxc-node-tikv-1 bash
root@lxc-node-tikv-1:~# apt install openssh-server
...
root@lxc-node-tikv-1:~# curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 7088k  100 7088k    0     0  2571k      0  0:00:02  0:00:02 --:--:-- 2570k
WARN: adding root certificate via internet: https://tiup-mirrors.pingcap.com/root.json
You can revoke this by remove /root/.tiup/bin/7b8e153f2e2d0928.root.json
Successfully set mirror to https://tiup-mirrors.pingcap.com
Detected shell:
Shell profile: /root/.profile
/root/.profile has been modified to add tiup to PATH
open a new terminal or source /root/.profile to use it
Installed path: /root/.tiup/bin/tiup
===============================================
Have a try: tiup playground
===============================================
root@lxc-node-tikv-1:~# sudo useradd -m hugh
root@lxc-node-tikv-1:~# sudo adduser hugh sudo
Adding user `hugh' to group `sudo' ...
Adding user hugh to group sudo
Done.
root@lxc-node-tikv-1:~# sudo passwd hugh
New password:
Retype new password:
passwd: password updated successfully
root@lxc-node-tikv-1:~# sudo visudo
# Over here I replaced the sudo entry line with "%sudo ALL=(ALL) NOPASSWD:ALL", so added NOPASSWD
root@lxc-node-tikv-1:~# vim /etc/ssh/sshd_config
# Change the following lines
# PasswordAuthentication yes
root@lxc-node-tikv-1:~# source .profile
root@lxc-node-tikv-1:~# tiup cluster
tiup is checking updates for component cluster ...timeout(2s)!
The component `cluster` version is not installed; downloading from repository.
download https://tiup-mirrors.pingcap.com/cluster-v1.11.3-linux-amd64.tar.gz 8.44 MiB / 8.44 MiB 100.00% 396.61 MiB/s
Starting component `cluster`: /root/.tiup/components/cluster/v1.11.3/tiup-cluster
Deploy a TiDB cluster for production
...
root@lxc-node-tikv-1:~# tiup update --self && tiup update cluster
download https://tiup-mirrors.pingcap.com/tiup-v1.11.3-linux-amd64.tar.gz 6.92 MiB / 6.92 MiB 100.00% 171.31 MiB/s
Updated successfully!
component cluster version v1.11.3 is already installed
Updated successfully!
root@lxc-node-tikv-1:~# tiup cluster template > topology.yaml
tiup is checking updates for component cluster ...
Starting component `cluster`: /root/.tiup/components/cluster/v1.11.3/tiup-cluster template
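The manual sshd_config edit in the session above can also be scripted. The following is a minimal sketch, not part of the original walkthrough: enable_password_auth is a hypothetical helper name, and it assumes that no drop-in file under /etc/ssh/sshd_config.d/ overrides the setting afterwards.

```shell
# Hypothetical helper (not part of any tool): force "PasswordAuthentication yes"
# in the sshd config file given as the first argument.
enable_password_auth() {
  config_file="$1"
  tmp_file="$(mktemp)"
  # Rewrite an existing directive, including a commented-out one such as
  # "#PasswordAuthentication no". Prose comments that merely mention the
  # word are left alone because a whitespace-separated value is required.
  sed -E 's/^[#[:space:]]*PasswordAuthentication[[:space:]].*$/PasswordAuthentication yes/' \
    "$config_file" > "$tmp_file"
  # Append the directive if the file never contained it.
  grep -q '^PasswordAuthentication yes$' "$tmp_file" \
    || echo 'PasswordAuthentication yes' >> "$tmp_file"
  # Write back in place so the original file's owner and mode are kept.
  cat "$tmp_file" > "$config_file"
  rm -f "$tmp_file"
}
```

Run as root inside the container, this would be called as enable_password_auth /etc/ssh/sshd_config. The SSH service still has to be restarted (or the container rebooted) before the change takes effect.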
Great! We now have our initial node ready. Next, we need to modify the topology file to reflect the topology we will actually deploy. Here is my version after editing the topology.yaml we just exported.
# # Global variables are applied to all deployments and used as the default value of
# # the deployments if a specific deployment value is missing.
global:
  # # The user who runs the tidb cluster.
  user: "hugh"
  # # group is used to specify the group name the user belong to if it's not the same as user.
  # group: "tidb"
  # # SSH port of servers in the managed cluster.
  ssh_port: 22
  # # Storage directory for cluster deployment files, startup scripts, and configuration files.
  deploy_dir: "/tidb-deploy"
  # # TiDB Cluster data storage directory
  data_dir: "/tidb-data"
  arch: "amd64"
# # Monitored variables are applied to all the machines.
monitored:
  # # The communication port for reporting system information of each node in the TiDB cluster.
  node_exporter_port: 9100
  # # Blackbox_exporter communication port, used for TiDB cluster port monitoring.
  blackbox_exporter_port: 9115
# # Server configs are used to specify the configuration of PD Servers.
pd_servers:
  # # The ip address of the PD Server.
  - host: lxc-node-pd-1
  - host: lxc-node-pd-2
  - host: lxc-node-pd-3
# # Server configs are used to specify the configuration of TiKV Servers.
tikv_servers:
  # # The ip address of the TiKV Server.
  - host: lxc-node-tikv-1
  - host: lxc-node-tikv-2
  - host: lxc-node-tikv-3
That is actually my entire topology.yaml file. I removed the TiDB and monitoring sections, since we aren't using them for this example.
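If you want different host names or node counts, the same file can be generated instead of edited by hand. This is a rough sketch under my own assumptions: render_topology is a hypothetical helper, it drops the explanatory comments, and it hard-codes the same ports and directories as the file above.

```shell
# Hypothetical helper: print a topology.yaml equivalent to the one above.
# Usage: render_topology USER "PD_HOST ..." "TIKV_HOST ..."
render_topology() {
  topo_user="$1"
  pd_hosts="$2"
  tikv_hosts="$3"
  printf 'global:\n'
  printf '  user: "%s"\n' "$topo_user"
  printf '  ssh_port: 22\n'
  printf '  deploy_dir: "/tidb-deploy"\n'
  printf '  data_dir: "/tidb-data"\n'
  printf '  arch: "amd64"\n'
  printf 'monitored:\n'
  printf '  node_exporter_port: 9100\n'
  printf '  blackbox_exporter_port: 9115\n'
  printf 'pd_servers:\n'
  # Host lists are space-separated, so word splitting is intentional here.
  for host in $pd_hosts; do
    printf '  - host: %s\n' "$host"
  done
  printf 'tikv_servers:\n'
  for host in $tikv_hosts; do
    printf '  - host: %s\n' "$host"
  done
}
```

For the cluster in this post, that would be render_topology hugh "lxc-node-pd-1 lxc-node-pd-2 lxc-node-pd-3" "lxc-node-tikv-1 lxc-node-tikv-2 lxc-node-tikv-3" > topology.yaml.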
To simplify the rest of the setup, we will snapshot this container and publish the snapshot as an image before starting the installation. Instances launched from that image automatically have SSH, the hugh account with a known password, and passwordless sudo. Don't do this in production: it is a highly insecure setup for many reasons.
root@lxc-node-tikv-1:~# shutdown -r 0
hugh@hugh-VirtualBox:~$ lxc snapshot lxc-node-tikv-1 base-installation-tikv
hugh@hugh-VirtualBox:~$ lxc publish lxc-node-tikv-1/base-installation-tikv --alias base-installation-tikv
Instance published with fingerprint: b8841a679a59f98f3c23ba6c8795c84942f19170b4a8c41eb102130467c4cca6
hugh@hugh-VirtualBox:~$ printf "lxc-node-tikv-2\nlxc-node-tikv-3\nlxc-node-pd-1\nlxc-node-pd-2\nlxc-node-pd-3\n" | xargs -I % lxc launch base-installation-tikv %
Creating lxc-node-tikv-2
Starting lxc-node-tikv-2
Creating lxc-node-tikv-3
Starting lxc-node-tikv-3
Creating lxc-node-pd-1
Starting lxc-node-pd-1
Creating lxc-node-pd-2
Starting lxc-node-pd-2
Creating lxc-node-pd-3
Starting lxc-node-pd-3
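The container names follow a simple pattern, so they can be generated rather than typed out. A small sketch, assuming the lxc-node-ROLE-N naming used in this post; node_names is a hypothetical helper, not an LXC command.

```shell
# Hypothetical helper: print container names for one role, one per line.
# Usage: node_names ROLE FIRST LAST
node_names() {
  role="$1"
  index="$2"
  last="$3"
  while [ "$index" -le "$last" ]; do
    printf 'lxc-node-%s-%s\n' "$role" "$index"
    index=$(( index + 1 ))
  done
}
```

One name per line is what xargs -I expects, so the launch step could be written as { node_names tikv 2 3; node_names pd 1 3; } | xargs -I % lxc launch base-installation-tikv %.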
We can now deploy and start our cluster from the first node we configured. TiUP will connect to all the other nodes over SSH using password authentication and install the services that way.
hugh@hugh-VirtualBox:~$ lxc exec lxc-node-tikv-1 bash
root@lxc-node-tikv-1:~# source .profile
root@lxc-node-tikv-1:~# tiup cluster deploy tikv-test v6.6.0 ./topology.yaml --user hugh -p
tiup is checking updates for component cluster ...
Starting component `cluster`: /root/.tiup/components/cluster/v1.11.3/tiup-cluster deploy tikv-test v6.6.0 ./topology.yaml --user hugh -p
Input SSH password:
+ Detect CPU Arch Name
+ Detect CPU Arch Name
  - Detecting node lxc-node-pd-1 Arch info ... Done
  - Detecting node lxc-node-pd-2 Arch info ... Done
  - Detecting node lxc-node-pd-3 Arch info ... Done
  - Detecting node lxc-node-tikv-1 Arch info ... Done
  - Detecting node lxc-node-tikv-2 Arch info ... Done
  - Detecting node lxc-node-tikv-3 Arch info ... Done
+ Detect CPU OS Name
+ Detect CPU OS Name
  - Detecting node lxc-node-pd-1 OS info ... Done
  - Detecting node lxc-node-pd-2 OS info ... Done
  - Detecting node lxc-node-pd-3 OS info ... Done
  - Detecting node lxc-node-tikv-1 OS info ... Done
  - Detecting node lxc-node-tikv-2 OS info ... Done
  - Detecting node lxc-node-tikv-3 OS info ... Done
Please confirm your topology:
Cluster type:    tidb
Cluster name:    tikv-test
Cluster version: v6.6.0
Role  Host             Ports        OS/Arch       Directories
----  ----             -----        -------       -----------
pd    lxc-node-pd-1    2379/2380    linux/x86_64  /tidb-deploy/pd-2379,/tidb-data/pd-2379
pd    lxc-node-pd-2    2379/2380    linux/x86_64  /tidb-deploy/pd-2379,/tidb-data/pd-2379
pd    lxc-node-pd-3    2379/2380    linux/x86_64  /tidb-deploy/pd-2379,/tidb-data/pd-2379
tikv  lxc-node-tikv-1  20160/20180  linux/x86_64  /tidb-deploy/tikv-20160,/tidb-data/tikv-20160
tikv  lxc-node-tikv-2  20160/20180  linux/x86_64  /tidb-deploy/tikv-20160,/tidb-data/tikv-20160
tikv  lxc-node-tikv-3  20160/20180  linux/x86_64  /tidb-deploy/tikv-20160,/tidb-data/tikv-20160
Attention:
    1. If the topology is not what you expected, check your yaml file.
    2. Please confirm there is no port/directory conflicts in same host.
Do you want to continue? [y/N]: (default=N) y
...
Cluster `tikv-test` deployed successfully, you can start it with command: `tiup cluster start tikv-test --init`
root@lxc-node-tikv-1:~# tiup cluster start tikv-test --init
tiup is checking updates for component cluster ...
Starting component `cluster`: /root/.tiup/components/cluster/v1.11.3/tiup-cluster start tikv-test --init
Starting cluster tikv-test...
+ [ Serial ] - SSHKeySet: privateKey=/root/.tiup/storage/cluster/clusters/tikv-test/ssh/id_rsa, publicKey=/root/.tiup/storage/cluster/clusters/tikv-test/ssh/id_rsa.pub
+ [Parallel] - UserSSH: user=hugh, host=lxc-node-tikv-2
+ [Parallel] - UserSSH: user=hugh, host=lxc-node-pd-3
+ [Parallel] - UserSSH: user=hugh, host=lxc-node-pd-2
+ [Parallel] - UserSSH: user=hugh, host=lxc-node-tikv-3
+ [Parallel] - UserSSH: user=hugh, host=lxc-node-tikv-1
+ [Parallel] - UserSSH: user=hugh, host=lxc-node-pd-1
+ [ Serial ] - StartCluster
...
Started cluster `tikv-test` successfully
The root password of TiDB database has been changed.
The new password is: 'JuEzYp59+8@$20T_3K'.
Copy and record it to somewhere safe, it is only displayed once, and will not be stored.
The generated password can NOT be get and shown again.
At this point, you should have a running TiKV cluster. All that remains is to put SurrealDB instances on the PD nodes. I will demonstrate this only for a single PD node, as the rest are identical.
root@lxc-node-pd-1:~# curl --proto '=https' --tlsv1.2 -sSf https://install.surrealdb.com | sh -s -- --nightly
 .d8888b.                                             888 8888888b.  888888b.
d88P  Y88b                                            888 888  'Y88b 888  '88b
Y88b.                                                 888 888    888 888  .88P
 'Y888b.   888  888 888d888 888d888  .d88b.   8888b.  888 888    888 8888888K.
    'Y88b. 888  888 888P'   888P'   d8P  Y8b     '88b 888 888    888 888  'Y88b
      '888 888  888 888     888     88888888 .d888888 888 888    888 888    888
Y88b  d88P Y88b 888 888     888     Y8b.     888  888 888 888  .d88P 888   d88P
 'Y8888P'   'Y88888 888     888      'Y8888  'Y888888 888 8888888P'  8888888P'
Fetching the latest database version...
Fetching the host system architecture...
Installing surreal-nightly for linux-amd64...
SurrealDB successfully installed in:
  /root/.surrealdb/surreal
To ensure that surreal is in your $PATH run:
  PATH=/root/.surrealdb:$PATH
Or to move the binary to --nightly run:
  sudo mv /root/.surrealdb/surreal --nightly
To see the command-line options run:
  surreal help
To start an in-memory database server run:
  surreal start --log debug --user root --pass root memory
For help with getting started visit:
  https://surrealdb.com/docs/
root@lxc-node-pd-1:~# PATH=/root/.surrealdb:$PATH
root@lxc-node-pd-1:~# surreal sql --ns testns --db testdb -u root -p root --conn tikv://lxc-node-pd-1:2379
testns/testdb> create person:hugh content {name:'test'}
[{ "id": "person:hugh", "name": "test" }]
testns/testdb> select * from person
[{ "id": "person:hugh", "name": "test" }]
In a real deployment, you would run the SurrealDB nodes separately from the PD nodes and load balance their connections across the entire PD node pool.
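Short of a real load balancer, a client-side round-robin over the PD endpoints gives the general idea. This is only a sketch: pd_endpoint is a hypothetical helper, it performs no health checks, and port 2379 is the PD client port shown in the deploy output earlier.

```shell
# Hypothetical helper: pick a PD endpoint by request number, round-robin.
# Usage: pd_endpoint REQUEST_NUMBER HOST [HOST ...]
pd_endpoint() {
  request_number="$1"
  shift
  host_count="$#"
  # Nothing to choose from if no hosts were given.
  [ "$host_count" -gt 0 ] || return 1
  # Drop as many hosts as the request number modulo the pool size,
  # which leaves the selected host in $1.
  shift $(( request_number % host_count ))
  printf 'tikv://%s:2379\n' "$1"
}
```

The result can be passed straight to the --conn flag used above, for example surreal sql --ns testns --db testdb -u root -p root --conn "$(pd_endpoint 1 lxc-node-pd-1 lxc-node-pd-2 lxc-node-pd-3)".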
As you can see, it is possible to set up SurrealDB in a cluster so that writes and reads can scale. The failure of a single node causes minimal disruption to the rest of the cluster while your data stays intact. Backups can be taken against the TiKV cluster so that you can recover from more serious failures.
Hopefully, you found this guide helpful, and I look forward to hearing what you get up to with it!