This page is a slimmed down version of its counterpart at the README documentation for the surreal-sync tool.
The JSONL source in surreal-sync allows you to import JSON Lines (JSONL) files into SurrealDB. Each JSONL file becomes a table in SurrealDB, and each line in the file becomes a document in that table.
JSONL source is particularly useful for:
jq
can be used to convert JSON to JSONL)
surreal-sync sync jsonl \ --source-uri /path/to/jsonl/directory \ --to-namespace myns \ --to-database mydb
Before using JSONL source, ensure you have:
To start SurrealDB locally:
surreal start --user root --pass root
Let’s walk through an example using sample Notion-like data with pages, blocks, and databases.
Create a directory with the following JSONL files.
Note: each one of these has an id
field, which is required in SurrealDB. To set another field as the id field, pass in the flag --id-field "field_name"
.
databases.jsonl:
{"id": "db1", "name": "Documentation", "description": "Main documentation database", "created_at": "2023-12-01T10:00:00Z", "properties": {"status": "active", "version": 1.0}} {"id": "db2", "name": "API Docs", "description": "API reference documentation", "created_at": "2023-12-15T10:00:00Z", "properties": {"status": "beta", "version": 0.5}}
pages.jsonl:
{"id": "page1", "title": "Getting Started", "content": "Welcome to our documentation", "parent": {"type": "database_id", "database_id": "db1"}, "created_at": "2024-01-01T10:00:00Z"} {"id": "page2", "title": "Advanced Topics", "content": "Deep dive into advanced features", "parent": {"type": "page_id", "page_id": "page1"}, "created_at": "2024-01-02T10:00:00Z"} {"id": "page3", "title": "API Reference", "content": "Complete API documentation", "parent": {"type": "database_id", "database_id": "db2"}, "created_at": "2024-01-03T10:00:00Z"}
blocks.jsonl:
{"id": "block1", "type": "paragraph", "text": "This is a paragraph block", "parent": {"type": "page_id", "page_id": "page1"}, "order": 1} {"id": "block2", "type": "heading", "text": "Introduction", "level": 1, "parent": {"type": "page_id", "page_id": "page1"}, "order": 2} {"id": "block3", "type": "code", "text": "console.log('Hello World');", "language": "javascript", "parent": {"type": "page_id", "page_id": "page2"}, "order": 1} {"id": "block4", "type": "list", "items": ["Item 1", "Item 2", "Item 3"], "parent": {"type": "block_id", "block_id": "block2"}, "order": 3}
Use the following command to import the data with conversion rules for parent references:
surreal-sync sync jsonl \ --source-uri /workspace/tests/test_data/jsonl \ --to-namespace notion \ --to-database docs \ --surreal-endpoint http://surrealdb:8000 \ --surreal-username root \ --surreal-password root \ --rule 'type="database_id",database_id databases:database_id' \ --rule 'type="page_id",page_id pages:page_id' \ --rule 'type="block_id",block_id blocks:block_id'
NoteReplace
http://surrealdb:8000
withhttp://localhost:8000
if running SurrealDB locally on your machine.
If successful, the command will complete without output. You can verify the import worked by checking the database as shown in the next section.
The --rule
flag defines how to convert JSON objects into SurrealDB record links. The format is:
--rule 'type="TYPE_VALUE",ID_FIELD TARGET_TABLE:ID_FIELD'
For example:
type="page_id",page_id pages:page_id
means:"type": "page_id"
page_id
fieldpages:page1
if the page_id
is page1
.By default, surreal-sync looks for an id
field in each JSON object. You can specify a different field name:
surreal-sync sync jsonl \ --source-uri /path/to/jsonl \ --to-namespace myns \ --to-database mydb \ --id-field "item_id"
Example JSONL with custom ID field:
{"item_id": "prod1", "name": "Widget", "price": 19.99} {"item_id": "prod2", "name": "Gadget", "price": 29.99}
Control how many records are processed at once:
--batch-size 500
Test the import without actually writing data:
--dry-run
You can also use environment variables for configuration:
export SURREAL_ENDPOINT=http://localhost:8000 export SURREAL_USERNAME=root export SURREAL_PASSWORD=root export SOURCE_URI=/path/to/jsonl surreal-sync sync jsonl \ --to-namespace myns \ --to-database mydb
File Naming: Name your JSONL files exactly as you want your SurrealDB tables to be named.
ID Values: Ensure all documents have unique ID values within each file.
Data Types: JSONL source preserves JSON data types:
References: Use conversion rules to maintain relationships between documents across different tables.
Performance: For large datasets, adjust the batch size based on your system’s memory and SurrealDB’s capacity.
Missing ID Field: If you see “Missing ID field” errors, ensure your JSON objects have the ID field (default: “id”) or specify the correct field with --id-field
.
Invalid Rule Format: Conversion rules must follow the exact format. Check for proper quoting and spacing.
File Not Found: Ensure the source URI points to a directory containing .jsonl
files, not individual files.
Connection Issues: Verify SurrealDB is running and accessible at the specified endpoint.