Keep Calm and Shard On
SuperMassive is an open-source, high-performance distributed key-value database, licensed under BSD-3-Clause and designed to scale without compromise. It combines lightning-fast in-memory operations with robust reliability through intelligent sharding and parallel processing across any number of nodes. The system features automatic self-healing, recovery, and synchronization.
SuperMassive's elegant timestamp-based consistency model enables effortless horizontal scaling, delivering exceptional throughput with zero-downtime resilience: simply add nodes to expand your capacity.
SuperMassive is designed for mission-critical applications that demand speed, reliability, and scalability from their key-value database.
Features
Horizontal Scaling
Add nodes on demand to increase capacity. The cluster refreshes its configuration and connects to newly added nodes automatically.
Smart Sharding
Intelligent data distribution using a sequence-based round-robin approach ensures balanced write operations across primary nodes.
High Availability
Automatic failover and self-healing capabilities ensure continuous operation even when nodes fail.
Data Consistency
Timestamp-based version control with automatic conflict resolution maintains data consistency across the cluster (see the sketch at the end of this section).
Parallel Operations
Read operations run in parallel across primaries and, if needed, their replicas. Write operations are performed in sequence.
Multiplatform
SuperMassive is available for Linux, macOS, and Windows. It can be compiled from source or downloaded as pre-built binaries.
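To make the consistency model concrete, here is a minimal, hypothetical sketch of timestamp-based last-write-wins conflict resolution. The types and names are illustrative, not SuperMassive's internals.

```go
// Hypothetical sketch of timestamp-based last-write-wins conflict
// resolution, illustrating the Data Consistency feature above.
package main

import (
	"fmt"
	"time"
)

type versioned struct {
	value string
	ts    time.Time // version timestamp attached to each write
}

// resolve keeps whichever copy carries the newer timestamp.
func resolve(a, b versioned) versioned {
	if b.ts.After(a.ts) {
		return b
	}
	return a
}

func main() {
	older := versioned{`{"name": "John"}`, time.Now().Add(-time.Second)}
	newer := versioned{`{"name": "John Doe"}`, time.Now()}
	fmt.Println(resolve(older, newer).value) // {"name": "John Doe"}
}
```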
Quick Start Guide
To get SuperMassive for your system, download the pre-built binaries from the GitHub releases page: https://github.com/supermassivedb/supermassive/releases
1. Basic Setup
When starting a cluster, node, or replica instance with no existing .cluster, .node, or .nodereplica configuration file, a new one with defaults is created on startup.
A shared key is always required for each instance type.
A cluster requires a username and password to start. Clients then use these credentials to access the cluster.
# Start a cluster node
./supermassive --instance-type=cluster --username admin --password secure123 --shared-key cluster_key
# In another terminal, start a node
./supermassive --instance-type=node --shared-key cluster_key
# Start a replica; if running locally, start it from a different directory
./supermassive --instance-type=nodereplica --shared-key cluster_key
2. Basic Operations
# Connect and authenticate
(echo -n "AUTH " && echo -n $"admin\\0secure123" | base64 && cat) | nc -C localhost 4000
OK authenticated
# Write data
PUT user:1001 '{"name": "John Doe", "email": "john@example.com"}'
OK key-value written
# Read data
GET user:1001
user:1001 {"name": "John Doe", "email": "john@example.com"}
# Delete data
DEL user:1001
OK key-value deleted
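The same session can be driven programmatically. Below is a minimal, unofficial Go sketch assuming the plain TCP text protocol shown above, with CRLF line endings (as nc -C uses) and credentials encoded as base64("username\0password").

```go
// Minimal, unofficial sketch of a SuperMassive client session over TCP.
package main

import (
	"bufio"
	"encoding/base64"
	"fmt"
	"net"
)

func main() {
	conn, err := net.Dial("tcp", "localhost:4000")
	if err != nil {
		panic(err)
	}
	defer conn.Close()
	r := bufio.NewReader(conn)

	// AUTH expects base64(username\0password), per the nc example above.
	creds := base64.StdEncoding.EncodeToString([]byte("admin\x00secure123"))
	fmt.Fprintf(conn, "AUTH %s\r\n", creds)
	line, _ := r.ReadString('\n')
	fmt.Print(line) // OK authenticated

	fmt.Fprintf(conn, "PUT user:1001 %s\r\n", `{"name": "John Doe", "email": "john@example.com"}`)
	line, _ = r.ReadString('\n')
	fmt.Print(line) // OK key-value written

	fmt.Fprintf(conn, "GET user:1001\r\n")
	line, _ = r.ReadString('\n')
	fmt.Print(line) // user:1001 {"name": "John Doe", ...}
}
```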
Architecture
SuperMassive follows a distributed architecture with distinct component types.
Component Hierarchy
Component | Description | Role |
---|---|---|
Cluster | Central coordination unit | Manages node distribution, request routing, and health monitoring |
Primary Node | Data storage unit | Handles write operations and maintains primary data copy with replica health monitoring |
Replica Node | Redundancy unit | Maintains synchronized copy of primary node data |
Data Flow
1. Client → Cluster: Write request
2. Cluster → Primary Node: Route based on sequence
3. Primary Node → Journal: Async write
4. Primary Node → Replicas: Relay operation
5. Primary Node → Cluster: Confirmation (if a primary node is unavailable, the cluster writes to the next primary in the sequence)
6. Cluster → Client: Return result
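Steps 2 and 5 can be illustrated with a small sketch. This is a hypothetical model of sequence-based round-robin routing with failover to the next primary, not SuperMassive's actual internals.

```go
// Hypothetical sketch of sequence-based round-robin write routing with
// failover to the next healthy primary (data-flow steps 2 and 5).
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

type primary struct {
	addr    string
	healthy bool
}

var sequence atomic.Uint64 // the cluster's current_sequence counter

// nextPrimary advances the sequence and returns the next healthy primary,
// skipping any that are unavailable.
func nextPrimary(primaries []*primary) (*primary, error) {
	for range primaries {
		seq := sequence.Add(1)
		if p := primaries[seq%uint64(len(primaries))]; p.healthy {
			return p, nil
		}
	}
	return nil, errors.New("no healthy primary available")
}

func main() {
	primaries := []*primary{
		{addr: "localhost:4001", healthy: true},
		{addr: "localhost:4003", healthy: false}, // will be skipped
	}
	for i := 0; i < 3; i++ {
		p, err := nextPrimary(primaries)
		if err != nil {
			panic(err)
		}
		fmt.Println("write routed to", p.addr)
	}
}
```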
Authentication
SuperMassive implements a multi-layer authentication system to secure both client-cluster and inter-node communications.
Authentication Methods
- Client Authentication: Clients authenticate using base64-encoded credentials in the format username\0password, a form of basic authentication.
- Inter-node Authentication: Nodes authenticate with one another using the shared key supplied when each instance starts.
- TLS Support: Optional TLS encryption for all communications (see the sketch below).
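If TLS is enabled, a client completes a standard TLS handshake before authenticating. A minimal Go sketch, assuming a CA certificate corresponding to the server's cert-file (the ca.pem path is hypothetical):

```go
// Hypothetical sketch of a TLS client connection to the cluster.
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"os"
)

func main() {
	// Load the CA certificate that signed the cluster's cert (path hypothetical).
	caPEM, err := os.ReadFile("ca.pem")
	if err != nil {
		panic(err)
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		panic("invalid CA certificate")
	}

	conn, err := tls.Dial("tcp", "localhost:4000", &tls.Config{RootCAs: pool})
	if err != nil {
		panic(err)
	}
	defer conn.Close()
	fmt.Println("TLS established; continue with AUTH as in the Quick Start")
}
```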
Configuration
Cluster Config Example
One primary with one replica for that primary.
health-check-interval: 2
server-config:
  address: localhost:4000
  use-tls: false
  cert-file: /
  key-file: /
  read-timeout: 10
  buffer-size: 1024
node-configs:
  - node:
      server-address: localhost:4001
      use-tls: false
      ca-cert-file: ""
      connect-timeout: 5
      write-timeout: 5
      read-timeout: 5
      max-retries: 3
      retry-wait-time: 1
      buffer-size: 1024
    replicas:
      - server-address: localhost:4002
        use-tls: false
        ca-cert-file: ""
        connect-timeout: 5
        write-timeout: 5
        read-timeout: 5
        max-retries: 3
        retry-wait-time: 1
        buffer-size: 1024
Primary Config Example
Continuing the example above: the primary's config with one read replica.
health-check-interval: 2
max-memory-threshold: 75
server-config:
  address: localhost:4001
  use-tls: false
  cert-file: /
  key-file: /
  read-timeout: 10
  buffer-size: 1024
read-replicas:
  - server-address: localhost:4002
    use-tls: false
    ca-cert-file: /
    connect-timeout: 5
    write-timeout: 5
    read-timeout: 5
    max-retries: 3
    retry-wait-time: 1
    buffer-size: 1024
Replica Config Example
Replica configurations are simple: just server-related settings.
server-config:
  address: localhost:4002
  use-tls: false
  cert-file: /
  key-file: /
  read-timeout: 10
  buffer-size: 1024
max-memory-threshold: 75
Command Reference
SuperMassive supports a simple set of commands for data manipulation and cluster management.
Data Operations
Command | Description |
---|---|
PUT key value | Store a key-value pair |
GET key | Retrieve a value by key |
DEL key | Delete a key-value pair |
INCR key [value] | Increment a numeric value by the given amount |
DECR key [value] | Decrement a numeric value by the given amount |
REGX pattern [offset] [limit] | Search keys using regular expression with optional pagination |
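As a quick illustration of the numeric commands, the unofficial sketch below seeds a counter and applies INCR and DECR over an authenticated connection; the exact reply strings are not documented here, so it prints the raw responses.

```go
// Hedged sketch: issuing INCR/DECR from Go over an authenticated connection.
package main

import (
	"bufio"
	"encoding/base64"
	"fmt"
	"net"
)

func main() {
	conn, err := net.Dial("tcp", "localhost:4000")
	if err != nil {
		panic(err)
	}
	defer conn.Close()
	r := bufio.NewReader(conn)

	creds := base64.StdEncoding.EncodeToString([]byte("admin\x00secure123"))
	for _, cmd := range []string{
		"AUTH " + creds,
		"PUT page:views 10", // seed a numeric value
		"INCR page:views 5", // increment by an explicit amount
		"DECR page:views",   // decrement, relying on the default amount
	} {
		fmt.Fprintf(conn, "%s\r\n", cmd)
		line, _ := r.ReadString('\n')
		fmt.Print(line)
	}
}
```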
Cluster & Node Operations
Command | Description |
---|---|
RCNF | Refresh configuration files across the cluster and all nodes |
PING | Liveness check; the instance answers with a pong |
Advanced Pattern Matching
# Match user keys with ID between 1000-1999
REGX ^user:1[0-9]{3}$
# Find all session keys from today
REGX session:2025-02-23.*
# Get first 10 logs from February
REGX ^log:2025-02.* 0 10
# Find all temporary keys
REGX ^temp:.*:([0-9]+)$
# Example results
OK user:1001 {"data": 123}
user:1002 {"data": 1234}
user:1003 {"data": 12345}
user:1004 {"data": 123456}
..
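Assuming SuperMassive's REGX patterns follow standard regular-expression syntax, you can sanity-check a pattern locally before running it against the cluster, for example with Go's RE2-based regexp package (verify against the server for exact semantics):

```go
// Sanity-check a REGX pattern locally with Go's regexp package.
package main

import (
	"fmt"
	"regexp"
)

func main() {
	re := regexp.MustCompile(`^user:1[0-9]{3}$`)
	for _, key := range []string{"user:1001", "user:2001", "user:10010"} {
		fmt.Println(key, re.MatchString(key)) // true, false, false
	}
}
```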
Monitoring & Statistics
SuperMassive provides detailed statistics and monitoring capabilities through the STAT command.
# Get full cluster stats
STAT
CLUSTER localhost:4000
current_sequence 0
client_connection_count 1
PRIMARY localhost:4001
DISK
sync_enabled true
sync_interval 128ms
avg_page_size 1024.00
file_mode -rwxrwxr-x
is_closed false
last_page 99
storage_efficiency 0.9846
file_name .journal
page_size 1024
total_pages 100
total_header_size 1600
total_data_size 102400
page_utilization 1.0000
header_overhead_ratio 0.0154
file_size 104000
modified_time 2025-02-23T04:39:31-05:00
MEMORY
load_factor 0.3906
grow_threshold 0.7500
max_probe_length 2
empty_buckets 156
utilization 0.3906
needs_grow false
needs_shrink false
size 256
used 100
shrink_threshold 0.2500
avg_probe_length 0.2600
empty_bucket_ratio 0.6094
Cluster Statistics
Statistic | Description | Example Value |
---|---|---|
current_sequence | Sequence counter identifying the next primary node to receive a write | 32 |
client_connection_count | Number of clients connected to the cluster | 2 |
Memory Statistics
Statistic | Description | Example Value |
---|---|---|
Basic Metrics | ||
size | Total number of buckets in the hash table | 32 |
used | Number of occupied buckets | 20 |
load_factor | Ratio of used buckets to total size (used/size) | 0.6250 |
Threshold Controls | ||
grow_threshold | Load factor threshold that triggers table growth | 0.7500 |
shrink_threshold | Load factor threshold that triggers table shrinking | 0.2500 |
Performance Metrics | ||
avg_probe_length | Average number of steps to find an item | 1.2500 |
max_probe_length | Maximum number of steps needed to find an item | 3 |
empty_buckets | Number of unused bucket slots | 12 |
empty_bucket_ratio | Proportion of empty buckets to total size | 0.3750 |
utilization | Same as load_factor, efficiency measure | 0.6250 |
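The basic metrics are related by simple arithmetic. This worked sketch reproduces the sample values above; the threshold comparisons are an assumption about how needs_grow and needs_shrink are derived.

```go
// Worked example relating the memory metrics above (size 32, used 20).
package main

import "fmt"

func main() {
	size, used := 32.0, 20.0
	loadFactor := used / size         // 0.6250, also reported as utilization
	emptyBuckets := size - used       // 12
	emptyRatio := emptyBuckets / size // 0.3750
	needsGrow := loadFactor > 0.75    // assumed comparison with grow_threshold
	needsShrink := loadFactor < 0.25  // assumed comparison with shrink_threshold
	fmt.Println(loadFactor, emptyBuckets, emptyRatio, needsGrow, needsShrink)
}
```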
Disk Statistics
Statistic | Description | Example Value |
---|---|---|
File Information | ||
file_name | Name of the pager file | data.db |
file_size | Total size of the file in bytes | 1048576 |
file_mode | File permissions | -rw-r--r-- |
modified_time | Last modification timestamp | 2025-02-23T10:30:00Z |
Configuration | ||
page_size | Size of each page in bytes | 4096 |
sync_enabled | Whether background syncing is enabled | true |
sync_interval | How often the file is synced to disk | 1s |
is_closed | Whether the pager has been closed | false |
Page Statistics | ||
total_pages | Number of pages in the file | 250 |
last_page | Index of the last page | 249 |
Storage Metrics | ||
total_header_size | Total size used by page headers (16 bytes per page) | 4000 |
total_data_size | Actual data size without headers | 1044576 |
storage_efficiency | Ratio of data size to total size | 0.9962 |
avg_page_size | Average amount of data per page | 4178.30 |
page_utilization | How full pages are on average | 0.9845 |
header_overhead_ratio | Proportion of space used by headers | 0.0038 |
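The storage metrics likewise follow from simple arithmetic over the sample values above (16-byte headers, 250 pages, a 1048576-byte file):

```go
// Worked example reproducing the disk storage metrics in the table above.
package main

import "fmt"

func main() {
	totalPages := 250.0
	headerSize := 16.0 // bytes per page header
	fileSize := 1048576.0

	totalHeaderSize := totalPages * headerSize    // 4000
	totalDataSize := fileSize - totalHeaderSize   // 1044576
	storageEfficiency := totalDataSize / fileSize // 0.9962
	avgPageSize := totalDataSize / totalPages     // 4178.30
	headerOverhead := totalHeaderSize / fileSize  // 0.0038

	fmt.Println(totalHeaderSize, totalDataSize, storageEfficiency, avgPageSize, headerOverhead)
}
```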
Replication & Recovery
SuperMassive uses a journal-based replication system to ensure data consistency and fault tolerance.
Replication Process
- Primary Write: The primary node writes data to its journal and relays the operation to its replicas.
- Replica Sync: Replicas receive the operation and write it to their own journal for recovery.
- Recovery: At startup, all nodes recover from their journal. A replica begins a sync with its primary once the STARTSYNC command is transmitted; this happens during primary node health checks. The synchronization is checkpoint-like in nature: the replica informs the primary of its current journal position (for example, "My journal is at page 44"), and the primary examines its own journal and sends any missing entries. Once this completes, the replica is fully synchronized. A full synchronization from an empty journal takes longer than one where the replica only needs to catch up on a few entries. (A sketch follows the Recovery Process below.)
Recovery Process
1. Replica → Journal: Read journal entries
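A conceptual sketch of the checkpoint-style catch-up described above; the journal layout and names are hypothetical, not SuperMassive internals.

```go
// Conceptual sketch of checkpoint-style replica catch-up after STARTSYNC.
package main

import "fmt"

type journal [][]byte // one entry per page, index = page number

// entriesAfter returns the journal entries the replica is missing, given
// the last page the replica reports having (the "My journal is at page N"
// exchange described above).
func entriesAfter(primary journal, replicaLastPage int) [][]byte {
	next := replicaLastPage + 1
	if next >= len(primary) {
		return nil // replica is already up to date
	}
	return primary[next:]
}

func main() {
	primary := journal{[]byte("e0"), []byte("e1"), []byte("e2"), []byte("e3")}
	missing := entriesAfter(primary, 1)                     // replica has pages 0..1
	fmt.Printf("replica needs %d entries\n", len(missing)) // 2
}
```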
Performance Tuning
Node Memory Configuration
max-memory-threshold: 75 # Percentage of system memory at which the node caps usage