Keep Calm and Shard On

SuperMassive is an open-source (BSD-3 licensed), high-performance distributed key-value database designed for unlimited scale without compromise. It combines lightning-fast in-memory operations with robust reliability through intelligent sharding and parallel processing across an unlimited number of nodes. The system features automatic self-healing, recovery, and synchronization capabilities.

SuperMassive's elegant timestamp-based consistency model enables effortless horizontal scaling, delivering exceptional throughput while maintaining zero-downtime resilience - simply add nodes to expand your capacity.

SuperMassive is designed for mission-critical applications that demand speed, reliability, and scalability from a key-value database.

Features

Horizontal Scaling

Add nodes on demand to increase capacity. The cluster refreshes its configuration and connects to newly added nodes automatically.

Smart Sharding

Intelligent data distribution using a sequence-based round-robin approach ensures balanced write operations across primary nodes.
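
As a rough illustration of the idea (not SuperMassive's actual routing code; the types and names below are hypothetical), sequence-based round-robin routing can be pictured as a counter that advances on every write and selects the primary at that position:

Go Sketch
// Hypothetical sketch of sequence-based round-robin write routing.
package main

import (
	"fmt"
	"sync/atomic"
)

type cluster struct {
	sequence  uint64   // monotonically increasing write counter
	primaries []string // addresses of primary nodes
}

// nextPrimary advances the sequence and picks the primary at that position,
// so consecutive writes rotate evenly across all primaries.
func (c *cluster) nextPrimary() string {
	seq := atomic.AddUint64(&c.sequence, 1)
	return c.primaries[seq%uint64(len(c.primaries))]
}

func main() {
	c := &cluster{primaries: []string{"localhost:4001", "localhost:4003", "localhost:4005"}}
	for i := 0; i < 6; i++ {
		fmt.Println("write", i, "routed to", c.nextPrimary())
	}
}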

High Availability

Automatic failover and self-healing capabilities ensure continuous operation even when nodes fail.

Data Consistency

Timestamp-based version control with automatic conflict resolution maintains data consistency across the cluster.
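
A minimal sketch of the underlying last-write-wins idea (illustrative only; the types below are hypothetical and not taken from SuperMassive's source): when two versions of the same key conflict, the version with the newer timestamp is kept.

Go Sketch
// Hypothetical sketch of timestamp-based (last-write-wins) conflict resolution.
package main

import (
	"fmt"
	"time"
)

type version struct {
	value     string
	timestamp time.Time
}

// resolve keeps whichever version carries the newer timestamp.
func resolve(a, b version) version {
	if b.timestamp.After(a.timestamp) {
		return b
	}
	return a
}

func main() {
	older := version{value: `{"name": "John"}`, timestamp: time.Now().Add(-time.Second)}
	newer := version{value: `{"name": "John Doe"}`, timestamp: time.Now()}
	fmt.Println("kept:", resolve(older, newer).value)
}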

Parallel Operations

Read operations are performed in parallel across primaries and, when needed, replicas. Write operations are performed in sequence.
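
As a simplified illustration of the parallel read path (assumed behavior, with hypothetical in-memory stores standing in for network calls to nodes):

Go Sketch
// Hypothetical sketch of a parallel read fan-out: query every node
// concurrently and return the first one that reports a hit.
package main

import (
	"fmt"
	"sync"
)

func parallelGet(nodes []map[string]string, key string) (string, bool) {
	var wg sync.WaitGroup
	hits := make(chan string, len(nodes))
	for _, node := range nodes {
		wg.Add(1)
		go func(store map[string]string) {
			defer wg.Done()
			if v, ok := store[key]; ok {
				hits <- v // this node holds the key
			}
		}(node)
	}
	wg.Wait()
	close(hits)
	v, ok := <-hits
	return v, ok
}

func main() {
	// Each map stands in for one primary (or replica) node's key space.
	nodes := []map[string]string{
		{"user:1": `{"name": "Ada"}`},
		{"user:1001": `{"name": "John Doe"}`},
	}
	fmt.Println(parallelGet(nodes, "user:1001"))
}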

Multiplatform

SuperMassive is available for Linux, macOS, and Windows. It can be compiled from source or downloaded as pre-built binaries.

Quick Start Guide

To get SuperMassive for your system, download the pre-built binaries from the releases page on GitHub: https://github.com/supermassivedb/supermassive/releases

1. Basic Setup

When starting a cluster, primary node, or replica that has no existing .cluster, .node, or .nodereplica configuration file, a new one with default settings is created on startup.

A shared key is always required for each instance type.

A cluster requires a username and password to start. These credentials are then used to access the cluster through a client.

Terminal
# Start a cluster node
./supermassive --instance-type=cluster --username admin --password secure123 --shared-key cluster_key

# In another terminal, start a node
./supermassive --instance-type=node --shared-key cluster_key

# Start a replica (run it from a different directory if running everything locally)
./supermassive --instance-type=nodereplica --shared-key cluster_key

2. Basic Operations

Client Connection Example
# Connect and authenticate
(echo -n "AUTH " && echo -n $"admin\0secure123" | base64 && cat) | nc -C localhost 4000
OK authenticated

# Write data
PUT user:1001 '{"name": "John Doe", "email": "john@example.com"}'
OK key-value written

# Read data
GET user:1001
user:1001 {"name": "John Doe", "email": "john@example.com"}

# Delete data
DEL user:1001
OK key-value deleted
Note: The default cluster configuration uses port 4000 for the cluster, 4001 for the first node, and 4002 for the first replica.

Architecture

SuperMassive follows a distributed architecture with distinct component types.

Component Hierarchy

Component      Description                 Role
Cluster        Central coordination unit   Manages node distribution, request routing, and health monitoring
Primary Node   Data storage unit           Handles write operations and maintains the primary data copy, with replica health monitoring
Replica Node   Redundancy unit             Maintains a synchronized copy of primary node data

Data Flow

Write Operation Flow
1. Client → Cluster: Write request
2. Cluster → Primary Node: Route based on sequence
3. Primary Node → Journal: Async write
4. Primary Node → Replicas: Relay operation
5. Primary Node → Cluster: Confirmation (if a primary node is unavailable, the write goes to the next primary node in sequence)
6. Cluster → Returns result to client
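
A rough sketch of the routing and failover behavior in steps 2 and 5 (illustrative only; the types and names below are hypothetical, not SuperMassive's actual code):

Go Sketch
// Hypothetical sketch of write routing with failover: try the primary chosen
// by the sequence, and fall back to the next primary in sequence if it is
// unavailable.
package main

import (
	"errors"
	"fmt"
)

type primary struct {
	addr string
	up   bool
}

// routeWrite walks the primaries starting at the current sequence position
// and returns the first healthy one.
func routeWrite(primaries []primary, sequence int) (string, error) {
	for i := 0; i < len(primaries); i++ {
		p := primaries[(sequence+i)%len(primaries)]
		if p.up {
			return p.addr, nil
		}
	}
	return "", errors.New("no primary available")
}

func main() {
	primaries := []primary{
		{addr: "localhost:4001", up: false}, // chosen by sequence, but down
		{addr: "localhost:4003", up: true},  // failover target
	}
	addr, err := routeWrite(primaries, 0)
	fmt.Println(addr, err)
}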

Authentication

SuperMassive implements a multi-layer authentication system to secure both client-cluster and inter-node communications.

Security Note: Always use secure passwords and protect your shared keys. Consider using environment variables for sensitive credentials.

Authentication Methods

  1. Client Authentication

    Clients must authenticate using base64-encoded credentials in the format username\0password, a form of basic authentication (see the encoding sketch after this list).

  2. Inter-node Authentication

    Nodes authenticate with one another using the shared key specified at startup.

  3. TLS Support

    Optional TLS encryption for all communications.
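
For reference, the base64 payload sent with AUTH (see the quick-start example) can be built programmatically. The sketch below only encodes the username\0password string described above; sending it over the TCP connection is left out.

Go Sketch
// Hypothetical sketch: build the AUTH line for client authentication by
// base64-encoding "username\0password", as described above.
package main

import (
	"encoding/base64"
	"fmt"
)

func authLine(username, password string) string {
	credentials := username + "\x00" + password // username\0password
	return "AUTH " + base64.StdEncoding.EncodeToString([]byte(credentials))
}

func main() {
	// Matches the quick-start credentials; send this line over the TCP
	// connection to the cluster (port 4000 by default) before other commands.
	fmt.Println(authLine("admin", "secure123"))
}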

Configuration

Cluster Config Example

This example describes one primary node with one replica for that primary.

YAML Configuration

health-check-interval: 2
server-config:
    address: localhost:4000
    use-tls: false
    cert-file: /
    key-file: /
    read-timeout: 10
    buffer-size: 1024
node-configs:
    - node:
        server-address: localhost:4001
        use-tls: false
        ca-cert-file: ""
        connect-timeout: 5
        write-timeout: 5
        read-timeout: 5
        max-retries: 3
        retry-wait-time: 1
        buffer-size: 1024
      replicas:
        - server-address: localhost:4002
          use-tls: false
          ca-cert-file: ""
          connect-timeout: 5
          write-timeout: 5
          read-timeout: 5
          max-retries: 3
          retry-wait-time: 1
          buffer-size: 1024

Primary Config Example

Continuing the example above, this is the configuration for the single primary node (which has one replica).

YAML Configuration

health-check-interval: 2
max-memory-threshold: 75
server-config:
    address: localhost:4001
    use-tls: false
    cert-file: /
    key-file: /
    read-timeout: 10
    buffer-size: 1024
read-replicas:
    - server-address: localhost:4002
      use-tls: false
      ca-cert-file: /
      connect-timeout: 5
      write-timeout: 5
      read-timeout: 5
      max-retries: 3
      retry-wait-time: 1
      buffer-size: 1024


Replica Config Example

Replica configurations are simple: they contain only server-related settings.

YAML Configuration

server-config:
    address: localhost:4002
    use-tls: false
    cert-file: /
    key-file: /
    read-timeout: 10
    buffer-size: 1024
max-memory-threshold: 75

Command Reference

SuperMassive supports a simple set of commands for data manipulation and cluster management.

Data Operations

Command                        Description
PUT key value                  Store a key-value pair
GET key                        Retrieve a value by key
DEL key                        Delete a key-value pair
INCR key [value]               Increment a numeric value by the given amount
DECR key [value]               Decrement a numeric value by the given amount
REGX pattern [offset] [limit]  Search keys using a regular expression, with optional pagination

Cluster & Node Operations

Command    Description
RCNF       Refresh configuration files across the cluster and all nodes
PING       Liveness check (the cluster answers a ping with a pong)

Advanced Pattern Matching

REGX Examples
# Match user keys with ID between 1000-1999
REGX ^user:1[0-9]{3}$

# Find all session keys from today
REGX session:2025-02-23.*

# Get first 10 logs from February
REGX ^log:2025-02.* 0 10

# Find all temporary keys
REGX ^temp:.*:([0-9]+)$

# Example results
OK user:1001 {"data": 123}
user:1002 {"data": 1234}
user:1003 {"data": 12345}
user:1004 {"data": 123456}
..

Monitoring & Statistics

SuperMassive provides detailed statistics and monitoring capabilities through the STAT command.

Statistics Example
# Get full cluster stats
STAT
CLUSTER localhost:4000
    current_sequence 0
    client_connection_count 1
PRIMARY localhost:4001
DISK
    sync_enabled true
    sync_interval 128ms
    avg_page_size 1024.00
    file_mode -rwxrwxr-x
    is_closed false
    last_page 99
    storage_efficiency 0.9846
    file_name .journal
    page_size 1024
    total_pages 100
    total_header_size 1600
    total_data_size 102400
    page_utilization 1.0000
    header_overhead_ratio 0.0154
    file_size 104000
    modified_time 2025-02-23T04:39:31-05:00
MEMORY
    load_factor 0.3906
    grow_threshold 0.7500
    max_probe_length 2
    empty_buckets 156
    utilization 0.3906
    needs_grow false
    needs_shrink false
    size 256
    used 100
    shrink_threshold 0.2500
    avg_probe_length 0.2600
    empty_bucket_ratio 0.6094

Cluster Statistics

current_sequence           Sequence counter indicating which primary node receives the next write (example: 32)
client_connection_count    Number of clients connected to the cluster (example: 2)

Memory Statistics

Basic Metrics
    size                  Total number of buckets in the hash table (example: 32)
    used                  Number of occupied buckets (example: 20)
    load_factor           Ratio of used buckets to total size (used/size) (example: 0.6250)

Threshold Controls
    grow_threshold        Load factor threshold that triggers table growth (example: 0.7500)
    shrink_threshold      Load factor threshold that triggers table shrinking (example: 0.2500)

Performance Metrics
    avg_probe_length      Average number of steps to find an item (example: 1.2500)
    max_probe_length      Maximum number of steps needed to find an item (example: 3)
    empty_buckets         Number of unused bucket slots (example: 12)
    empty_bucket_ratio    Proportion of empty buckets to total size (example: 0.3750)
    utilization           Same as load_factor; an efficiency measure (example: 0.6250)

Disk Statistics

File Information
    file_name                Name of the pager file (example: data.db)
    file_size                Total size of the file in bytes (example: 1048576)
    file_mode                File permissions (example: -rw-r--r--)
    modified_time            Last modification timestamp (example: 2025-02-23T10:30:00Z)

Configuration
    page_size                Size of each page in bytes (example: 4096)
    sync_enabled             Whether background syncing is enabled (example: true)
    sync_interval            How often the file is synced to disk (example: 1s)
    is_closed                Whether the pager has been closed (example: false)

Page Statistics
    total_pages              Number of pages in the file (example: 250)
    last_page                Index of the last page (example: 249)

Storage Metrics
    total_header_size        Total size used by page headers, 16 bytes per page (example: 4000)
    total_data_size          Actual data size without headers (example: 1044576)
    storage_efficiency       Ratio of data size to total size (example: 0.9962)
    avg_page_size            Average amount of data per page (example: 4178.30)
    page_utilization         How full pages are on average (example: 0.9845)
    header_overhead_ratio    Proportion of space used by headers (example: 0.0038)

Replication & Recovery

SuperMassive uses a journal-based replication system to ensure data consistency and fault tolerance.

Replication Process

  1. Primary Write

    Primary node writes data to journal and sends operation to replicas.

  2. Replica Sync

    Replicas receive operation and write to their own journal for recovery.

  3. Recovery

    At startup, all nodes recover from the journal. A replica starts a sync with its primary once the STARTSYNC command is transmitted; this happens during primary node health checks. The synchronization process is checkpoint-like: the replica informs the primary of its current journal position (for example, "my journal is at page 44"), and the primary then examines its own journal and sends any missing entries, as sketched below. Once this process completes, the replica is fully synchronized. Full synchronizations from an empty journal take longer than those where the replica only needs to catch up on a few entries.
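
A minimal sketch of that checkpoint exchange (illustrative only; the journal types and names below are hypothetical, not SuperMassive's actual journal API):

Go Sketch
// Hypothetical sketch of the checkpoint-style catch-up described above:
// the replica reports its last journal page, and the primary sends back
// every entry written after that point.
package main

import "fmt"

type journalEntry struct {
	page int
	op   string
}

// entriesAfter returns the journal entries the replica is missing,
// given the last page the replica reports having.
func entriesAfter(journal []journalEntry, replicaLastPage int) []journalEntry {
	var missing []journalEntry
	for _, e := range journal {
		if e.page > replicaLastPage {
			missing = append(missing, e)
		}
	}
	return missing
}

func main() {
	primaryJournal := []journalEntry{
		{page: 43, op: `PUT user:1001 {"name": "John Doe"}`},
		{page: 44, op: "DEL user:1"},
		{page: 45, op: "PUT user:1002 {}"},
	}
	// Replica reports "my journal is at page 44"; it receives page 45 onward.
	fmt.Println(entriesAfter(primaryJournal, 44))
}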

Recovery Process

Recovery Steps
1. Replica → Journal: Read journal entries

Performance Tuning

Node Memory Configuration

Node Configuration
max-memory-threshold: 75  # Percentage of max memory to cap at
Performance Tip: Add more nodes to the cluster to avoid hitting memory caps, allowing for more data storage and faster operations.