RepoPilot

erikgrinaker/toydb

Distributed SQL database in Rust, written as an educational project

Healthy

Healthy across all four use cases

Weakest axis: Use as dependency — Healthy

Permissive license, no critical CVEs, actively maintained — safe to depend on.

Fork & modify — Healthy

Has a license, tests, and CI — clean foundation to fork and modify.

Learn from — Healthy

Documented and popular — useful reference codebase to read through.

Deploy as-is — Healthy

No critical CVEs, sane security posture — runnable as-is.

  • Last commit 3mo ago
  • 4 active contributors
  • Apache-2.0 licensed
  • CI configured
  • Tests present
  • ⚠ Small team — 4 contributors active in recent commits
  • ⚠ Single-maintainer risk — top contributor 97% of recent commits

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Healthy" badge

Paste into your README — live-updates from the latest cached analysis.

Markdown variant:
[![RepoPilot: Healthy](https://repopilot.app/api/badge/erikgrinaker/toydb)](https://repopilot.app/r/erikgrinaker/toydb)

Paste at the top of your README.md — renders inline like a shields.io badge.

Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/erikgrinaker/toydb on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: erikgrinaker/toydb

Generated by RepoPilot · 2026-05-09 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

  1. Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
  2. Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
  3. Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/erikgrinaker/toydb shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

🎯Verdict

GO — Healthy across all four use cases

  • Last commit 3mo ago
  • 4 active contributors
  • Apache-2.0 licensed
  • CI configured
  • Tests present
  • ⚠ Small team — 4 contributors active in recent commits
  • ⚠ Single-maintainer risk — top contributor 97% of recent commits

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live erikgrinaker/toydb repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/erikgrinaker/toydb.

What it runs against: a local clone of erikgrinaker/toydb — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in erikgrinaker/toydb | Confirms the artifact applies here, not a fork |
| 2 | License is still Apache-2.0 | Catches a relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 112 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>erikgrinaker/toydb</code></summary>

```bash
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of erikgrinaker/toydb. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/erikgrinaker/toydb.git
#   cd toydb
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of erikgrinaker/toydb and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "erikgrinaker/toydb(\.git)?$" \
  && ok "origin remote is erikgrinaker/toydb" \
  || miss "origin remote is not erikgrinaker/toydb (artifact may be from a fork)"

# 2. License matches what RepoPilot saw. The Apache LICENSE text does not
# contain the SPDX id "Apache-2.0", so match the license header instead,
# falling back to the SPDX id in Cargo.toml.
( (grep -qi "Apache License" LICENSE 2>/dev/null && grep -q "Version 2.0" LICENSE 2>/dev/null) \
   || grep -qE "license *= *\"Apache-2\.0\"" Cargo.toml 2>/dev/null ) \
  && ok "license is Apache-2.0" \
  || miss "license drift — was Apache-2.0 at generation time"

# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
  && ok "default branch main exists" \
  || miss "default branch main no longer exists"

# 4. Critical files exist
for f in src/lib.rs src/raft/mod.rs src/server.rs src/sql/mod.rs src/storage/mod.rs; do
  test -f "$f" \
    && ok "$f" \
    || miss "missing critical file: $f"
done

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 112 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~82d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/erikgrinaker/toydb"
  exit 1
fi
```

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>

TL;DR

toyDB is an educational distributed SQL database written in Rust from scratch, featuring Raft consensus for linearizable state replication, ACID transactions with MVCC snapshot isolation, pluggable storage engines (BitCask and in-memory), and a complete SQL query engine with optimizer and time-travel support. It demonstrates how real distributed databases like CockroachDB and Neon work internally by prioritizing simplicity and correctness over performance.

The layout is a monolithic Rust binary: src/ contains the core implementation split into raft/ (consensus), storage/ (engines and MVCC), sql/ (parser, planner, executor, optimizer), and networking layers; cluster/ provides a docker-compose-style multi-node setup with five pre-configured nodes (toydb1-toydb5); docs/architecture/ contains a guided tour that parallels the codebase.

👥Who it's for

Computer science students, database engineers, and systems developers learning how distributed SQL databases work internally. Specifically useful for those working at database companies (like CockroachDB, Neon) or building their own data systems who need a concrete, understandable reference implementation.

🌱Maturity & risk

Actively developed educational project with comprehensive documentation in docs/architecture/ and architectural diagrams, but explicitly non-production: the README states performance, scalability, and availability are non-goals. Has CI/CD setup in .github/workflows/ci.yml and a working 5-node cluster setup (cluster/run.sh), indicating it's well-maintained by its author Erik Grinaker with recent rewrites based on production experience.

Low risk for educational use; the deliberately simplified architecture means few dependencies (only bincode, serde, crossbeam, clap, and rustyline) and no heavy external systems. Single maintainer (Erik Grinaker), but the repository appears actively updated, and as an explicitly educational project, backwards compatibility is not a concern. Not suitable for production use — this is by design.

Active areas of work

File metadata alone doesn't pinpoint the current focus, but the rewrite mentioned in the README was recent and substantial, and the repository structure and CI setup (ci.yml) suggest active maintenance. Check the GitHub commit history and issues for current work.

🚀Get running

Clone the repository and run the local cluster: git clone https://github.com/erikgrinaker/toydb.git && cd toydb && ./cluster/run.sh. This starts five Raft-replicated nodes on ports 9601-9605 (SQL) with Raft on 9701-9705. Then in another terminal: cargo run --release --bin toysql to connect to node 1.

Daily commands: Start cluster with ./cluster/run.sh (spawns five nodes). Connect via cargo run --release --bin toysql (connects to localhost:9601). Individual nodes run via cargo run --release --bin toydb -- -c cluster/toydbN/toydb.yaml where N is 1-5. Configuration in config/toydb.yaml and cluster/toydbN/toydb.yaml.

🗺️Map of the codebase

  • src/lib.rs — Main library entry point exposing all core modules (server, client, raft, sql, storage, encoding)—every contributor must understand the public API surface.
  • src/raft/mod.rs — Core Raft consensus implementation—essential for understanding how distributed state replication works in toyDB.
  • src/server.rs — SQL server and request processing layer—all SQL queries and transactions flow through here; critical for request-response patterns.
  • src/sql/mod.rs — SQL engine facade coordinating parser, planner, optimizer, and execution—the main abstraction for SQL workload handling.
  • src/storage/mod.rs — Storage engine abstraction and MVCC transaction support—fundamental to data persistence and snapshot isolation.
  • src/bin/toydb.rs — Main server binary entry point with clustering, configuration, and lifecycle management.
  • docs/architecture/overview.md — High-level architectural overview explaining the layered design and component relationships.

🛠️How to make changes

Add a new SQL function or aggregate

  1. Define the function variant in the AST enum (src/sql/ast.rs)
  2. Add parsing rules in the parser's expression handler (src/sql/parser.rs)
  3. Implement the function logic in the executor's evaluation method (src/sql/executor.rs)
  4. Add test cases in the SQL module test suite (src/sql/tests.rs)
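
A minimal sketch of steps 1 and 3, using a hypothetical miniature AST rather than toyDB's real src/sql/ast.rs types; it shows only the add-a-variant-then-evaluate shape the recipe describes:

```rust
// Hypothetical miniature AST (not toyDB's real one): step 1 adds a variant,
// step 3 teaches the evaluator about it.
#[derive(Debug)]
enum Expression {
    Constant(i64),
    Add(Box<Expression>, Box<Expression>),
    Abs(Box<Expression>), // the newly added function variant (step 1)
}

// Step 3: implement the function's logic in the evaluator. The exhaustive
// match means the compiler flags every site the new variant must handle.
fn eval(expr: &Expression) -> i64 {
    match expr {
        Expression::Constant(n) => *n,
        Expression::Add(l, r) => eval(l) + eval(r),
        Expression::Abs(inner) => eval(inner).abs(),
    }
}

fn main() {
    // ABS(-2 + -3)
    let expr = Expression::Abs(Box::new(Expression::Add(
        Box::new(Expression::Constant(-2)),
        Box::new(Expression::Constant(-3)),
    )));
    println!("ABS(-2 + -3) = {}", eval(&expr));
}
```

Steps 2 and 4 then hook the new variant into the parser and cover it with tests.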

Implement a new storage engine backend

  1. Create a new module implementing the Engine trait defined in src/storage/engine/mod.rs (src/storage/engine/mod.rs)
  2. Add instantiation logic in the storage module's engine factory (src/storage/mod.rs)
  3. Register the engine in the configuration parser (src/bin/toydb.rs)
  4. Add integration tests comparing against existing engines (src/storage/tests.rs)
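
The trait-plus-backend shape of steps 1 and 2 can be sketched as below. The method set here (set/get/delete) is a simplification for illustration, not toyDB's actual Engine signature, which also covers scans and flushing:

```rust
use std::collections::BTreeMap;

// Simplified stand-in for the Engine trait in src/storage/engine/mod.rs.
trait Engine {
    fn set(&mut self, key: &[u8], value: Vec<u8>);
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn delete(&mut self, key: &[u8]);
}

// A minimal in-memory backend, analogous in spirit to toyDB's memory engine.
// BTreeMap keeps keys ordered, which a real engine needs for range scans.
#[derive(Default)]
struct Memory {
    data: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl Engine for Memory {
    fn set(&mut self, key: &[u8], value: Vec<u8>) {
        self.data.insert(key.to_vec(), value);
    }
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.data.get(key).cloned()
    }
    fn delete(&mut self, key: &[u8]) {
        self.data.remove(key);
    }
}

fn main() {
    let mut engine = Memory::default();
    engine.set(b"a", b"1".to_vec());
    println!("get a = {:?}", engine.get(b"a"));
    engine.delete(b"a");
    println!("get a = {:?}", engine.get(b"a"));
}
```

Because callers only see the trait, steps 3 and 4 reduce to wiring the new type into the factory and running the shared engine test suite against it.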

Add a new Raft message type or consensus optimization

  1. Define the message variant in the Message enum (src/raft/message.rs)
  2. Handle the message type in the Raft node's state machine (src/raft/node.rs)
  3. Add state transitions and log updates in the Raft mod.rs orchestrator (src/raft/mod.rs)
  4. Test the new message with testscript scenarios (src/raft/testscripts/node/)
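
A hypothetical miniature of the dispatch shape in steps 1 and 2 (the types are invented, not toyDB's real Message or Node):

```rust
// Invented types for illustration: a new variant (step 1) plus the node's
// exhaustive dispatch match (step 2), which forces every handler to be updated.
#[derive(Debug, PartialEq)]
enum Message {
    Heartbeat { term: u64 },
    HeartbeatResponse { term: u64 },
}

struct Node {
    term: u64,
}

impl Node {
    // All incoming messages flow through one match; adding a Message variant
    // without handling it here is a compile error.
    fn step(&mut self, msg: Message) -> Option<Message> {
        match msg {
            Message::Heartbeat { term } => {
                if term > self.term {
                    self.term = term; // follower adopts the leader's newer term
                }
                Some(Message::HeartbeatResponse { term: self.term })
            }
            Message::HeartbeatResponse { .. } => None,
        }
    }
}

fn main() {
    let mut node = Node { term: 1 };
    let reply = node.step(Message::Heartbeat { term: 3 });
    println!("term={} reply={:?}", node.term, reply);
}
```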

Add a new SQL optimizer rule or plan transformation

  1. Define the rule as a Plan transformation function (src/sql/optimizer.rs)
  2. Register the rule in the optimizer's optimization pass chain (src/sql/optimizer.rs)
  3. Add cost estimation logic if needed (src/sql/optimizer.rs)
  4. Test with query execution tests (src/sql/tests.rs)
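
As a sketch of what a heuristic rewrite rule looks like in this style, here is constant folding over an invented expression tree; toyDB's actual rules operate on its own Plan and Expression types:

```rust
// Invented expression type: a rule is a bottom-up tree transformation that
// returns an equivalent but cheaper tree.
#[derive(Debug, PartialEq)]
enum Expr {
    Const(i64),
    Column(String),
    Add(Box<Expr>, Box<Expr>),
}

// Constant folding: fold children first, then collapse Const + Const.
fn fold_constants(expr: Expr) -> Expr {
    match expr {
        Expr::Add(l, r) => {
            let (l, r) = (fold_constants(*l), fold_constants(*r));
            match (&l, &r) {
                (Expr::Const(a), Expr::Const(b)) => Expr::Const(a + b),
                _ => Expr::Add(Box::new(l), Box::new(r)),
            }
        }
        other => other,
    }
}

fn main() {
    // id + (1 + 2)  becomes  id + 3
    let expr = Expr::Add(
        Box::new(Expr::Column("id".into())),
        Box::new(Expr::Add(Box::new(Expr::Const(1)), Box::new(Expr::Const(2)))),
    );
    println!("{:?}", fold_constants(expr));
}
```

Registering the rule (step 2) then just appends this function to the optimizer's ordered pass chain.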

🔧Why these technologies

  • Rust — Memory safety without GC; enables low-latency consensus and concurrent data structures; fits the educational goal of clear, correct code
  • Raft consensus — Simpler and more understandable than Paxos; provides linearizable replication required for correctness in distributed state machine
  • MVCC snapshot isolation — Allows readers to use stable snapshots without blocking writers; avoids complexity of 2-phase locking while meeting ACID goals
  • Iterator-based query executor — Clean abstraction for composable operators (filter, join, aggregate); easier to understand than volcano-style pull-based or batched SIMD approaches
  • Pluggable storage engines (BitCask + in-memory) — Demonstrates abstraction boundaries; BitCask is simple log-structured design; in-memory for testing and prototyping
  • Bincode + custom keycode encoding — Bincode is simple binary serialization for Raft logs; keycode provides ordered keys for range scans without adding lexicographic comparison overhead
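
To illustrate the iterator-based executor idea from the list above, here is a toy pipeline using plain Rust iterators. The rows and query are invented; toyDB's operators are structs with an iterator-like interface, but the composition principle is the same:

```rust
// A row is just a vector of column values here (columns: id, value).
type Row = Vec<i64>;

fn main() {
    let table: Vec<Row> = vec![vec![1, 10], vec![2, 20], vec![3, 30]];

    // Hypothetical query: SELECT value * 2 FROM t WHERE value > 10.
    // Each adapter wraps its input and yields rows lazily, so no
    // intermediate result set is materialized.
    let results: Vec<i64> = table
        .into_iter()
        .filter(|row| row[1] > 10) // Filter operator
        .map(|row| row[1] * 2)     // Projection operator
        .collect();                // root pulls rows through the pipeline

    println!("{:?}", results);
}
```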

⚖️Trade-offs already made

  • Single-node Raft state machine replicating entire database state

    • Why: Simplicity: all state goes through one log, eliminating split-brain and coordination complexity
    • Consequence: Doesn't scale to large state; commits are bottlenecked by Raft; suitable for learning, not production multi-terabyte databases
  • MVCC with snapshot-at-commit rather than fine-grained row versioning

    • Why: Drastically reduces bookkeeping and GC pressure; easier to reason about correctness
    • Consequence: Snapshot isolation instead of true serializable isolation; long-running transactions block new snapshots
  • Heuristic optimizer instead of cost-based statistics

    • Why: Avoids complexity of catalog statistics, selectivity estimation, and cardinality analysis
    • Consequence: Optimizer can miss better plans; no adaptive query execution or runtime feedback
  • Single-threaded executor with iterator-based evaluation

    • Why: Eliminates concurrency bugs and cache coherency complexity; straightforward to trace execution
    • Consequence: No parallelism within a query

🪤Traps & gotchas

Configuration files in cluster/toydbN/toydb.yaml specify node-specific ports and data directories; modify these if changing port assignments or cluster size. The MVCC implementation uses version timestamps, so verify any clock-synchronization assumptions across nodes. The BitCask engine stores files under cluster/toydbN/data/; clean these out between fresh cluster starts. Raft requires a quorum of reachable nodes: a network partition does not cause split-brain, but it leaves the minority side unable to commit until connectivity returns.

🏗️Architecture

💡Concepts to learn

  • Raft Consensus — Raft is the core mechanism for replicating state across the cluster in a linearizable way; understanding it is essential to how toyDB achieves distributed correctness
  • MVCC (Multi-Version Concurrency Control) — MVCC enables snapshot isolation and time-travel queries by maintaining multiple versions of data; toyDB's entire transaction model depends on this pattern
  • BitCask — BitCask is one of toyDB's two storage engine options; understanding its append-only log design and key file indexing explains why MVCC layering is needed on top
  • Query Optimizer Heuristics — Heuristic optimization (vs. cost-based) is what makes toyDB's optimizer simple and understandable; it directly rewrites query plans without estimation
  • Iterator-Based Execution — toyDB's executor uses pull-based iterators rather than push-based dataflow; this is simpler to understand than modern vectorized engines
  • Snapshot Isolation — Snapshot isolation is toyDB's transaction isolation level; it's weaker than serializability but avoids phantom read anomalies and is what MVCC naturally provides
  • Time-Travel / Temporal Queries — toyDB supports reading historical versions of data via MVCC's version history; this is a distinctive educational feature showing how versioning enables point-in-time queries
  • cockroachdb/cockroach — Production distributed SQL database that inspired this project; Erik Grinaker's experience there directly influenced toyDB's design
  • neondatabase/neon — Another production database system where the author worked; shares architectural patterns with toyDB for compute-storage separation
  • etcd-io/etcd — Go-based distributed consensus system using Raft; useful reference for comparison of Raft implementations in different languages
  • sled-rs/sled — Rust embedded database with MVCC and log-structured storage similar to toyDB's BitCask approach
  • materialize-inc/materialize — Rust-based streaming database using Raft-like consensus; shows production-grade application of concepts toyDB teaches

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add integration tests for cluster mode using the existing cluster/ setup

The repo has a well-structured cluster/ directory with 5 pre-configured toydb instances (cluster/toydb1 through toydb5) and a cluster/run.sh script, but there are no visible integration tests that exercise the Raft consensus and distributed features. This is a critical gap for an educational distributed database project. New contributors could create goldenscript tests (already a dev dependency) that validate multi-node scenarios like leader election, state replication, and failover.

  • [ ] Create tests/cluster/ directory for cluster-specific test files
  • [ ] Write goldenscript tests that start the cluster via cluster/run.sh and verify Raft consensus behavior
  • [ ] Add test cases for: leader election, log replication across nodes, and state machine consistency after node failures
  • [ ] Document test setup in docs/testing.md referencing the cluster configuration files

Add comprehensive unit tests for the SQL query execution pipeline in src/sql

The file structure shows extensive SQL modules (sql-parser, sql-planner, sql-optimizer, sql-execution documented in docs/architecture/) but the actual src/ directory structure is truncated. A key gap for an educational project is unit tests validating the entire query execution pipeline from parsing → planning → optimization → execution. The project already disables doctests in Cargo.toml, suggesting custom test structure is preferred.

  • [ ] Examine all modules in src/ related to SQL processing (parser, planner, optimizer, executor)
  • [ ] Create tests/sql/ with separate test files for each stage: parser, planner, optimizer, and executor
  • [ ] Use test-case and test_each_file (already dev dependencies) to parameterize SQL test cases
  • [ ] Add golden test cases using goldenscript for complex queries (JOINs, aggregates, subqueries)

Expand docs/architecture/ with specific walkthrough examples for each module

The architecture documentation is well-structured with individual markdown files for major components (raft.md, mvcc.md, sql-parser.md, etc.), but there are no concrete code walkthrough files that trace a real user query from entry point through the system. Adding module-specific examples with code snippets would significantly improve the educational value.

  • [ ] Create docs/architecture/walkthrough/ directory for end-to-end examples
  • [ ] Add docs/architecture/walkthrough/query-execution.md tracing a SELECT query through src/sql modules with actual code references
  • [ ] Add docs/architecture/walkthrough/transaction-flow.md showing MVCC and ACID isolation with transaction examples
  • [ ] Add docs/architecture/walkthrough/replication.md demonstrating how a write command flows through Raft consensus using cluster setup
  • [ ] Link these walkthroughs from docs/architecture/index.md

🌿Good first issues

  • Add missing SQL feature support: Check docs/sql.md for feature list, then add test cases in goldenscript format to cover unimplemented SQL statements (like CREATE INDEX, ALTER TABLE, etc.) in the test suite
  • Expand optimizer rules: src/sql/planner/optimizer.rs likely has TODO comments for missing heuristics; implement additional query rewrite rules (e.g., predicate pushdown for joins) with corresponding goldenscript tests
  • Document a specific architectural component: Pick one subsystem (e.g., how Raft handles election timeouts, or how MVCC version vectors work), write a detailed markdown guide in docs/architecture/ with code examples and diagrams

Top contributors


📝Recent commits

  • 473afbd — rust-toolchain: upgrade to 1.93.1 (erikgrinaker)
  • 91783bb — Cargo.toml: upgrade deps (erikgrinaker)
  • 725f3ac — Cargo.toml: upgrade rand crate and simplify petname use (erikgrinaker)
  • 1c7dcea — docs: fix typo in example (erikgrinaker)
  • 8f48b07 — fix(docs): update outdated Bincode specification link (guuzaa)
  • 8f5dedd — cluster/run.sh: fix ctrl-c handling (erikgrinaker)
  • 16c7406 — storage/bitcask: tweak logging (erikgrinaker)
  • 4d2b63c — Cargo.toml: upgrade rand and petname (erikgrinaker)
  • 8f1fd6d — Cargo.toml: upgrade dependencies (erikgrinaker)
  • 8484c39 — README: reorder paragraphs (erikgrinaker)

🔒Security observations

This is an educational distributed SQL database project with significant security limitations by design. Configuration files lack encryption, data directories need stricter permissions, and the codebase appears to lack comprehensive input validation and authentication mechanisms. (The Rust edition flag, noted below, is a toolchain-compatibility concern rather than a security flaw.) These limitations are consistent with the project's stated educational purpose. Users should not deploy this in production environments handling sensitive data. The project would benefit from prominent security disclaimers and documentation of its limitations.

  • Low · Rust Edition 2024 Requires a Recent Toolchain — Cargo.toml. The Cargo.toml specifies edition = "2024". This edition is valid: it was stabilized in Rust 1.85 (February 2025), and the repo pins a recent compiler via its rust-toolchain file. The practical impact is compatibility, not security: toolchains older than 1.85 will fail to build the project and therefore cannot pick up dependency updates. Fix: use a current Rust toolchain (1.85 or later, e.g. via rustup honoring the repo's rust-toolchain file) rather than downgrading the edition.
  • Medium · Unencrypted Configuration Files — cluster/toydb*/toydb.yaml, config/toydb.yaml. YAML configuration files (toydb.yaml) in the cluster directory are stored in plaintext without encryption. These files may contain sensitive database configuration, connection strings, or other credentials that could be exposed if the repository or deployment directories are compromised. Fix: Store sensitive configuration in environment variables or use encrypted secret management systems. Document which configuration values should be treated as secrets and implement a policy for handling credentials.
  • Medium · Missing Input Validation Framework — src/bin/toysql.rs, src/client.rs, src/encoding/. The codebase includes SQL parsing and command-line interfaces (src/bin/toysql.rs, src/bin/workload.rs) with limited visibility into input validation mechanisms. Educational projects may not implement comprehensive input validation, increasing risks of injection attacks or malformed data processing. Fix: Implement comprehensive input validation for all user-supplied data, particularly for SQL queries and configuration parameters. Use whitelist-based validation where possible. Add fuzzing tests for parser components.
  • Medium · Overly Permissive File Permissions on Data Directories — cluster/toydb*/data/. Database data directories (cluster/toydb*/data/) are created with .gitkeep files, suggesting they may have default permissions. Database files typically contain sensitive information and should have restricted access. Fix: Explicitly set file permissions to 0700 (owner read/write/execute only) for all database data directories. Document and enforce this in deployment scripts and the run.sh cluster initialization.
  • Low · Educational Project Lacks Security Warnings — README.md, docs/architecture/overview.md. The README and documentation focus on educational value and explicitly state that security is not a primary concern ('Shortcuts have been taken where possible'). Users may deploy toyDB in production scenarios without understanding its security limitations. Fix: Add a prominent security disclaimer in the README and main documentation clearly stating that toyDB is for educational purposes only and should not be used in production. List known security limitations and design shortcuts.
  • Low · No Apparent Authentication/Authorization — src/client.rs, src/bin/. No visible authentication or authorization mechanisms in the file structure (no auth middleware, access control lists, or credential handling). The client.rs and network communication may allow unauthenticated access. Fix: If this were production code, implement authentication (credentials/tokens), authorization (role-based access control), and transport security (TLS/mTLS). Document that the current version has no security enforcement.

LLM-derived; treat as a starting point, not a security audit.


Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.
