tikv/raft-rs
Raft distributed consensus algorithm implemented in Rust.
Healthy across the board
Permissive license, no critical CVEs, actively maintained — safe to depend on.
Has a license, tests, and CI — clean foundation to fork and modify.
Documented and popular — useful reference codebase to read through.
No critical CVEs, sane security posture — runnable as-is.
- ✓ Last commit 3w ago
- ✓ 39+ active contributors
- ✓ Distributed ownership (top contributor 23% of recent commits)
- ✓ Apache-2.0 licensed
- ✓ CI configured
- ✓ Tests present
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Onboarding doc
Onboarding: tikv/raft-rs
Generated by RepoPilot · 2026-05-09 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
- Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/tikv/raft-rs shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across the board
- Last commit 3w ago
- 39+ active contributors
- Distributed ownership (top contributor 23% of recent commits)
- Apache-2.0 licensed
- CI configured
- Tests present
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live tikv/raft-rs
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/tikv/raft-rs.
What it runs against: a local clone of tikv/raft-rs — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in tikv/raft-rs | Confirms the artifact applies here, not a fork |
| 2 | License is still Apache-2.0 | Catches relicense before you depend on it |
| 3 | Default branch master exists | Catches branch renames |
| 4 | Last commit ≤ 54 days ago | Catches sudden abandonment since generation |
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of tikv/raft-rs. If you don't
# have one yet, run these first:
#
# git clone https://github.com/tikv/raft-rs.git
# cd raft-rs
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }
# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
echo "FAIL: not inside a git repository. cd into your clone of tikv/raft-rs and re-run."
exit 2
fi
# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "tikv/raft-rs(\.git)?\b" \
  && ok "origin remote is tikv/raft-rs" \
  || miss "origin remote is not tikv/raft-rs (artifact may be from a fork)"
# 2. License matches what RepoPilot saw
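# Assumption: the crate declares its SPDX license in Cargo.toml
# (a Rust project has no package.json to inspect).
(grep -qiE "^(Apache-2\.0)" LICENSE 2>/dev/null \
  || grep -qE "^license[[:space:]]*=[[:space:]]*\"Apache-2\.0\"" Cargo.toml 2>/dev/null) \
  && ok "license is Apache-2.0" \
  || miss "license drift — was Apache-2.0 at generation time"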
# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \
  && ok "default branch master exists" \
  || miss "default branch master no longer exists"
# 4. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 54 ]; then
ok "last commit was $days_since_last days ago (artifact saw ~24d)"
else
miss "last commit was $days_since_last days ago — artifact may be stale"
fi
echo
if [ "$fail" -eq 0 ]; then
echo "artifact verified (0 failures) — safe to trust"
else
echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/tikv/raft-rs"
exit 1
fi
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
⚡TL;DR
A pure Rust implementation of the Raft distributed consensus algorithm. It provides only the core consensus module, meant to be embedded in systems that need fault-tolerant state machine replication. It lets a cluster of nodes agree on a sequence of log entries even when some nodes fail or the network partitions. Unlike generic consensus libraries, it covers only the algorithmic core; users implement their own log storage, state machine, and transport layers. The repository is a workspace monorepo: the root src/ contains the core consensus module, proto/ (the raft-proto workspace member) holds the message definitions for the Protobuf and Prost codecs, harness/ provides integration testing interfaces, and datadriven/ contains a data-driven test framework. examples/five_mem_node/ and examples/single_mem_node/ show reference implementations. Benchmarks in benches/suites/ cover raft.rs, raw_node.rs, and progress tracking.
👥Who it's for
Systems engineers and distributed database developers (building systems in the vein of TiKV or etcd) who need Raft-based replication in a Rust application and want a minimal, composable consensus module rather than an all-in-one solution. Contributors typically come from the TiKV ecosystem and integrate this library into storage engines.
🌱Maturity & risk
Production-ready. The repo shows version 0.7.0 with Apache-2.0 licensing, a CI pipeline in .github/workflows/ci.yml, examples in the examples/ directory (single-node and five-node setups), and active maintenance by the TiKV project. It targets stable Rust 1.44.0+ with the MSRV clearly stated, and it supports both the protobuf and Prost codecs, which points to real-world integration needs.
Standard open source risks apply.
Active areas of work
The repo is actively maintained at version 0.7.0 with CI workflows (.github/workflows/ci.yml) running on commits. Dual codec support (protobuf-codec as default, prost-codec as alternative) indicates ongoing compatibility work. The CHANGELOG.md and PR templates (bug_fix.md, feature.md) show structured change management. No specific breaking PR details visible in file list, but the bors.toml presence suggests pull request automation.
🚀Get running
Clone the repository, ensure Rust 1.44+ is installed via rustup, then build with cargo build. Run examples via cargo run --example single_mem_node or cargo run --example five_mem_node. Run tests with cargo test. To use as a dependency, add raft = "0.7" to your Cargo.toml (defaults to protobuf-codec and default-logger features).
Daily commands:
Development: cargo test runs the test suite. Benchmarks: cargo bench (defined in benches/benches.rs with harness = false, using criterion). Examples: cargo run --example single_mem_node for a single-node demo, or cargo run --example five_mem_node for a cluster demo. The default-logger feature is enabled by default; disable it with --no-default-features.
🗺️Map of the codebase
- Cargo.toml: Defines dual codec support (protobuf-codec default vs prost-codec), optional failpoints, and default-logger features; critical for understanding build configuration
- examples/five_mem_node/main.rs: Reference implementation showing how to compose Raft consensus with custom log storage, state machine, and transport—essential for understanding integration pattern
- .github/workflows/ci.yml: CI pipeline reveals test strategy, supported Rust versions, and quality gates that must pass before merge
- proto/: Protobuf definitions for all Raft message types (AppendEntries, RequestVote, etc.); changing this requires regenerating codec bindings
- datadriven/src/lib.rs: Data-driven testing framework used for differential testing; understand this to write new Raft behavior tests without hand-coding RPC sequences
🛠️How to make changes
For consensus algorithm changes, edit src/lib.rs (entry point) and subdirectories under src/ (exact structure not shown but standard Raft components like raft.rs, raw_node.rs inferred from bench names). For message formats, modify protobuf files in proto/ workspace. For integration testing, use datadriven/ test framework (see datadriven/src/testdata/ for examples). For new RPC behavior, update raft-proto definitions and regenerate with the codec feature. Add benchmarks by extending benches/suites/raft.rs.
🪤Traps & gotchas
The library covers the consensus algorithm only: you must implement the Log (persistent or in-memory), State Machine (user logic), Transport (RPC), and Storage yourself, and a partial implementation can silently corrupt state. The choice between protobuf-codec and prost-codec is made at dependency time; switching requires recompiling all dependents. Failpoint injection (via the fail crate) is optional, but some tests may rely on it; pass --features failpoints to run the full test suite. The default logger (slog-based) needs no configuration but writes to stderr; use --no-default-features if you want to set up logging manually. No async runtime is provided; driving blocking consensus operations from multi-threaded code requires careful synchronization.
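To make the storage trap above concrete, here is a minimal sketch of the invariants an in-memory log has to keep: contiguous indices, term lookup, and a compaction boundary. Every name in it (`MemLog`, `Entry`, `append`, `compact`) is hypothetical and chosen for illustration; it is not the raft-rs `Storage` trait, whose real signatures should be taken from the crate documentation.

```rust
/// Hypothetical log entry for this sketch (raft-rs defines its own entry type).
#[derive(Clone, Debug, PartialEq)]
struct Entry {
    index: u64,
    term: u64,
    data: Vec<u8>,
}

/// Hypothetical in-memory log. `offset` is the index of the last compacted
/// entry, so the first stored entry always has index `offset + 1`.
#[derive(Default)]
struct MemLog {
    offset: u64,
    offset_term: u64,
    entries: Vec<Entry>,
}

impl MemLog {
    fn first_index(&self) -> u64 {
        self.offset + 1
    }

    fn last_index(&self) -> u64 {
        self.offset + self.entries.len() as u64
    }

    /// Term of the entry at `index`; `None` if it was compacted or never written.
    fn term(&self, index: u64) -> Option<u64> {
        if index == self.offset {
            return Some(self.offset_term);
        }
        if index < self.first_index() || index > self.last_index() {
            return None;
        }
        Some(self.entries[(index - self.first_index()) as usize].term)
    }

    /// Appends `new`, overwriting any existing suffix that starts at the same
    /// index. Gaps and non-contiguous input are rejected instead of accepted.
    /// Note: this sketch does not protect already-committed entries.
    fn append(&mut self, new: &[Entry]) -> Result<(), &'static str> {
        let first = match new.first() {
            Some(e) => e,
            None => return Ok(()),
        };
        if first.index < self.first_index() {
            return Err("entry already compacted");
        }
        if first.index > self.last_index() + 1 {
            return Err("gap before first new entry");
        }
        if new.windows(2).any(|w| w[1].index != w[0].index + 1) {
            return Err("new entries are not contiguous");
        }
        let keep = (first.index - self.first_index()) as usize;
        self.entries.truncate(keep);
        self.entries.extend_from_slice(new);
        Ok(())
    }

    /// Drops all entries up to and including `through`, remembering its term.
    fn compact(&mut self, through: u64) -> Result<(), &'static str> {
        if through <= self.offset {
            return Ok(());
        }
        if through > self.last_index() {
            return Err("cannot compact past the last entry");
        }
        let term = self.term(through).expect("index is in range");
        let dropped = (through - self.offset) as usize;
        self.entries.drain(..dropped);
        self.offset = through;
        self.offset_term = term;
        Ok(())
    }
}
```

The design choice worth noting is that `append` returns an error on a gap rather than storing the entries anyway; a store that accepts gaps is one way a partial implementation could end up with an inconsistent log.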
💡Concepts to learn
- State Machine Replication (SMR) — The core idea behind Raft: all nodes apply the same sequence of commands in order, ensuring identical state across the cluster; understanding this is essential to design correct State Machine and Log implementations
- Log Replication and Commitment — Raft ensures durability by replicating log entries to a majority before applying them to the state machine; you must correctly implement Log storage and track commitment indices to avoid data loss
- Leader Election via Randomized Timeouts — Raft randomizes election timeouts so that split votes are rare, and majority voting ensures at most one leader per term; the Transport layer must deliver heartbeats promptly, or followers time out and start unnecessary elections
- Snapshot and Compaction — Raft logs can grow unbounded; snapshots allow discarding old committed entries, but incorrect snapshot handling breaks log consistency—critical if you implement Log compaction
- RPC Semantics (Request IDs, Deduplication) — Raft requires careful handling of at-most-once vs exactly-once semantics for RPC retries in the Transport layer; at-most-once is standard but requires tracking request IDs across failures
- Failpoints and Chaos Testing — The optional failpoints feature injects artificial failures (message loss, node crashes) to test Raft correctness without real hardware; essential for validating custom Log/Storage implementations
- Data-Driven Testing — The repo uses datadriven/ to express Raft behavior as declarative test scripts rather than Rust code; useful for testing complex sequences without manual harness code
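The commitment rule from the concepts above (an entry counts as committed once a majority of voters store it) can be sketched in a few lines. This is a self-contained illustration under stated assumptions, not the raft-rs implementation: the function name `quorum_committed_index` and the `matched` slice are hypothetical.

```rust
/// Returns the highest log index stored on a majority of voters.
///
/// `matched` holds one value per voter (leader included): the highest index
/// known to be persisted on that voter. Returns 0 for an empty voter set.
///
/// Omitted here: the Raft paper additionally requires that a leader only
/// advances its commit index to an entry from its own current term.
fn quorum_committed_index(matched: &[u64]) -> u64 {
    if matched.is_empty() {
        return 0;
    }
    let mut sorted = matched.to_vec();
    // Sort descending so that position `quorum - 1` holds the largest index
    // that at least `quorum` voters have reached.
    sorted.sort_unstable_by(|a, b| b.cmp(a));
    let quorum = matched.len() / 2 + 1;
    sorted[quorum - 1]
}
```

For a three-node cluster with matched indices 5, 3, and 3, the quorum size is 2 and the second-largest value, 3, is the commit candidate.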
🔗Related repos
- tikv/tikv — The primary consumer of raft-rs; TiKV is the distributed key-value store that supplies the Log, State Machine, and Transport implementations around this consensus module
- etcd-io/etcd — Go-based distributed consensus system using a different Raft implementation; useful reference for understanding how production systems layer Raft with storage and APIs
- hashicorp/raft — Widely-used Go Raft implementation; useful for comparing design decisions, feature completeness, and API patterns between Rust and Go consensus libraries
- tokio-rs/tokio — Async Rust runtime frequently paired with raft-rs in real systems since this library provides no built-in async support; understand Tokio integration patterns
- tikv/raft-proto — Sibling workspace member providing Protobuf/Prost message definitions; changes to message format require coordinated updates across both crates
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add comprehensive integration tests for flow control mechanisms in harness/tests/integration_cases/
The repo has test_raft_flow_control.rs, but there appears to be no dedicated test suite validating edge cases in flow control (backpressure, message batching limits, follower progress tracking). Raft flow control is critical for preventing leader overload. Tests for scenarios like rapid follower failures, simulated network congestion, and progress reset would improve robustness and catch regressions early.
- [ ] Review existing harness/tests/integration_cases/test_raft_flow_control.rs to identify coverage gaps
- [ ] Add test cases for: leader message batching limits, follower progress reset during network partition, rapid follower failures
- [ ] Add failpoints-based tests in harness/tests/failpoints_cases/mod.rs for flow control under adversarial conditions
- [ ] Document test scenarios in comments referencing Raft paper flow control sections
- [ ] Ensure tests run in CI via .github/workflows/ci.yml
Add benchmark suite for snapshot transfer performance in benches/suites/
The benches/suites/ directory has progress.rs and raft.rs benchmarks, but no dedicated snapshot transfer benchmarks. Snapshot handling is a critical performance path in Raft (mentioned in test_raft_snap.rs tests). Adding benchmarks for snapshot encoding/decoding performance, snapshot application latency, and large snapshot handling would provide valuable performance baselines and catch regressions.
- [ ] Create benches/suites/snapshot.rs with benchmarks for snapshot creation, serialization, and application
- [ ] Add benchmarks for various snapshot sizes (1MB, 100MB, 1GB) to measure scalability
- [ ] Benchmark both protobuf-codec and prost-codec paths (features from Cargo.toml)
- [ ] Update benches/suites/mod.rs to include the new snapshot benchmark suite
- [ ] Add criterion-based measurements for leader snapshot send latency and follower snapshot apply latency
Add data-driven tests for Raft state machine transitions in harness/tests/
The repo includes a datadriven testing framework (datadriven/ workspace) with examples in datadriven/src/testdata/, but it's underutilized in the main test suite. Complex state machine behavior (leader election, log replication, membership changes) could be elegantly tested using data-driven test cases. This would improve test maintainability, readability, and make it easier to add test cases for edge cases described in the Raft paper.
- [ ] Create harness/tests/datadriven_cases/ directory structure for state machine tests
- [ ] Add test data files (e.g., harness/tests/datadriven_cases/leader_election.txt) covering scenarios: initial election, split votes, follower timeout, etc.
- [ ] Implement datadriven test harness in harness/tests/ that parses test data and executes Raft state transitions
- [ ] Add test cases for membership changes and log replication edge cases
- [ ] Reference datadriven/src/datadriven_test.rs as implementation template
🌿Good first issues
- Add integration tests for the prost-codec path: the repo defaults to protobuf-codec, but prost-codec is an alternative. Create a test matrix in CI or a new test suite under tests/ that verifies both codecs produce identical serialization and can deserialize each other's messages. This catches silent incompatibilities between codec implementations without requiring external codec expertise.
- Document the Log trait contract with concrete examples: create a new docs/implementing-log.md file showing how to implement the Log interface correctly, covering persistence guarantees, entry ordering, and snapshot handling. Reference it from examples. The library explicitly requires users to implement Log, but the trait contract is non-obvious; many integrators get this wrong and silently lose data.
- Add a DESIGN.md in the root explaining the four missing layers and provide a checklist for integrators: which parts of the consensus module interact with user code, what invariants must hold, and common pitfalls. Link from README. New users are confused about what raft-rs does and doesn't do; a brief design document prevents wasted integration effort.
⭐Top contributors
- @BusyJay — 23 commits
- @gengliqi — 7 commits
- @tisonkun — 6 commits
- @hicqu — 6 commits
- @jayzhan211 — 6 commits
📝Recent commits
- 53cf7a5 — fix: bump rand to 0.9.3 for default builds (#586) (tillrohrmann)
- aafb07c — Avoid cloning byte arrays in public methods (#574) (jkosh44)
- b02c962 — remove protobuf dependency with prost-codec feature (#579) (ggirol-rc)
- deb3ba3 — fix clippy warnings (#580) (ggirol-rc)
- 1fd05e0 — Add logs for dropping read index msg (#569) (gengliqi)
- 5c932ef — proto: try to fix CI (#565) (lance6716)
- 0d01b20 — reset max_apply_unpersisted_log_limit in become_follower (#561) (glorv)
- 2fbeee5 — raft: next index shall be larger than match index (#557) (wego1236)
- 63aec46 — feat: add disable_proposal_forwarding config params (#552) (datbeohbbh)
- dfe2239 — fix: fix clippy (#553) (datbeohbbh)
🔒Security observations
The raft-rs codebase demonstrates good security practices overall. It is a well-maintained consensus algorithm library with minimal dependencies and no obvious hardcoded secrets, SQL injection risks, or XSS vulnerabilities. The primary concerns are: (1) ensuring all dependencies, particularly the PRNG library, are kept up-to-date and audited, (2) preventing accidental enablement of testing features (failpoints) in production, and (3) maintaining clear security documentation. The codebase follows Rust best practices which provide memory safety guarantees. Regular dependency auditing via 'cargo audit' is recommended.
- Medium · Outdated Dependency: rand 0.9.3 — Cargo.toml - dependencies section. The rand crate version 0.9.3 is specified in Cargo.toml. While rand 0.9.x is relatively recent, it's recommended to verify this is the latest stable version and that no known vulnerabilities exist for this specific version. PRNG libraries are security-sensitive. Fix: Run 'cargo audit' to check for known vulnerabilities. Consider updating to the latest stable version of rand and review the changelog for any security-related fixes.
- Low · Potential Information Disclosure via Documentation — media/ directory and documentation. The repository includes comprehensive documentation and design diagrams (media/the-design-of-raft-rs.png) that could potentially be used by attackers to understand implementation details and identify attack vectors. This is a minor concern for a consensus algorithm library. Fix: Ensure sensitive architectural details or security assumptions are clearly documented in threat models. Consider adding security advisories section to README.
- Low · Failpoints Feature May Enable Unintended Code Paths — Cargo.toml - features section. The 'failpoints' feature (fail/failpoints dependency) is designed for testing and chaos engineering. If accidentally enabled in production builds, it could allow injection of failures or unexpected behavior. Fix: Ensure failpoints feature is only enabled in dev/test builds. Add CI checks to verify failpoints is not enabled in release builds. Document the feature clearly.
LLM-derived; treat as a starting point, not a security audit.
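One way to act on the failpoints observation is a compile-time guard in the application that depends on raft-rs. The sketch below is an assumption-laden illustration: it presumes a downstream crate that declares its own `failpoints` feature (the feature name and the function `failpoints_compiled_in` are hypothetical), and it relies on `debug_assertions` being off in release profiles, which is Cargo's default but can be overridden.

```rust
// Assumption: the enclosing crate declares a `failpoints` feature of its own
// that forwards to the fault-injection feature of its dependencies.

// Refuse to compile an optimized build that has fault injection switched on.
// A release profile that sets `debug-assertions = true` bypasses this guard.
#[cfg(all(feature = "failpoints", not(debug_assertions)))]
compile_error!("the `failpoints` feature must not be enabled in release builds");

/// Reports whether fault injection was compiled in, for example so that a
/// service can log a loud warning at startup.
pub fn failpoints_compiled_in() -> bool {
    cfg!(feature = "failpoints")
}
```

A guard like this complements, rather than replaces, a CI check on the feature set used for release artifacts.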
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.