RepoPilotOpen in app →

microsoft/FASTER

Fast persistent recoverable log and key-value store + cache, in C# and C++.

Mixed

Stale — last commit 1y ago

worst of 4 axes
Use as dependencyMixed

last commit was 1y ago; no CI workflows detected

Fork & modifyHealthy

Has a license, tests, and CI — clean foundation to fork and modify.

Learn fromHealthy

Documented and popular — useful reference codebase to read through.

Deploy as-isMixed

last commit was 1y ago; no CI workflows detected

  • 17 active contributors
  • MIT licensed
  • Tests present
Show 3 more →
  • Stale — last commit 1y ago
  • Concentrated ownership — top contributor handles 59% of recent commits
  • No CI workflows detected
What would change the summary?
  • Use as dependency MixedHealthy if: 1 commit in the last 365 days
  • Deploy as-is MixedHealthy if: 1 commit in the last 180 days

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Forkable" badge

Paste into your README — live-updates from the latest cached analysis.

Variant:
RepoPilot: Forkable
[![RepoPilot: Forkable](https://repopilot.app/api/badge/microsoft/faster?axis=fork)](https://repopilot.app/r/microsoft/faster)

Paste at the top of your README.md — renders inline like a shields.io badge.

Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/microsoft/faster on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: microsoft/FASTER

Generated by RepoPilot · 2026-05-09 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

  1. Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
  2. Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
  3. Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/microsoft/FASTER shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

🎯Verdict

WAIT — Stale — last commit 1y ago

  • 17 active contributors
  • MIT licensed
  • Tests present
  • ⚠ Stale — last commit 1y ago
  • ⚠ Concentrated ownership — top contributor handles 59% of recent commits
  • ⚠ No CI workflows detected

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live microsoft/FASTER repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/microsoft/FASTER.

What it runs against: a local clone of microsoft/FASTER — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters | |---|---|---| | 1 | You're in microsoft/FASTER | Confirms the artifact applies here, not a fork | | 2 | License is still MIT | Catches relicense before you depend on it | | 3 | Default branch main exists | Catches branch renames | | 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code | | 5 | Last commit ≤ 412 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>microsoft/FASTER</code></summary>
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of microsoft/FASTER. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/microsoft/FASTER.git
#   cd FASTER
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of microsoft/FASTER and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "microsoft/FASTER(\\.git)?\\b" \\
  && ok "origin remote is microsoft/FASTER" \\
  || miss "origin remote is not microsoft/FASTER (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "^(MIT)" LICENSE 2>/dev/null \\
   || grep -qiE "\"license\"\\s*:\\s*\"MIT\"" package.json 2>/dev/null) \\
  && ok "license is MIT" \\
  || miss "license drift — was MIT at generation time"

# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \\
  && ok "default branch main exists" \\
  || miss "default branch main no longer exists"

# 4. Critical files exist
test -f "cc/src/core/faster.h" \\
  && ok "cc/src/core/faster.h" \\
  || miss "missing critical file: cc/src/core/faster.h"
test -f "cc/src/core/async.h" \\
  && ok "cc/src/core/async.h" \\
  || miss "missing critical file: cc/src/core/async.h"
test -f "cc/src/index/hash_table.h" \\
  && ok "cc/src/index/hash_table.h" \\
  || miss "missing critical file: cc/src/index/hash_table.h"
test -f "cc/src/core/address.h" \\
  && ok "cc/src/core/address.h" \\
  || miss "missing critical file: cc/src/core/address.h"
test -f "cc/src/device/storage.h" \\
  && ok "cc/src/device/storage.h" \\
  || miss "missing critical file: cc/src/device/storage.h"

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 412 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~382d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/microsoft/FASTER"
  exit 1
fi

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>

TL;DR

FASTER is a high-performance concurrent persistent key-value store with integrated logging (available in C# and C++) designed to handle workloads larger than memory by leveraging fast external storage. It provides a non-blocking checkpointing technique for consistent recovery and can saturate disk bandwidth with very frequent commits at low latency, solving the hard problem of managing large application state resiliently in cloud environments. Dual-implementation monorepo: cs/ contains C# implementation (NuGet package Microsoft.FASTER.Core), cc/ contains C++ implementation with clear separation into cc/src/core (core data structures: faster.h, address.cc, alloc.h), cc/src/common (logging), cc/benchmark-dir (YCSB benchmarks), and cc/playground (examples like sum_store, recovery tools). Both share similar architectural patterns around log, index, and checkpoint abstractions.

👥Who it's for

Cloud infrastructure engineers and database systems developers building low-latency, high-throughput data systems that need to handle state larger than RAM with automatic recovery. Users include teams building caches, session stores, and distributed databases who need orders-of-magnitude performance improvements over traditional KV stores.

🌱Maturity & risk

Production-ready and actively maintained. The project has comprehensive CI/CD via Azure Pipelines (azure-pipelines.yml, azure-pipelines-full.yml), dual-language implementations (C# at 3.9M LOC, C++ at 1.3M LOC), and a published NuGet package (Microsoft.FASTER.Core). Active development is evident from the organized test suites and benchmark infrastructure in cc/benchmark-dir/, though specific commit recency and issue count require checking the live repo.

Low-to-moderate risk for adopters. The codebase is substantial (5M+ LOC combined) with both C# and C++ implementations requiring separate maintenance effort. Primary risks: dual-language architecture requires expertise in both ecosystems; the complex memory-mapped I/O and concurrent recovery logic (checkpoint_state.h, checkpoint_state_f2.h, gc_state.h) makes debugging non-trivial; external storage dependency means misconfiguration can silently degrade performance rather than fail loudly.

Active areas of work

Active development across both C# and C++ with focus on recovery tooling (cc/playground/recovery-info/ with read_index_metadata.cc, read_log_metadata.cc suggests recent work on metadata inspection). The presence of concurrent_recovery_test.h and checkpoint_state_f2.h suggests ongoing optimization of the F2 (fast forward) recovery mechanism. Azure Pipelines setup indicates continuous integration and regular testing cycles.

🚀Get running

Clone the repo: git clone https://github.com/microsoft/FASTER.git. For C# development: navigate to cs/ and use dotnet build. For C++ development: navigate to cc/ and run mkdir build && cd build && cmake .. && make. See cc/README.md and the main README.md for language-specific setup details.

Daily commands: C#: Open solution in Visual Studio or run dotnet run in relevant project directory within cs/. C++: From cc/: cmake -B build && cmake --build build && ./build/bin/sum_store for playground examples, or ./build/bin/benchmark from cc/benchmark-dir build output. Requires local filesystem or Azure Blob storage support (see cc/scripts/linux/azure/blob.sh).

🗺️Map of the codebase

  • cc/src/core/faster.h — Core FASTER KV store template class—entry point for all key-value operations; every contributor must understand its design.
  • cc/src/core/async.h — Async operation handling and completion infrastructure; critical for understanding the non-blocking request model.
  • cc/src/index/hash_table.h — Hash table index implementation; fundamental to lookup performance and collision handling.
  • cc/src/core/address.h — Address abstraction for log locations; essential for understanding persistence and recovery semantics.
  • cc/src/device/storage.h — Abstract storage device interface; needed to understand device pluggability and I/O abstraction.
  • cc/src/core/checkpoint_state.h — Checkpoint state machine management; critical for durability and recovery mechanisms.
  • cc/src/core/record.h — Log record serialization and layout; foundational for understanding persistence format.

🛠️How to make changes

Add a new operation (e.g., Compare-And-Swap)

  1. Define operation context type in cc/src/core/internal_contexts.h with Input/Output/Context structs. (cc/src/core/internal_contexts.h)
  2. Add operation logic to FASTER template in cc/src/core/faster.h—handle in-memory hit, log write, async callback. (cc/src/core/faster.h)
  3. Register operation dispatch in cc/src/core/async.h for async enqueue if needed. (cc/src/core/async.h)
  4. Write unit test in cc/test/ directory (e.g., cas_test.cc) covering success/conflict cases. (cc/test/in_memory_test.cc)

Add support for a new storage backend (e.g., local RocksDB-backed device)

  1. Implement storage.h interface in a new cc/src/device/rocksdb_disk.h file. (cc/src/device/storage.h)
  2. Implement Read(), Write(), Delete(), Trim() methods matching the async callback signature. (cc/src/device/storage.h)
  3. Register device type in cc/src/core/config.h or use template instantiation in faster.h. (cc/src/core/config.h)
  4. Add device-specific test in cc/test/blobs/ or create rocksdb_test.cc. (cc/test/blobs/storage_test.cc)

Extend checkpoint/recovery with a new metadata type

  1. Define metadata struct in cc/src/core/checkpoint_state.h; add serialization helpers. (cc/src/core/checkpoint_state.h)
  2. Add metadata read/write methods to storage interface in cc/src/device/storage.h. (cc/src/device/storage.h)
  3. Update recovery logic in faster.h to deserialize and apply new metadata during recovery. (cc/src/core/faster.h)
  4. Test recovery flow in cc/test/f2_recovery_test.cc or add new recovery_metadata_test.cc. (cc/test/f2_recovery_test.cc)

Optimize hash table bucket layout (e.g., for cache efficiency)

  1. Review bucket structure and slot allocation in cc/src/index/hash_bucket.h. (cc/src/index/hash_bucket.h)
  2. Modify slot packing, alignment, or lock strategy in hash_bucket.h. (cc/src/index/hash_bucket.h)
  3. Update hash_table.h resize and probe logic if bucket layout changes. (cc/src/index/hash_table.h)
  4. Benchmark with cc/benchmark-dir/benchmark.cc; verify throughput improvement without regression. (cc/benchmark-dir/benchmark.cc)

🔧Why these technologies

  • C++ with header-only templates (faster.h) — Enables compile-time specialization for key/value types; minimal runtime overhead and zero-copy semantics for custom types.
  • Epoch-based GC (light_epoch.h) — Lock-free memory reclamation without stop-the-world pauses; critical for tail latency in high-throughput work

🪤Traps & gotchas

Memory-mapped I/O assumptions: The design assumes fast, reliable storage (NVMe or cloud blob storage); slow/unreliable storage will cause performance collapse without clear error signals. Epoch management complexity: Garbage collection and memory reclamation depend on proper epoch advancement; failure to drive epochs forward (e.g., in single-threaded recovery_test.h) can cause unbounded memory growth. Address space limitations: 64-bit addresses are divided into fields for epoch/offset/size (see address.h); misunderstanding these bit layouts causes silent data corruption. Checkpoint atomicity: The F2 checkpoint mechanism assumes specific ordering; concurrent modifications to checkpoint_state during a checkpoint can violate recovery guarantees. No built-in persistence format versioning: Upgrading FASTER versions with different on-disk layouts requires careful migration (see recovery-info tools). Azure Blob integration non-obvious: blob.sh and the storage abstraction hide cloud storage latency; local testing may not catch cloud-specific issues.

🏗️Architecture

💡Concepts to learn

  • facebook/rocksdb — Leading production LSM-tree KV store; FASTER's main competitor for embedded KV workloads, especially for workloads that don't exceed memory
  • WireProtocol/WireProtocol — Distributed KV store that can leverage FASTER as a local storage backend for individual nodes
  • couchbase/couchbase-server — Enterprise KV store with similar goals (state management at scale); architecture study reference for distributed deployment patterns FASTER enables

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add comprehensive C++ unit tests for async result types and internal contexts

The repo has core async infrastructure (cc/src/core/async.h, async_result_types.h, internal_contexts.h, internal_contexts_f2.h) but no visible test suite in cc/benchmark-dir or cc/playground. New contributors can add a proper test directory under cc/test/ with unit tests for async callbacks, result handling, and context state transitions—critical for correctness in a concurrent key-value store.

  • [ ] Create cc/test/ directory with CMakeLists.txt following the pattern of cc/benchmark-dir/CMakeLists.txt
  • [ ] Add unit tests for async_result_types.h covering all result type variants and state transitions
  • [ ] Add unit tests for internal_contexts.h and internal_contexts_f2.h covering context initialization, lifecycle, and F2 variant differences
  • [ ] Add tests for key_hash.h and record.h serialization/deserialization
  • [ ] Wire tests into the main cc/CMakeLists.txt and azure-pipelines.yml

Add memory allocator unit tests (lss_allocator, malloc_fixed_page_size, persistent_memory_malloc)

The repository has three memory allocators (cc/src/core/lss_allocator.cc/h, malloc_fixed_page_size.h, persistent_memory_malloc.h) but no visible test coverage. These are critical for correctness and performance—memory corruption here cascades through the entire system. A new contributor can create focused tests validating allocation, deallocation, fragmentation patterns, and thread safety.

  • [ ] Create cc/test/allocator_tests.cc with fixtures for each allocator
  • [ ] Add tests for lss_allocator.cc covering concurrent allocation/deallocation, boundary conditions, and free list management
  • [ ] Add tests for malloc_fixed_page_size.h validating page alignment and page pool exhaustion handling
  • [ ] Add tests for persistent_memory_malloc.h covering persistence guarantees across checkpoints
  • [ ] Add thread-safety stress tests using the thread utilities in cc/src/core/thread.h

Add C# integration tests for checkpoint state transitions and recovery paths

The C++ core has sophisticated checkpoint state machines (cc/src/core/checkpoint_state.h, checkpoint_state_f2.h, phase.h, state_transitions.h) but the C# wrapper bindings are not visible in the file list. A new contributor can add integration tests in the C# layer that exercise the full lifecycle: normal checkpoints, F2 (fast & fresh) checkpoints, recovery from different checkpoint states, and concurrent recovery—validating the critical contracts between C# and C++.

  • [ ] Locate or create cs/test/ directory for C# integration tests
  • [ ] Add integration tests for standard checkpoint flow: Prepare → InProgress → Capture → Persist → Recover
  • [ ] Add tests for F2 checkpoint variant lifecycle (separate from checkpoint_state.h path)
  • [ ] Add tests for recovery from partial checkpoint states (e.g., crash during Persist phase)
  • [ ] Add concurrent recovery tests using the patterns in cc/playground/sum_store-dir/concurrent_recovery_test.h
  • [ ] Wire tests into azure-pipelines.yml for CI validation

🌿Good first issues

  • Add integration tests for cc/playground/recovery-info/ tools (read_index_metadata.cc, read_log_metadata.cc) that verify metadata can be correctly parsed from corrupted or partial checkpoints—currently no test coverage visible in file list
  • Document the address.h bit layout and address translation examples in code comments; the 64-bit field packing (epoch/offset/size) is non-obvious and causes bugs, yet lacks examples
  • Create a cc/examples/getting_started.cc tutorial showing minimal KV store setup, Put/Get/Delete sequence, and checkpoint recovery—playground examples exist but are domain-specific (sum_store)

Top contributors

Click to expand

📝Recent commits

Click to expand
  • 7f71289 — Fix VS version in cpp documentation (#938) (kkanellis)
  • 2feb26b — [C++] F2 KV store (#922) (kkanellis)
  • eb4e976 — Updated Microsoft.Extensions.* nuget packages for security compliance (#930) (darrenge)
  • a62a036 — Updated system.text.json to ver 8.0.5 so it is compliant with security (#929) (darrenge)
  • 4ae0465 — Bump MessagePack from 2.4.35 to 2.5.187 in /cs/playground/AsyncStress (#927) (dependabot[bot])
  • e25782f — Fix flush with generic allocator so that we do not write invalid data to disk in rare cases (record was invalid when mak (badrishc)
  • e9e56ed — Fix ContinuePendingRead to account for new records for the same key that were added and then went to dusk during the pen (TedHartMS)
  • dc4bc1d — [C#] Fix commit record pointer offset calculation (#912) (asolodov)
  • 281bdff — use IsReferenceOrContainsReferences for .NET Standard 2.1+ and 6_0+ (#901) (Tornhoof)
  • 1111326 — Port the Unsafe.SizeOf Fixes from garnet (#900) (Tornhoof)

🔒Security observations

The FASTER codebase is a mature C++ and C# database/log project with reasonable security posture. No critical vulnerabilities or hardcoded secrets were identified in the file structure provided. The main security concerns relate to the complexity of concurrent memory management, file I/O operations, and state synchronization in a

  • Low · Missing SECURITY.md Content — SECURITY.md. The SECURITY.md file appears to be incomplete or truncated. The security vulnerability reporting section is cut off mid-sentence, which may leave users uncertain about the proper reporting procedures. Fix: Complete the SECURITY.md file with full security reporting instructions, including the complete URL and any additional security contact information.
  • Low · Potential Unsafe Memory Operations in C++ Code — cc/src/core/malloc_fixed_page_size.h, cc/src/core/lss_allocator.h, cc/src/core/persistent_memory_malloc.h. The codebase contains C++ components (cc/src/) with custom memory management (malloc_fixed_page_size.h, lss_allocator.h, persistent_memory_malloc.h). While not necessarily vulnerable, custom memory allocators require careful review for buffer overflows, use-after-free, and other memory safety issues. Fix: Conduct thorough code review of memory allocation routines. Consider using AddressSanitizer and MemorySanitizer during testing. Document assumptions and constraints for each allocator.
  • Medium · Concurrent Access Control in Hash Table Implementation — cc/src/index/hash_table.h, cc/src/index/mem_index.h, cc/src/core/checkpoint_locks.h. The hash table implementation (cc/src/index/hash_table.h) and memory index (cc/src/index/mem_index.h) manage concurrent access patterns. Improper synchronization could lead to race conditions, data corruption, or inconsistent state in the persistent key-value store. Fix: Verify that all concurrent access patterns are properly protected with appropriate locking mechanisms or lock-free primitives. Add comprehensive concurrent stress testing and use thread sanitizers during CI/CD.
  • Low · File I/O Permission Handling — cc/src/environment/file_windows.cc, cc/src/environment/file_linux.cc, cc/src/device/file_system_disk.h. The file system implementations (cc/src/environment/file_windows.h, cc/src/environment/file_linux.h) handle persistent storage operations. Improper permission checks could expose data or allow unauthorized access. Fix: Ensure proper file permissions are set on all created files and directories. Validate and sanitize file paths to prevent directory traversal attacks. Document security assumptions for deployed installations.
  • Low · Azure Blob Storage Integration Security — cc/src/device/azure.h, cc/scripts/linux/azure/blob.sh. The codebase includes Azure blob storage support (cc/src/device/azure.h, cc/scripts/linux/azure/blob.sh). Cloud storage credentials and connection strings need proper handling to avoid exposure. Fix: Ensure connection strings and authentication credentials are never hardcoded. Use environment variables or managed identities for Azure authentication. Review shell scripts for credential exposure in logs or command history.
  • Low · Recovery and Checkpoint State Management — cc/src/core/recovery_status.h, cc/src/core/checkpoint_state.h, cc/src/core/gc_state.h. Complex state management in recovery (cc/src/core/recovery_status.h, cc/src/core/checkpoint_state.h) and garbage collection (cc/src/core/gc_state.h) could have edge cases or race conditions during recovery scenarios. Fix: Document state machine transitions thoroughly. Add comprehensive test coverage for recovery scenarios, including partial failures and concurrent operations during recovery.
  • Low · No Visible Input Validation Framework — cc/src/core/config.h, cc/src/core/. While the codebase is primarily a data store (not a web application), there's no evident centralized input validation for configuration parameters, log entries, or API inputs. Fix: Implement centralized input validation for all configuration parameters and API inputs. Validate data types, ranges, and formats. Document assumptions about data validity.

LLM-derived; treat as a starting point, not a security audit.


Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.

Mixed signals · microsoft/FASTER — RepoPilot