BurntSushi/ripgrep
ripgrep recursively searches directories for a regex pattern while respecting your gitignore
Single-maintainer risk — review before adopting
- ✓ Last commit 2mo ago
- ✓ 5 active contributors
- ✓ Unlicense licensed
- ✓ CI configured
- ✓ Tests present
- ⚠ Small team: 5 top contributors
- ⚠ Single-maintainer risk: top contributor 94% of commits
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Onboarding: BurntSushi/ripgrep
Generated by RepoPilot · 2026-05-05 · Source
Verdict
WAIT — Single-maintainer risk — review before adopting
TL;DR
ripgrep (rg) is a line-oriented search tool written in Rust that recursively searches directory trees for regex patterns, using the regex crate for pattern matching and the ignore crate to automatically respect .gitignore, .ignore, and similar exclude rules. It is typically faster than grep, ag, and git-grep on real-world workloads thanks to SIMD-accelerated literal search, memory-mapped I/O, and parallel directory traversal. It also supports optional PCRE2 patterns (via the pcre2 feature flag) for advanced regex features like look-around and backreferences.
The repository is a Cargo workspace: the binary entry point is crates/core/main.rs, and domain logic is split into internal workspace crates under crates/, notably crates/ignore (gitignore rules and directory walking), crates/grep (facade over the grep-* crates), crates/regex and crates/pcre2 (regex backends), crates/searcher (file search engine), and crates/printer (output formatting). The top-level Cargo.toml wires these together and exposes the rg binary.
Who it's for
Developers, sysadmins, and power users who routinely search large codebases or filesystem trees and want grep-like functionality with automatic gitignore-awareness, Unicode correctness, and substantially better performance. It is also a reference implementation for contributors interested in high-performance Rust I/O, regex engines, and parallel file traversal.
Maturity & risk
ripgrep is extremely mature and production-ready: it is at v15.1.0, has been in active development since 2016, has a comprehensive CI pipeline at .github/workflows/ci.yml and a formal release workflow at .github/workflows/release.yml. Integration tests live under tests/tests.rs, and the CHANGELOG.md documents a long, stable release history. It is one of the most widely downloaded CLI tools in the Rust ecosystem.
Risk is very low for end users but moderate for contributors: the project is maintained primarily by a single person (BurntSushi / Andrew Gallant), creating a bus-factor risk for future direction and security patches. Core dependencies (regex, ignore, grep, searcher, printer) are all internal crates also maintained by BurntSushi, so upstream fixes and API changes are tightly coupled. The minimum Rust version is 1.85 (edition 2024), which may require recent toolchain upgrades in enterprise environments.
Active areas of work
Based on visible repo data, the project is at v15.1.0 with an active release pipeline (release.yml) and funding configuration (.github/FUNDING.yml). The edition was recently bumped to Rust 2024 (edition = "2024") with a rust-version floor of 1.85, indicating recent toolchain modernization. Dedicated profile.release-lto and profile.deb build profiles exist, suggesting ongoing packaging work for distribution.
Get running
git clone https://github.com/BurntSushi/ripgrep.git
cd ripgrep
Build and run in debug mode:
cargo build
./target/debug/rg --version
Or install locally:
cargo install --path . --locked
rg 'fn main' crates/
Run integration tests:
cargo test --test integration
Daily commands:
Development build:
cargo run -- 'your_pattern' path/to/search
Release build (with LTO):
cargo build --profile release-lto
./target/release-lto/rg 'your_pattern' .
With PCRE2 support:
cargo build --release --features pcre2
Map of the codebase
- crates/core/main.rs: The binary entry point that wires together flag parsing, search configuration, and execution; the root of all program flow.
- crates/core/flags/defs.rs: Defines every CLI flag ripgrep supports; adding, removing, or changing any user-facing option starts here.
- crates/core/flags/hiargs.rs: Converts parsed low-level flag values into the high-level argument struct consumed by the search engine; the central configuration object.
- crates/core/search.rs: Implements the core search execution loop that reads files, runs the regex engine, and emits matches; the performance-critical hot path.
- crates/core/haystack.rs: Abstracts over the set of files/paths to search, integrating gitignore filtering and directory walking into a unified source of haystacks.
- crates/core/flags/parse.rs: Parses raw CLI arguments (including config file arguments) into structured low-level args; any new flag must be handled here.
- crates/core/flags/lowargs.rs: Defines the low-level intermediate representation of parsed flags before they are promoted to HiArgs; bridges parsing and configuration.
How to make changes
Add a new CLI flag
- Define the flag struct implementing the Flag trait (name, aliases, short, docs, update method that writes to LowArgs). (crates/core/flags/defs.rs)
- Add the corresponding field(s) to LowArgs and handle any default value. (crates/core/flags/lowargs.rs)
- In HiArgs::from_low_args (or its helpers), read the new LowArgs field and populate the HiArgs field with validated config. (crates/core/flags/hiargs.rs)
- Register the new flag in the FLAGS slice so the parser discovers it. (crates/core/flags/defs.rs)
- Consume the new HiArgs field in the search or haystack layer to alter behavior. (crates/core/search.rs)
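The two-phase flow in the steps above can be sketched in a few lines. This is a simplified illustration, not ripgrep's actual code: the `Flag` trait here has only two methods, the `--max-width` flag and its default of 80 are hypothetical, and only the `--name=value` form is parsed.

```rust
/// Low-level arguments: raw values as the user supplied them.
#[derive(Debug, Default)]
struct LowArgs {
    max_width: Option<u64>, // field for the hypothetical --max-width flag
}

/// High-level arguments: validated configuration for the search layer.
#[derive(Debug, PartialEq)]
struct HiArgs {
    max_width: usize,
}

/// Simplified stand-in for a flag trait (assumed shape, not the real one).
trait Flag {
    fn name_long(&self) -> &'static str;
    fn update(&self, value: &str, args: &mut LowArgs) -> Result<(), String>;
}

struct MaxWidth;

impl Flag for MaxWidth {
    fn name_long(&self) -> &'static str {
        "max-width"
    }

    fn update(&self, value: &str, args: &mut LowArgs) -> Result<(), String> {
        let n = value.parse::<u64>().map_err(|e| format!("--max-width: {e}"))?;
        args.max_width = Some(n);
        Ok(())
    }
}

/// Registry the parser scans to discover flags.
const FLAGS: &[&dyn Flag] = &[&MaxWidth];

/// Phase 1: turn raw arguments into LowArgs without cross-flag validation.
fn parse(argv: &[&str]) -> Result<LowArgs, String> {
    let mut low = LowArgs::default();
    for arg in argv {
        let (name, value) = arg
            .strip_prefix("--")
            .and_then(|rest| rest.split_once('='))
            .ok_or_else(|| format!("unsupported argument: {arg}"))?;
        let flag = FLAGS
            .iter()
            .find(|f| f.name_long() == name)
            .ok_or_else(|| format!("unknown flag: --{name}"))?;
        flag.update(value, &mut low)?;
    }
    Ok(low)
}

impl HiArgs {
    /// Phase 2: apply defaults and validate once all flags are known.
    fn from_low_args(low: LowArgs) -> Result<HiArgs, String> {
        let max_width = low.max_width.unwrap_or(80);
        if max_width == 0 {
            return Err("--max-width must be greater than zero".to_string());
        }
        Ok(HiArgs { max_width: max_width as usize })
    }
}
```

Keeping validation in `from_low_args` means `update` only has to record what was typed, which is why flags can appear in any order.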
Add support for a new compressed file format
- Add a new decompressor entry mapping the file extension to the external binary command. (crates/cli/src/decompress.rs)
- Ensure the haystack layer passes compressed files through rather than skipping them based on binary detection. (crates/core/haystack.rs)
- Wire the decompressor into the search path so stdout from the spawned process is searched. (crates/core/search.rs)
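As a rough illustration of the external-decompressor approach, the sketch below maps a file extension to a command and reads the child's stdout. It uses only the standard library; the function names and the extension table are hypothetical and do not mirror the API in crates/cli/src/decompress.rs.

```rust
use std::io::Read;
use std::path::Path;
use std::process::{Command, Stdio};

/// Hypothetical extension-to-command table; entries are illustrative.
fn decompress_command(path: &Path) -> Option<(&'static str, &'static [&'static str])> {
    match path.extension()?.to_str()? {
        "gz" => Some(("gzip", &["-d", "-c"][..])),
        "xz" => Some(("xz", &["-d", "-c"][..])),
        "zst" => Some(("zstd", &["-q", "-d", "-c"][..])),
        _ => None,
    }
}

/// Spawns the decompressor and returns everything it writes to stdout.
/// Returns Ok(None) when the extension is not in the table.
fn read_decompressed(path: &Path) -> std::io::Result<Option<Vec<u8>>> {
    let Some((program, args)) = decompress_command(path) else {
        return Ok(None);
    };
    let mut child = Command::new(program)
        .args(args)
        .arg(path)
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .spawn()?; // fails here if the external binary is not installed
    let mut out = Vec::new();
    // The child's stdout is the decompressed haystack to search.
    if let Some(mut stdout) = child.stdout.take() {
        stdout.read_to_end(&mut out)?;
    }
    child.wait()?;
    Ok(Some(out))
}
```

A production version would stream the output into the searcher instead of buffering it, and would surface the child's stderr and exit status.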
Add a new shell completion backend
- Create a new module (e.g. elvish.rs) implementing a generate function that iterates the FLAGS slice and emits completion syntax. (crates/core/flags/complete/mod.rs)
- Add the corresponding flag value variant and match arm to route --generate=elvish to the new generator. (crates/core/flags/defs.rs)
- Call the new generator from the generate dispatch block in main so it prints and exits. (crates/core/main.rs)
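A generator of this kind is mostly a loop over flag metadata. The sketch below assumes a hypothetical `FlagInfo` struct and emits fish-style `complete` lines only because that syntax is compact; a new backend would emit its own shell's syntax, and ripgrep's real flag metadata is richer than this.

```rust
/// Minimal, hypothetical description of a flag for completion purposes.
struct FlagInfo {
    long: &'static str,
    short: Option<char>,
    doc: &'static str,
}

/// Stand-in for the flag registry; the two entries are examples only.
const FLAGS: &[FlagInfo] = &[
    FlagInfo {
        long: "after-context",
        short: Some('A'),
        doc: "Show NUM lines after each match.",
    },
    FlagInfo {
        long: "hidden",
        short: None,
        doc: "Search hidden files and directories.",
    },
];

/// Emits one `complete` line per flag.
fn generate(flags: &[FlagInfo]) -> String {
    let mut out = String::new();
    for flag in flags {
        out.push_str("complete -c rg");
        if let Some(short) = flag.short {
            out.push_str(&format!(" -s {short}"));
        }
        // Escape single quotes so the description stays one shell word.
        let doc = flag.doc.replace('\'', "\\'");
        out.push_str(&format!(" -l {} -d '{}'\n", flag.long, doc));
    }
    out
}
```

Because the generator reads everything from the registry, adding a flag later requires no change to the completion code.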
Add a new output/printer format
- Add a new variant to the output format enum and its flag parsing logic. (crates/core/flags/defs.rs)
- Expose the new format variant through HiArgs so the search layer can select it. (crates/core/flags/hiargs.rs)
- Instantiate a new printer type (using grep-printer primitives) for the new format and dispatch to it in the search loop. (crates/core/search.rs)
- Use the shared writer abstraction so color and buffering work consistently with other formats. (crates/cli/src/wtr.rs)
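The shape of these steps (new enum variant, parsing, dispatch over a shared writer) can be sketched as follows. Everything here is hypothetical: the `Csv` variant, the `Match` struct, and the function names are illustrative and are not taken from grep-printer.

```rust
use std::io::{self, Write};

#[derive(Clone, Copy, Debug, PartialEq)]
enum OutputFormat {
    Standard,
    Csv, // hypothetical new format
}

/// A single match as the printer sees it (simplified).
struct Match<'a> {
    path: &'a str,
    line_number: u64,
    line: &'a str,
}

/// Flag-parsing side: map the user's value to a variant.
fn parse_format(value: &str) -> Result<OutputFormat, String> {
    match value {
        "standard" => Ok(OutputFormat::Standard),
        "csv" => Ok(OutputFormat::Csv),
        other => Err(format!("unrecognized output format: {other}")),
    }
}

/// Search-loop side: dispatch on the format, writing to any `Write`
/// so buffering and color handling stay in the writer layer.
fn print_match<W: Write>(wtr: &mut W, format: OutputFormat, m: &Match<'_>) -> io::Result<()> {
    match format {
        OutputFormat::Standard => writeln!(wtr, "{}:{}:{}", m.path, m.line_number, m.line),
        OutputFormat::Csv => {
            // Double embedded quotes; the path is left unquoted for brevity.
            let quoted = m.line.replace('"', "\"\"");
            writeln!(wtr, "{},{},\"{}\"", m.path, m.line_number, quoted)
        }
    }
}
```

Writing against a generic `Write` is what lets a test capture output in a `Vec<u8>` while the binary uses a buffered, color-aware stdout.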
Why these technologies
- Rust — Zero-cost abstractions and memory safety without GC pauses are essential for a search tool that must process gigabytes of text with minimal latency and no crashes.
- grep-searcher / grep-regex crates (BurntSushi) — Provides a high-performance, encoding-aware search engine with SIMD literal acceleration and Unicode support, decoupled from the CLI so it can be used as a library.
- ignore crate (BurntSushi) — Implements .gitignore / .ignore / .rgignore semantics with parallel directory walking, which is the primary differentiator over plain GNU grep.
- regex crate (BurntSushi) — Guarantees linear-time matching (no catastrophic backtracking) and supports Unicode, making it safe to expose to untrusted patterns from users.
- termcolor crate — Portable ANSI and Windows Console color output without requiring a TTY-detection hack in ripgrep itself.
Trade-offs already made
- Separate LowArgs / HiArgs two-phase flag processing
  - Why: Allows flags to be parsed in any order without cross-flag dependencies at parse time, with validation and combination deferred to HiArgs construction.
  - Consequence: Extra boilerplate per flag (three files to touch) but much cleaner validation logic and testability.
- External decompressor processes (decompress.rs) instead of native Rust decompression libraries
  - Why: Avoids large compile-time dependencies and lets users benefit from system-optimised tools (pigz, pbzip2) without ripgrep knowing about them.
  - Consequence: Decompression requires the external binary to be installed; errors are less structured and startup latency per compressed file increases.
- Respect .gitignore by default
  - Why: The dominant use-case is searching inside a project repository, where respecting ignore rules avoids noise from vendor/build artefacts.
  - Consequence: New users are sometimes surprised that files are silently skipped; requires
Traps & gotchas
- The pcre2 feature requires a system libpcre2 (or a bundled build via the pcre2 Rust crate). It is not compiled by default, and cargo build --features pcre2 will fail without the native library or build dependencies.
- On 64-bit musl targets, jemalloc is automatically linked via tikv-jemallocator; this is silent but affects memory profiling.
- The edition = "2024" and rust-version = "1.85" settings mean older stable Rust toolchains (pre-Feb 2025) will refuse to compile.
- Integration tests in tests/tests.rs spawn the actual rg binary, so cargo test --test integration depends on the rg binary compiling successfully; Cargo builds it as part of the test run.
Architecture
Concepts to learn
- Gitignore rule parsing: The crates/ignore crate implements the full gitignore glob specification including negation, directory-scoped rules, and precedence; understanding this is essential to knowing which files rg skips.
- Aho-Corasick multi-pattern search: ripgrep uses Aho-Corasick as a prefilter to quickly locate candidate lines containing literal substrings before applying the full regex, which is a major source of its speed advantage.
- Memory-mapped I/O (mmap): The searcher crate can optionally use mmap to read files, avoiding kernel-to-userspace copies on large files; knowing when rg falls back to buffered reads vs. mmap is important for understanding its I/O performance.
- SIMD-accelerated byte search: ripgrep's underlying regex and memchr crates use SIMD intrinsics (SSE2/AVX2 on x86, NEON on ARM) to scan for literal bytes at hardware speed, which is why it outperforms grep on large files.
- Trait-based matcher abstraction: The crates/matcher crate defines a Matcher trait that both the regex and pcre2 backends implement, allowing the searcher and printer to be completely agnostic of the underlying regex engine.
- jemalloc allocator: On musl 64-bit targets, ripgrep substitutes the system allocator with jemalloc via tikv-jemallocator to reduce fragmentation and improve throughput for the many small allocations made during parallel search.
- LTO (Link-Time Optimization): The profile.release-lto profile enables fat LTO with a single codegen unit, allowing the Rust compiler to inline and optimize across crate boundaries, which is critical for the tight inner loops in ripgrep's search path.
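To make the trait-based matcher idea concrete, here is a minimal sketch. The `Matcher` trait below is a deliberately reduced stand-in (one method, no error type) and not the real trait from crates/matcher; `LiteralMatcher` and `matching_lines` are hypothetical names. The point is that the search function never learns which engine it is driving.

```rust
/// Reduced stand-in for a matcher trait; the real API is richer.
trait Matcher {
    /// Returns the byte range of the first match at or after `at`.
    fn find_at(&self, haystack: &[u8], at: usize) -> Option<(usize, usize)>;
}

/// One possible backend: naive literal substring search.
struct LiteralMatcher {
    needle: Vec<u8>,
}

impl Matcher for LiteralMatcher {
    fn find_at(&self, haystack: &[u8], at: usize) -> Option<(usize, usize)> {
        // `windows(0)` panics, so reject an empty needle up front.
        if self.needle.is_empty() || at > haystack.len() {
            return None;
        }
        haystack[at..]
            .windows(self.needle.len())
            .position(|w| w == self.needle.as_slice())
            .map(|i| (at + i, at + i + self.needle.len()))
    }
}

/// Engine-agnostic search: 1-based numbers of lines containing a match.
fn matching_lines<M: Matcher>(matcher: &M, haystack: &[u8]) -> Vec<usize> {
    haystack
        .split(|&b| b == b'\n')
        .enumerate()
        .filter(|(_, line)| matcher.find_at(line, 0).is_some())
        .map(|(i, _)| i + 1)
        .collect()
}
```

Swapping in a different backend means writing another `impl Matcher`; `matching_lines` stays unchanged.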
Related repos
- ggreer/the_silver_searcher: Direct predecessor and alternative. ag was the primary inspiration for ripgrep and solves the same problem but in C without gitignore-first design.
- BurntSushi/regex-automata: Companion repo. The lower-level regex engine library that ripgrep's crates/regex crate builds upon for SIMD and DFA-based matching.
- BurntSushi/walkdir: Ecosystem companion used in ripgrep's dev-dependencies; provides the foundational recursive directory iterator that crates/ignore extends.
- sharkdp/fd: Close alternative in the same ecosystem. A fast Rust-based file finder that also respects gitignore and uses similar ignore/parallel-walk patterns.
- BurntSushi/aho-corasick: Companion repo. The multi-pattern literal search library used internally by ripgrep's regex backend for prefiltering before full regex evaluation.
PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add integration tests for --passthru flag edge cases in tests/tests.rs
The integration test suite in tests/tests.rs covers many rg flags, but the --passthru flag (which prints every line, matching or not) has known edge cases around binary file detection, null-separated output (-0), and interaction with context flags (-A/-B/-C). Adding explicit integration tests for these combinations would prevent regressions in a flag that is easy to break silently.
- [ ] Open tests/tests.rs and identify existing passthru test coverage by grepping for 'passthru'
- [ ] Add a test: --passthru combined with -A/-B/-C should not double-print context lines — verify correct line count in output
- [ ] Add a test: --passthru with binary files (using a fixture with null bytes) should respect --binary/--text flags consistently
- [ ] Add a test: --passthru with -0 (null-delimited output) should emit null bytes only at correct positions
- [ ] Add a test: --passthru with --count should be an error or produce documented behavior, verifying the exit code and stderr message
- [ ] Run the integration suite with cargo test --test integration to confirm all new tests pass
Add a GitHub Actions workflow for fuzzing crates/fuzz against the searcher and regex crates
The Cargo.toml exclude list explicitly references crates/fuzz, confirming a fuzz crate exists in the repo but is not wired into CI. Adding a scheduled or PR-triggered GitHub Actions workflow that runs cargo-fuzz for even a short duration (e.g., 60 seconds) against the searcher and regex crates would catch parser/searcher panics that unit tests miss. The existing .github/workflows/ directory has ci.yml and release.yml but no fuzz workflow.
- [ ] Inspect crates/fuzz to identify existing fuzz targets (likely fuzz_targets/ directory)
- [ ] Create .github/workflows/fuzz.yml with a scheduled trigger (e.g., weekly) and a manual workflow_dispatch trigger
- [ ] Install cargo-fuzz in the workflow using cargo install cargo-fuzz
- [ ] Add a job step that runs cargo fuzz run <target> -- -max_total_time=60 for each fuzz target found
- [ ] Pin the nightly toolchain version in the workflow since cargo-fuzz requires nightly
- [ ] Upload any crash artifacts using actions/upload-artifact so they are accessible from the Actions UI
Document the PCRE2 feature flag and its trade-offs in GUIDE.md with a dedicated section
GUIDE.md is the primary user-facing documentation for ripgrep features, but the pcre2 feature (which enables --pcre2 / -P flags for Perl-compatible regex) is only briefly mentioned. The crates/pcre2 directory and the [features] section in Cargo.toml confirm this is a first-class build-time feature. New users frequently ask about look-around and backreferences — adding a dedicated GUIDE.md section with build instructions, capability comparison, and performance caveats would reduce repeated FAQ/issue noise.
- [ ] Read the existing PCRE2-related content in GUIDE.md and FAQ.md to avoid duplication
- [ ] Add a new section '## PCRE2 and Perl-compatible regex (-P / --pcre2)' in GUIDE.md
- [ ] Document how to build ripgrep with PCRE2 support: cargo build --release --features pcre2 and the libpcre2 system dependency requirement
- [ ] Include a feature comparison table: features available only with --pcre2 (look-ahead, look-behind, backreferences, Unicode properties via \p{}) vs the default regex engine
- [ ] Add a performance caveat paragraph explaining that --pcre2 disables SIMD-accelerated search paths and when to prefer each engine
- [ ] Cross-reference crates/pcre2/README.md (if it exists
Good first issues
- Add integration tests for the --stats output flag in tests/tests.rs; coverage for JSON stats output format appears thin compared to standard output tests.
- The crates/pcre2/src/ crate lacks the depth of documentation present in crates/regex/src/; adding doc-comments explaining the PCRE2 Matcher trait implementation would help contributors understand the backend abstraction.
- The benchsuite (benchsuite/benchsuite) has benchmark runs only up to 2018; updating the benchmark harness and adding a modern run against current hardware and competitors (hypergrep, ugrep) would be a concrete, high-value contribution.
Top contributors
- @BurntSushi — 78 commits
- @ltrzesniewski — 2 commits
- @waldyrious — 1 commit
- @Pashugan — 1 commit
- @OctopusET — 1 commit
Recent commits
- 4519153 doc: clarify half-boundary syntax for the -w/--word-regexp flag (waldyrious)
- cb66736 core: bleat a DEBUG message when RIPGREP_CONFIG_PATH is not set (BurntSushi)
- 9b84e15 ignore/types: add container type that covers both Dockerfile and Containerfile (Pashugan)
- 0a88ccc Fix compression tests in QEMU cross-compilation environments (#3248) (OctopusET)
- cd1f981 fix: derive Default when possible (xtqqczze)
- 57c190d ignore-0.4.25 (BurntSushi)
- 85edf4c ignore: only stat .jj if we actually care (ianloic)
- 36b7597 changelog: start next section (BurntSushi)
- a132e56 pkg/brew: update tap (BurntSushi)
- af60c2d 15.1.0 (BurntSushi)
Security observations
- Low · LTO Profile Disables Overflow Checks. Location: Cargo.toml [profile.release-lto]. The release-lto profile sets overflow-checks = false and debug-assertions = false. While this is a common performance optimization, disabling overflow checks removes a Rust safety net that catches integer overflow bugs at runtime. If any arithmetic in the codebase is vulnerable to overflow, this profile would silently allow wraparound behavior in release builds. Fix: Consider whether overflow-checks = false is strictly necessary. If performance requires it, ensure all arithmetic operations that could overflow are explicitly handled using saturating_*, wrapping_*, or checked_* variants.
- Low · Panic = Abort May Complicate Error Recovery. Location: Cargo.toml [profile.release-lto]. The release-lto and deb profiles set panic = 'abort', which means any panic in the binary immediately terminates the process without running destructors or cleanup code. In a CLI tool this is generally acceptable, but if any temporary files or resources are created during processing, they may not be cleaned up on unexpected panics. Fix: Ensure that any temporary file or resource creation is wrapped with OS-level cleanup guarantees (e.g., using temp file libraries that register OS-level cleanup) so that panic = abort does not leave orphaned resources.
- Low · Dependency on Unmaintained or Pinned Old Versions. Location: Cargo.toml [dependencies] and [dev-dependencies]. Several dependencies are pinned to older minor versions (e.g., serde/serde_derive = '1.0.77', log = '0.4.5', termcolor = '1.1.0', lexopt = '0.3.0'). While Rust's SemVer allows compatible updates, using old lower bounds means users could be building with outdated crate versions that have known bugs or security issues if Cargo.lock is not carefully maintained. Fix: Periodically run 'cargo update' and audit dependencies with 'cargo audit' (using the RustSec advisory database) to ensure no known vulnerabilities exist in any transitive or direct dependencies. Update lower version bounds to recent stable versions.
- Low · Use of External Allocator (jemalloc) on musl Targets. Location: Cargo.toml [target.'cfg(all(target_env = "musl", target_pointer_width = "64"))'.dependencies.tikv-jemallocator]. On 64-bit musl targets, tikv-jemallocator is used as the global allocator. Third-party allocator replacements introduce additional attack surface. If a vulnerability is discovered in the jemalloc library, it could potentially be exploited through memory allocation patterns (heap exploitation). Fix: Monitor the tikv-jemallocator crate and upstream jemalloc for security advisories. Keep the dependency updated to the latest patched version. Evaluate whether the performance benefit justifies the additional dependency risk on musl targets.
- Low · CI Workflow Artifact and Release Integrity. Location: .github/workflows/release.yml, ci/sha256-releases. The repository includes SHA256 release checksums in ci/sha256-releases and a release workflow at .github/workflows/release.yml. If the CI pipeline or release workflow is misconfigured (e.g., allowing untrusted pull requests to trigger release builds, or not pinning GitHub Actions to specific commit SHAs), it could be vulnerable to supply chain attacks where a malicious actor manipulates release artifacts. Fix: Ensure GitHub Actions workflows pin all third-party actions to specific commit SHAs (not mutable tags like 'v3'). Use environment protection rules for the release environment. Validate that only maintainers can trigger release workflows. Verify release artifacts against the sha256-releases checksums as part of the release process.
- Low · Potential Regex Denial of Service (ReDoS) via User-Supplied Patterns. Location: crates/core/main.rs, Cargo.toml [features] pcre2. ripgrep accepts arbitrary user-supplied regex patterns. While the regex crate used by ripgrep is designed to avoid catastrophic backtracking by using finite automata, the pcre2 feature enables use of PCRE2 which does support backtracking and could be vulnerable to ReDoS with crafted patterns when searching large files or directories. Fix: Document clearly that PCRE2 mode (-P/--pcre2) may be vulnerable to ReDoS with maliciously crafted patterns
LLM-derived; treat as a starting point, not a security audit.
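For the overflow-checks observation, the standard library's explicit arithmetic variants behave the same whether or not the profile enables overflow checks. The sketch below shows one use of each family; the function names and scenarios are hypothetical and are not taken from ripgrep's source.

```rust
/// End offset of a window; `None` instead of wrapping on overflow.
fn window_end(start: usize, len: usize) -> Option<usize> {
    start.checked_add(len)
}

/// First line of "before" context, clamped at zero instead of underflowing.
fn first_context_line(line: u64, before: u64) -> u64 {
    line.saturating_sub(before)
}

/// A toy hash where wraparound is intended, so it is spelled out.
fn rolling_hash(bytes: &[u8]) -> u32 {
    bytes
        .iter()
        .fold(0u32, |h, &b| h.wrapping_mul(31).wrapping_add(b as u32))
}
```

Using `checked_*` where overflow is a bug, `saturating_*` where clamping is acceptable, and `wrapping_*` where wraparound is the intent makes the behavior independent of the build profile.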
Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.