zesterer/chumsky

Item: zesterer/chumsky
Rating: 5
Author: RepoPilot

[Chumsky has moved to Codeberg!] Write expressive, high-performance parsers with ease.

Healthy

Healthy across the board

Use as dependency: Healthy

Permissive license, no critical CVEs, actively maintained — safe to depend on.

Fork & modify: Healthy

Has a license, tests, and CI — clean foundation to fork and modify.

Learn from: Healthy

Documented and popular — useful reference codebase to read through.

Deploy as-is: Healthy

No critical CVEs, sane security posture — runnable as-is.

✓ Last commit 6w ago
✓ 28+ active contributors
✓ Distributed ownership (top contributor 34% of recent commits)

✓ MIT licensed
✓ CI configured
⚠ No test directory detected

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Healthy" badge

Paste into your README — live-updates from the latest cached analysis.

[![RepoPilot: Healthy](https://repopilot.app/api/badge/zesterer/chumsky)](https://repopilot.app/r/zesterer/chumsky)

Paste at the top of your README.md — renders inline like a shields.io badge.

▸Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/zesterer/chumsky on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: zesterer/chumsky

Generated by RepoPilot · 2026-05-09 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

Verify the contract. Run the bash script in the "Verify before trusting" section below. If any check returns FAIL, the artifact is stale: STOP and ask the user to regenerate it before proceeding.
Treat sections marked "AI · unverified" as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/zesterer/chumsky shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

🎯Verdict

GO — Healthy across the board

Last commit 6w ago
28+ active contributors
Distributed ownership (top contributor 34% of recent commits)
MIT licensed
CI configured
⚠ No test directory detected

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

✅Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live zesterer/chumsky repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/zesterer/chumsky.

What it runs against: a local clone of zesterer/chumsky — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in zesterer/chumsky | Confirms the artifact applies here, not a fork |
| 2 | License is still MIT | Catches relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 72 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>zesterer/chumsky</code></summary>

#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of zesterer/chumsky. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/zesterer/chumsky.git
#   cd chumsky
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of zesterer/chumsky and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "zesterer/chumsky(\.git)?\b" \
  && ok "origin remote is zesterer/chumsky" \
  || miss "origin remote is not zesterer/chumsky (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "^(MIT)" LICENSE 2>/dev/null \
   || grep -qiE "\"license\"\s*:\s*\"MIT\"" package.json 2>/dev/null) \
  && ok "license is MIT" \
  || miss "license drift — was MIT at generation time"

# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
  && ok "default branch main exists" \
  || miss "default branch main no longer exists"

# 4. Critical files exist
test -f "src/lib.rs" \
  && ok "src/lib.rs" \
  || miss "missing critical file: src/lib.rs"
test -f "src/combinator.rs" \
  && ok "src/combinator.rs" \
  || miss "missing critical file: src/combinator.rs"
test -f "src/primitive.rs" \
  && ok "src/primitive.rs" \
  || miss "missing critical file: src/primitive.rs"
test -f "src/error.rs" \
  && ok "src/error.rs" \
  || miss "missing critical file: src/error.rs"
test -f "src/input.rs" \
  && ok "src/input.rs" \
  || miss "missing critical file: src/input.rs"

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 72 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~42d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/zesterer/chumsky"
  exit 1
fi

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>

⚡TL;DR

Chumsky is a Rust parser combinator library for writing expressive, high-performance parsers for languages, binary protocols, and configuration files. It provides composable parser combinators with built-in error recovery, zero-copy parsing via references and slices, support for context-free grammars, left recursion with memoization, and Pratt parsing for expressions.

It is a single-crate library: src/ contains the core parser combinator framework; examples/ provides 12+ runnable examples (json.rs, brainfuck.rs, mini_ml.rs, etc.); benches/ includes performance tests (backtrack.rs, json.rs, cbor.rs); guide/ contains educational documentation. Optional functionality sits behind Cargo features (std, stacker, memoization, pratt, debug, regex, serde), allowing granular opt-in for no_std and embedded use.
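To make the combinator idea in the summary concrete, here is a minimal sketch in plain Rust with no dependencies. It is not chumsky's API: `ParseResult`, `just`, `digits`, `then`, `or`, and `signed` are hypothetical names reimplemented here for illustration only. Each parser is a function from the remaining input to an optional (output, rest) pair, and the outputs are slices of the original input, which is the zero-copy property mentioned above.

```rust
// From-scratch sketch of the combinator idea; none of this is chumsky's API.
// A parser takes the remaining input and returns the parsed value plus the rest.
type ParseResult<'a, T> = Option<(T, &'a str)>;

/// Match an exact literal and return the matched slice of the input (no copy).
fn just<'a>(lit: &'static str) -> impl Fn(&'a str) -> ParseResult<'a, &'a str> {
    move |input| {
        if input.starts_with(lit) {
            Some((&input[..lit.len()], &input[lit.len()..]))
        } else {
            None
        }
    }
}

/// Take one or more ASCII digits as a borrowed slice.
fn digits<'a>(input: &'a str) -> ParseResult<'a, &'a str> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 { None } else { Some((&input[..end], &input[end..])) }
}

/// Run `a`, then run `b` on what is left, and pair the two outputs.
fn then<'a, A, B>(
    a: impl Fn(&'a str) -> ParseResult<'a, A>,
    b: impl Fn(&'a str) -> ParseResult<'a, B>,
) -> impl Fn(&'a str) -> ParseResult<'a, (A, B)> {
    move |input| {
        let (x, rest) = a(input)?;
        let (y, rest) = b(rest)?;
        Some(((x, y), rest))
    }
}

/// Try `a`; if it fails, try `b` from the same position.
fn or<'a, T>(
    a: impl Fn(&'a str) -> ParseResult<'a, T>,
    b: impl Fn(&'a str) -> ParseResult<'a, T>,
) -> impl Fn(&'a str) -> ParseResult<'a, T> {
    move |input| a(input).or_else(|| b(input))
}

/// Parse a signed integer literal such as "+42" or "-7" as (sign, digits) slices.
fn signed<'a>(input: &'a str) -> ParseResult<'a, (&'a str, &'a str)> {
    then(or(just("+"), just("-")), digits)(input)
}
```

Calling `signed("-42 rest")` yields `Some((("-", "42"), " rest"))`; both output strings point into the input buffer rather than owning copies.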

👥Who it's for

Language designers, compiler authors, and systems programmers building user-facing parsers (such as compiler front ends), network protocol parsers, configuration file validators, and embedded systems that need no_std parsing. These users value the expressiveness of combinators over grammar-based tools like pest.

🌱Maturity & risk

Actively developed and production-ready. The project is at v0.13.0 with comprehensive CI/CD via GitHub Actions (rust.yml workflow), an extensive guide in guide/ directory covering key concepts and error recovery, and 643KB of Rust code. Regular updates via Cargo.lock management and examples show real-world use (Brainfuck interpreter, mini ML, nano Rust parsers). Moved from GitHub to Codeberg, indicating active stewardship.

Moderate risk: as a v0.13 library it is pre-1.0, so breaking changes between releases are possible. The dependency surface includes regex-automata, lexical, serde, and stacker; most are optional, which limits bloat. Recent commit history is concentrated in two contributors (Hedgehogo and zesterer; see Top contributors). No issue backlog is visible in the file listing, and the Codeberg migration may have left GitHub issue history behind. The stacker feature for deep recursion adds system-level complexity.

Active areas of work

The project is actively maintained with recent work on the Codeberg migration, integration of nightly-only features (railroad diagram debugging), and optimization via GATs (Generic Associated Types). The Rust 2021 edition and MSRV of 1.65 indicate ongoing modernization. Feature additions like pratt parsing and memoization remain under 'unstable' flags, suggesting active experimentation.

🚀Get running

git clone https://github.com/zesterer/chumsky.git
cd chumsky
cargo build
cargo test
cargo run --example brainfuck -- examples/sample.bf

Daily commands: For development and examples:

cargo build --all-features
cargo test --all-features
cargo run --example json -- examples/sample.json
cargo run --example json_fast -- examples/sample.json
cargo bench

For no_std validation: cargo build --no-default-features.

🗺️Map of the codebase

src/lib.rs — Main library entry point; defines the Parser trait and re-exports all public APIs that users depend on
src/combinator.rs — Core combinator implementations (map, then, or, filter, etc.); fundamental to parser composition patterns
src/primitive.rs — Basic parser primitives (just, one_of, none_of, any, etc.); building blocks for all higher-level parsers
src/error.rs — Error type definitions and handling; critical for error reporting and recovery features
src/input.rs — Input stream abstraction; defines how parsers consume and track position in input
src/extra.rs — Defines the ParserExtra trait and helper types (such as extra::Err) that select a parser's error, state, and context types; appears in the signature of most practical parsers

🛠️How to make changes

Add a New Primitive Parser

Define the parser struct and implement the Parser trait in src/primitive.rs (src/primitive.rs)
Add a public factory function (e.g., pub fn my_parser() -> impl Parser<...>) in the same file (src/primitive.rs)
Re-export the factory function in src/lib.rs under the appropriate section (src/lib.rs)
Add an example usage in examples/ folder demonstrating the new parser (examples/debug.rs)
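The first two steps above (a parser struct plus a public factory function) can be sketched as follows. This is a simplified illustration, not chumsky's code: the `Parser` trait here is a hypothetical stand-in with a single associated type, whereas the real trait in src/lib.rs has more type parameters, and `AsciiDigit` and `ascii_digit` are made-up names.

```rust
// Hypothetical, simplified trait for illustration; chumsky's real `Parser`
// trait has more type parameters and a different method set.
pub trait Parser<'a> {
    type Output;
    /// On success, return the output and the unconsumed remainder.
    fn parse(&self, input: &'a str) -> Option<(Self::Output, &'a str)>;
}

/// Step 1: the parser struct. `AsciiDigit` is a made-up primitive.
pub struct AsciiDigit;

impl<'a> Parser<'a> for AsciiDigit {
    type Output = u32;

    fn parse(&self, input: &'a str) -> Option<(u32, &'a str)> {
        let c = input.chars().next()?; // fail on empty input
        let value = c.to_digit(10)?; // fail on anything but 0-9
        Some((value, &input[c.len_utf8()..]))
    }
}

/// Step 2: the public factory function, which hides the concrete struct
/// behind `impl Parser` so the struct can change without breaking callers.
pub fn ascii_digit<'a>() -> impl Parser<'a, Output = u32> {
    AsciiDigit
}
```

With this sketch, `ascii_digit().parse("7up")` returns `Some((7, "up"))`. Returning `impl Parser` from the factory is one common design choice; whether a given chumsky primitive exposes its concrete type instead should be checked against the source.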

Add a New Combinator

Implement the combinator as a method on the Parser trait or extension trait in src/combinator.rs or src/extra.rs (src/combinator.rs)
Create a wrapper struct that implements Parser if the combinator needs state (e.g., struct MapCombinator { ... }) (src/combinator.rs)
Ensure the implementation handles spans and error recovery correctly using src/error.rs patterns (src/error.rs)
Document with doc-comments and add a usage example in the docstring or examples/ (examples/debug.rs)
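As a rough sketch of the wrapper-struct pattern described in these steps, the following defines a `map` combinator as a trait method that wraps `self` in a `Map` struct. The trait is again a hypothetical simplification rather than chumsky's definition, and `Word` is an invented primitive used only to have something to map over.

```rust
// Hypothetical, simplified trait; not chumsky's real `Parser` definition.
pub trait Parser<'a>: Sized {
    type Output;
    fn parse(&self, input: &'a str) -> Option<(Self::Output, &'a str)>;

    /// The combinator is exposed as a method that wraps `self` in a struct.
    fn map<F, U>(self, f: F) -> Map<Self, F>
    where
        F: Fn(Self::Output) -> U,
    {
        Map { inner: self, f }
    }
}

/// Wrapper struct holding the inner parser and the mapping function.
pub struct Map<P, F> {
    inner: P,
    f: F,
}

impl<'a, P, F, U> Parser<'a> for Map<P, F>
where
    P: Parser<'a>,
    F: Fn(P::Output) -> U,
{
    type Output = U;

    fn parse(&self, input: &'a str) -> Option<(U, &'a str)> {
        // Delegate to the inner parser, then transform only the output.
        let (out, rest) = self.inner.parse(input)?;
        Some(((self.f)(out), rest))
    }
}

/// Invented primitive: one or more ASCII letters as a borrowed slice.
pub struct Word;

impl<'a> Parser<'a> for Word {
    type Output = &'a str;

    fn parse(&self, input: &'a str) -> Option<(&'a str, &'a str)> {
        let end = input
            .find(|c: char| !c.is_ascii_alphabetic())
            .unwrap_or(input.len());
        if end == 0 { None } else { Some(input.split_at(end)) }
    }
}
```

For example, `Word.map(|s: &str| s.len()).parse("hello world")` returns `Some((5, " world"))`. The struct-per-combinator layout lets the compiler monomorphize the whole parser chain, which is the same general reason iterator adapters in the standard library are structs.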

Add a New Language/Format Parser Example

Create examples/my_lang.rs with a complete parser using existing primitives and combinators (examples/json.rs)
Create a sample input file examples/sample.my_lang with representative test input (examples/sample.json)
Use the Parser trait's combinators (separated_by, delimited_by, etc.) for common patterns (src/combinator.rs)
Demonstrate error recovery and labeling from src/error.rs and src/label.rs for clarity (src/label.rs)

Improve Error Messages with Recovery

Use .labelled() from src/label.rs to annotate parser expectations (src/label.rs)
Apply recovery strategies from src/recovery.rs (e.g., .recover_with()) for graceful fallback (src/recovery.rs)
Chain error messages with .map_err() to provide context-specific diagnostics (src/combinator.rs)
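The idea behind these steps (record a labelled error, substitute a placeholder, and resume at a known delimiter) can be illustrated without the library. The sketch below is plain Rust and does not call chumsky; `parse_ints_with_recovery` is a hypothetical function, and skipping to the next `;` is only one simple recovery strategy among several.

```rust
// From-scratch illustration of skip-to-delimiter recovery; it does not use
// chumsky and the function name is hypothetical.

/// Parse `;`-separated integers. A malformed item is recorded as an error and
/// replaced by `None`, and parsing resumes after the next `;`.
pub fn parse_ints_with_recovery(src: &str) -> (Vec<Option<i64>>, Vec<String>) {
    let mut values = Vec::new();
    let mut errors = Vec::new();
    let mut offset = 0; // byte offset where the current item starts

    for item in src.split(';') {
        let trimmed = item.trim();
        match trimmed.parse::<i64>() {
            Ok(n) => values.push(Some(n)),
            Err(_) => {
                // Report what was expected and where, instead of aborting.
                errors.push(format!(
                    "expected integer at byte {offset}, found {trimmed:?}"
                ));
                values.push(None); // placeholder keeps later items aligned
            }
        }
        offset += item.len() + 1; // +1 for the `;` that `split` consumed
    }
    (values, errors)
}
```

For the input `"1; x2 ;3"` this returns the values `[Some(1), None, Some(3)]` together with one error message, so a single bad item does not hide the items after it.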

🔧Why these technologies

Parser combinator pattern (trait-based) — Allows composable, reusable parser building blocks; enables type-safe parser composition without macros
Rust generics and associated types — Provides zero-cost abstractions; Parser trait is generic over input type and output type, enabling flexibility without runtime dispatch
regex-automata (optional dependency) — Provides performant regex matching for pattern-based parsing without the overhead of the regex crate
no_std support with optional std feature — Enables use in embedded/WebAssembly environments while still offering stdlib conveniences when available
Pratt parsing module — Handles operator precedence and associativity efficiently without left-recursion issues
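To show why Pratt parsing handles precedence and associativity without left recursion, here is a compact from-scratch sketch. It is unrelated to the API of chumsky's pratt module: `binding_power`, `parse_expr`, and `eval` are illustrative names, and the input language (single digits, `+ - * / ^`, parentheses, no whitespace) is an assumption chosen to keep the example short.

```rust
use std::iter::Peekable;
use std::str::Chars;

// Illustrative Pratt parser; not the API of chumsky's `pratt` module.

/// Binding powers: a higher number binds tighter. Left-associative operators
/// use (n, n + 1); the right-associative `^` uses (n + 1, n).
fn binding_power(op: char) -> Option<(u8, u8)> {
    match op {
        '+' | '-' => Some((1, 2)),
        '*' | '/' => Some((3, 4)),
        '^' => Some((6, 5)),
        _ => None,
    }
}

fn parse_expr(tokens: &mut Peekable<Chars<'_>>, min_bp: u8) -> Option<f64> {
    // Prefix position: a single digit or a parenthesised expression.
    let mut lhs = match tokens.next()? {
        c @ '0'..='9' => f64::from(c.to_digit(10)?),
        '(' => {
            let inner = parse_expr(tokens, 0)?;
            if tokens.next()? != ')' {
                return None;
            }
            inner
        }
        _ => return None,
    };
    // Infix loop: consume operators whose left binding power is at least min_bp.
    while let Some(&op) = tokens.peek() {
        let (left_bp, right_bp) = match binding_power(op) {
            Some(bp) => bp,
            None => break, // not an operator, e.g. `)`
        };
        if left_bp < min_bp {
            break;
        }
        tokens.next();
        let rhs = parse_expr(tokens, right_bp)?;
        lhs = match op {
            '+' => lhs + rhs,
            '-' => lhs - rhs,
            '*' => lhs * rhs,
            '/' => lhs / rhs,
            _ => lhs.powf(rhs), // only '^' remains
        };
    }
    Some(lhs)
}

/// Evaluate an expression; returns `None` on malformed or trailing input.
pub fn eval(src: &str) -> Option<f64> {
    let mut tokens = src.chars().peekable();
    let value = parse_expr(&mut tokens, 0)?;
    if tokens.next().is_some() { None } else { Some(value) }
}
```

With these binding powers, `eval("8-3-2")` groups as `(8-3)-2` and `eval("2^3^2")` groups as `2^(3^2)`; the loop replaces the left-recursive rule a grammar would otherwise need.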

⚖️Trade-offs already made

Trait-based parser design vs. macro-based DSL
- Why: Trait approach provides better IDE support, debugging, and composability
- Consequence: Slightly more verbose syntax than grammar-DSL tools like pest, but more flexible and type-safe
Generic Input type (not just &str)
- Why: Enables parsing of tokens, binary data, and custom input types without conversion
- Consequence: Adds complexity to trait bounds and input abstractions; users must implement Input trait for custom types
Error recovery as first-class feature
- Why: Supports user-facing compilers needing to report multiple errors and continue parsing
- Consequence: More complex error type; recovery strategies must be manually applied (not automatic)
Memoization/caching as optional module, not built-in
- Why: Allows users to opt into caching when needed without overhead for all parsers
- Consequence: Cache misses require explicit parser wrapping; users must manage cache lifecycle
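The memoization trade-off can be made tangible with a small packrat-style sketch. It is written from scratch and does not reflect how chumsky's memoization feature is implemented; `RuleParser`, `parse_a`, and the toy grammar are assumptions made for the example. The `calls` counter shows how much repeated work a cache removes when alternatives share a prefix.

```rust
use std::collections::HashMap;

// Illustration of position-keyed memoisation; not chumsky's implementation.
//
// Grammar:  A -> '(' A ')' 'x'  |  '(' A ')' 'y'  |  'a'
// Both bracketed alternatives begin by parsing the same inner `A`, so a
// backtracking parser without a cache repeats that work at every nesting level.
pub struct RuleParser<'a> {
    src: &'a [u8],
    cache: Option<HashMap<usize, Option<usize>>>, // start -> end position
    pub calls: usize, // times the rule body ran (cache misses)
}

impl<'a> RuleParser<'a> {
    pub fn new(src: &'a str, memoize: bool) -> Self {
        RuleParser {
            src: src.as_bytes(),
            cache: if memoize { Some(HashMap::new()) } else { None },
            calls: 0,
        }
    }

    /// Parse rule `A` at `pos`; return the end position on success.
    pub fn parse_a(&mut self, pos: usize) -> Option<usize> {
        if let Some(hit) = self.cache.as_ref().and_then(|c| c.get(&pos)) {
            return *hit; // reuse the earlier result for this position
        }
        self.calls += 1;
        let mut result = self.bracketed(pos, b'x');
        if result.is_none() {
            result = self.bracketed(pos, b'y'); // backtrack to the same start
        }
        if result.is_none() && self.src.get(pos) == Some(&b'a') {
            result = Some(pos + 1);
        }
        if let Some(cache) = self.cache.as_mut() {
            cache.insert(pos, result);
        }
        result
    }

    /// Parse '(' A ')' followed by `suffix`.
    fn bracketed(&mut self, pos: usize, suffix: u8) -> Option<usize> {
        if self.src.get(pos) != Some(&b'(') {
            return None;
        }
        let inner_end = self.parse_a(pos + 1)?;
        if self.src.get(inner_end) != Some(&b')') {
            return None;
        }
        if self.src.get(inner_end + 1) != Some(&suffix) {
            return None;
        }
        Some(inner_end + 2)
    }
}
```

On the input `((a)y)y`, the uncached parser evaluates the rule body 7 times and the cached one 3 times. The gap grows with nesting depth, while the cache costs one map entry per position visited, which is the opt-in overhead this trade-off refers to.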

🚫Non-goals (don't propose these)

Does not provide automatic left-recursion elimination; users must manually handle left-recursive grammars with recursive() or restructuring
Does not include built-in lexer/tokenizer generation; users must write or integrate separate lexing stage
Does not generate parser code from grammar files (e.g., EBNF); grammar must be expressed in Rust code
Does not provide automatic error recovery (passive only); recovery strategies must be explicitly applied by users
Does not support Unicode grapheme clusters natively; text operations work at char/byte level
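The first non-goal mentions restructuring left-recursive grammars by hand. One standard way to do that, shown here in plain Rust rather than with chumsky, is to turn the left-recursive rule into a repetition and fold the results to the left. The function names `num` and `expr` are hypothetical.

```rust
// Illustration only: plain Rust, no chumsky. The left-recursive rule
//     expr -> expr '-' num | num
// would recurse forever in a naive recursive-descent parser, so it is
// restructured as
//     expr -> num ('-' num)*
// and the repetition is folded to the left to keep left associativity.

/// Parse an unsigned integer prefix; return the value and the remaining input.
fn num(input: &str) -> Option<(i64, &str)> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    let value = input[..end].parse().ok()?; // fails when there are no digits
    Some((value, &input[end..]))
}

/// Parse `num ('-' num)*`, folding left so that "10-3-2" means (10-3)-2.
pub fn expr(input: &str) -> Option<(i64, &str)> {
    let (mut acc, mut rest) = num(input)?;
    while let Some(after_minus) = rest.strip_prefix('-') {
        let (rhs, after_rhs) = num(after_minus)?;
        acc -= rhs;
        rest = after_rhs;
    }
    Some((acc, rest))
}
```

Here `expr("10-3-2")` returns `Some((5, ""))`, matching the left-associative reading, whereas a right-recursive rewrite would compute `10-(3-2)` instead.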

🪤Traps & gotchas

GAT optimization: Parser internals use Rust GATs heavily; upgrading Rust versions may expose unstable behavior.
Stacker feature: Adds platform-specific stack management for deeply nested grammars. It is part of the default feature set, so building with --no-default-features drops it, and deeply recursive inputs can then overflow the stack.
Feature interactions: The nightly feature unlocks unstable APIs; debug requires both nightly and unstable, failing silently if they are not enabled together.
Codeberg migration: The repository has moved from GitHub; old GitHub links in issues and PRs may be stale.
MSRV 1.65: Older toolchains will fail to build the crate, which uses the 2021 edition.
Zero-copy trade-off: Parser output holds references to the input; lifetime constraints can complicate AST design if you need owned data.
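The zero-copy trade-off can be sketched as follows. This is an assumption-laden illustration in plain Rust: `Ident`, `parse_ident`, and `parse_owned` are hypothetical names, not chumsky types. It shows one common way to keep parsing borrowed while still allowing an AST node to be detached from the input when it must outlive it.

```rust
use std::borrow::Cow;

// Illustration only; these names are hypothetical and not part of chumsky.

/// An AST node whose text either borrows from the source or owns a copy.
#[derive(Debug, PartialEq)]
pub struct Ident<'src> {
    pub name: Cow<'src, str>,
}

impl<'src> Ident<'src> {
    /// Detach from the source buffer by copying the name.
    pub fn into_owned(self) -> Ident<'static> {
        Ident { name: Cow::Owned(self.name.into_owned()) }
    }
}

/// Zero-copy: the returned identifier borrows from `src`.
pub fn parse_ident(src: &str) -> Option<Ident<'_>> {
    let end = src
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(src.len());
    if end == 0 {
        None
    } else {
        Some(Ident { name: Cow::Borrowed(&src[..end]) })
    }
}

/// The input is dropped at the end of this function, so the result must own
/// its data; returning the borrowed `Ident` directly would not compile.
pub fn parse_owned(src: String) -> Option<Ident<'static>> {
    parse_ident(&src).map(Ident::into_owned)
}
```

The borrowed form avoids allocation during parsing, and the `into_owned` step pays for a copy only at the boundary where the input buffer goes away.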

🏗️Architecture

💡Concepts to learn

Parser Combinators — The entire design philosophy of Chumsky; understanding how primitive parsers compose monadically into complex grammars is essential to using this library effectively.
Generic Associated Types (GATs) — Chumsky's internal optimizer uses GATs to enable zero-cost abstractions; understanding GAT patterns helps explain why Chumsky is performant despite abstraction.
Zero-Copy Parsing — A core feature of Chumsky that minimizes allocation by having parser outputs hold references/slices to the input; critical for understanding performance characteristics and lifetime constraints.
Left Recursion and Memoization — Chumsky supports left-recursive grammars via optional memoization (unstable feature); necessary for parsing expression grammars and operator precedence without rewriting.
Pratt Parsing — Chumsky has first-class Pratt parsing support (pratt feature) for elegant expression parsing; avoids complex precedence climbing or operator precedence rules.
Context-Free Grammars (CFG) — Chumsky explicitly supports CFGs and context-sensitive extensions; knowing CFG theory helps understand what Chumsky can and cannot parse.
Error Recovery Strategies — Chumsky's error recovery is a key differentiator (documented in guide/error_and_recovery.md); understanding recovery modes (greedy, backtracking, etc.) is essential for building user-friendly parsers.

🔗Related repos

zesterer/ariadne — Companion error-reporting library by the same author; used in Chumsky's example diagnostics to render parser errors beautifully.
rust-bakery/nom — Alternative Rust parser combinator library with similar goals; key competitor for byte-oriented and binary protocol parsing, with different ergonomics and a function-based (historically macro-based) syntax.
pest-parser/pest — Grammar-based parser (PEG) in Rust; positioned as a contrast to Chumsky's combinator approach—simpler syntax but less expressive than composable combinators.
tree-sitter/tree-sitter — Incremental parser generator for editors and IDEs; complements Chumsky for use cases requiring partial/streaming parse trees and language server support.
zesterer/tao — Complete programming language implementation by Chumsky's author; real-world reference implementation showing Chumsky in production for a full compiler.

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add comprehensive tests for src/pratt.rs Pratt parsing combinator

The Pratt parsing feature (src/pratt.rs) is gated behind the 'pratt' feature flag which depends on 'unstable', but there are no dedicated test files or examples demonstrating Pratt parser usage. Given that Pratt parsing is a powerful but non-trivial pattern, adding tests would help contributors understand the API and catch regressions. This is a high-value addition since pratt.rs is a core feature with no example file.

[ ] Create tests/pratt_parser.rs with unit tests covering basic prefix/infix/postfix operator parsing
[ ] Add a practical example in examples/pratt_math.rs showing expression parsing with operator precedence
[ ] Document expected behavior in guide/meet_the_parsers.md for the Pratt combinator section
[ ] Ensure tests run with: cargo test --features pratt,nightly,unstable

Add integration tests for memoization feature with complex recursive parsers

The 'memoization' feature flag (src/cache.rs) claims to enable left recursion and speed up backtracking, but there are no dedicated tests verifying this works correctly or benchmarking the performance improvements. This is critical for a feature that fundamentally changes parser behavior (enabling left recursion). New contributors can validate the feature and add regression tests.

[ ] Create tests/memoization.rs with tests for left-recursive grammars that would fail without memoization
[ ] Add a benchmark in benches/memoization.rs comparing same parser with/without memoization feature
[ ] Add an example in examples/left_recursive.rs demonstrating a grammar that requires memoization
[ ] Document memoization tradeoffs and usage in guide/technical_notes.md

Add tests and documentation for src/recovery.rs error recovery strategies

The recovery.rs module handles error recovery (a key selling point mentioned in README), but guide/error_and_recovery.md exists without corresponding unit tests. The module lacks test coverage for different recovery strategies, making it hard for contributors to understand or modify recovery logic safely.

[ ] Create tests/recovery_strategies.rs testing various recovery patterns: skip_until, synchronize, nested recovery
[ ] Add detailed examples in guide/error_and_recovery.md with code snippets from tests/recovery_strategies.rs
[ ] Create examples/error_recovery_demo.rs showing practical error recovery in a mini language parser
[ ] Document recovery performance characteristics and when to use each strategy

🌿Good first issues

Add missing unit tests for the regex feature (first locate the module that defines the regex combinators); the feature exists but has minimal coverage in benches/ and examples/.
Expand guide/debugging.md with a concrete example showing railroad diagram generation via the debug feature; currently guide exists but no runnable example demonstrates cargo run --example debug.
Add a complete, documented example parsing a subset of TOML (building on json.rs pattern) to guide/examples/ or examples/; demonstrates real-world configuration parsing use case mentioned in README.

⭐Top contributors

@Hedgehogo — 34 commits
@zesterer — 32 commits
@zeichenreihe — 3 commits
@Zollerboy1 — 2 commits
@tmke8 — 2 commits

📝Recent commits

4879268 — Fixed incorrect docs (zesterer)
3451910 — select!: allow #[cfg()] gates (#970) (Tpt)
6a38ac6 — Simplified indent example (zesterer)
0aac20f — Improve indentation example to reject malformed indent (#961) (JohnathanFL)
1e2b51a — Change the visibility of InputRef::full_slice() from pub(crate) to public (#960) (cnglen)
869918d — Remove redundant bound from . (zesterer)
3a4bc57 — Make RichReason generic over custom error type (#962) (icewind1991)
5730d3e — Implement ConfigParser for OneOf (#957) (Zollerboy1)
639f586 — Fix DefaultExpected::into_owned lifetime (#952) (Zollerboy1)
2531947 — add map_span to Rich error (#950) (ojkelly)

🔒Security observations

Chumsky is a well-structured Rust parser library with strong security posture. No critical or high-severity vulnerabilities were identified. The codebase shows no evidence of hardcoded secrets, SQL injection risks, XSS vulnerabilities, or infrastructure misconfigurations. The project uses a conservative MSRV (1.65), has feature-gated dependencies, and maintains clear separation of concerns. Minor observations include incomplete documentation and consideration of default feature implications. The MIT license and transparent GitHub presence are positive security indicators.

Low · Incomplete Cargo.toml Feature Documentation — Cargo.toml (features section). The Cargo.toml file has a truncated comment in the 'all_stable' feature section ('An alias of all features that work with the stable compiler...' ends with 'If you'). While this is not a security vulnerability per se, incomplete documentation can lead to misuse of features and unclear security boundaries. Fix: Complete the documentation comment for the 'all_stable' feature to clearly explain its purpose and any security implications.
Low · Optional Dependency 'stacker' Default Enabled — Cargo.toml (default features). The 'stacker' feature is enabled by default in the 'default' feature set. This dependency is used for dynamic stack spilling to enable deeper recursion. While useful for functionality, it adds complexity and a dependency that may not be needed for all use cases. Fix: Document the security implications of the stacker dependency. Consider whether it should be enabled by default or if users should explicitly opt-in based on their recursion depth requirements.
Low · Unstable Features in Public API — Cargo.toml (pratt, debug, lexical-numbers features). Several features (pratt, debug, lexical-numbers) depend on the 'unstable' feature flag, indicating their APIs are not settled. This could lead to breaking changes and security-related API modifications in future versions. Fix: Clearly communicate stability guarantees in documentation. Consider stabilizing frequently-used features and providing migration guidance for breaking changes.

LLM-derived; treat as a starting point, not a security audit.

👉Where to read next

Open issues — current backlog
Recent PRs — what's actively shipping
Source on GitHub

Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.
