01mf02/jaq
A jq clone focussed on correctness, speed, and simplicity
Healthy across all four use cases
Permissive license, no critical CVEs, actively maintained: safe to depend on.
Has a license, tests, and CI — clean foundation to fork and modify.
Documented and popular — useful reference codebase to read through.
No critical CVEs, sane security posture — runnable as-is.
- ✓ Last commit 3w ago
- ✓ 2 active contributors
- ✓ MIT licensed
- ✓ CI configured
- ✓ Tests present
- ⚠ Small team: 2 contributors active in recent commits
- ⚠ Single-maintainer risk: top contributor 87% of recent commits
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Onboarding doc
Onboarding: 01mf02/jaq
Generated by RepoPilot · 2026-05-09 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale: STOP and ask the user to regenerate it before proceeding.
- Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/01mf02/jaq shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across all four use cases
- Last commit 3w ago
- 2 active contributors
- MIT licensed
- CI configured
- Tests present
- ⚠ Small team — 2 contributors active in recent commits
- ⚠ Single-maintainer risk — top contributor 87% of recent commits
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live 01mf02/jaq
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/01mf02/jaq.
What it runs against: a local clone of 01mf02/jaq — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in 01mf02/jaq | Confirms the artifact applies here, not a fork |
| 2 | License is still MIT | Catches relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 52 days ago | Catches sudden abandonment since generation |
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of 01mf02/jaq. If you don't
# have one yet, run these first:
#
# git clone https://github.com/01mf02/jaq.git
# cd jaq
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }
# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
echo "FAIL: not inside a git repository. cd into your clone of 01mf02/jaq and re-run."
exit 2
fi
# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "01mf02/jaq(\.git)?\b" \
  && ok "origin remote is 01mf02/jaq" \
  || miss "origin remote is not 01mf02/jaq (artifact may be from a fork)"
# 2. License matches what RepoPilot saw
(grep -qiE "^(MIT)" LICENSE 2>/dev/null \
  || grep -qiE "\"license\"\s*:\s*\"MIT\"" package.json 2>/dev/null) \
  && ok "license is MIT" \
  || miss "license drift — was MIT at generation time"
# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
  && ok "default branch main exists" \
  || miss "default branch main no longer exists"
# 4. Critical files exist
test -f "jaq-core/src/lib.rs" \
  && ok "jaq-core/src/lib.rs" \
  || miss "missing critical file: jaq-core/src/lib.rs"
test -f "jaq-core/src/compile.rs" \
  && ok "jaq-core/src/compile.rs" \
  || miss "missing critical file: jaq-core/src/compile.rs"
test -f "jaq-core/src/load/mod.rs" \
  && ok "jaq-core/src/load/mod.rs" \
  || miss "missing critical file: jaq-core/src/load/mod.rs"
test -f "jaq-core/src/data.rs" \
  && ok "jaq-core/src/data.rs" \
  || miss "missing critical file: jaq-core/src/data.rs"
test -f "jaq-core/src/funs.rs" \
  && ok "jaq-core/src/funs.rs" \
  || miss "missing critical file: jaq-core/src/funs.rs"
# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 52 ]; then
ok "last commit was $days_since_last days ago (artifact saw ~22d)"
else
miss "last commit was $days_since_last days ago — artifact may be stale"
fi
echo
if [ "$fail" -eq 0 ]; then
echo "artifact verified (0 failures) — safe to trust"
else
echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/01mf02/jaq"
exit 1
fi
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
⚡TL;DR
jaq is a Rust-based clone of jq that focuses on correctness, speed, and simplicity while maintaining compatibility with jq programs. Unlike jq, it supports additional data formats (YAML, CBOR, TOML, XML) and provides both a CLI tool and a library (jaq-core) that can be safely used in multi-threaded environments with arbitrary data types beyond JSON.
The repository is a monorepo (Cargo workspace) with seven member crates: jaq-core (compiler/evaluator engine), jaq-std (standard library), jaq-json (JSON support), jaq-fmts (format handlers), jaq-all (aggregator), jaq (CLI), and jaq-play (web playground). Core logic lives in jaq-core, format-specific code in jaq-fmts, and documentation in docs/ as djot markup.
👥Who it's for
DevOps engineers and data processing specialists who need a drop-in replacement for jq with better startup performance (avoiding jq 1.6's ~50ms overhead) and support for non-JSON formats; Rust developers who need a jq compiler/evaluator library with thread-safe APIs and custom value type support.
🌱Maturity & risk
Production-ready and actively maintained. The codebase has 465k+ lines of Rust with comprehensive CI/CD (GitHub Actions workflows for check, test, MSRV, docs, release), semantic versioning (v3.0.0 released), and a published playground. The project demonstrates maturity through documentation, benchmarking infrastructure, and multi-format support.
Single-author maintenance risk (01mf02) with limited visible contributor diversity, though the stable release versioning and active CI suggest healthy release cadence. Breaking changes possible between major versions (v2→v3 existed), but the project's correctness-first philosophy and small codebase (relative to jq's C implementation) reduce defect risk. Monitor commit recency and issue triage responsiveness.
Active areas of work
Active development on format support and correctness improvements. Visible workflows for release automation, MSRV validation (Rust 1.69+), documentation generation, and playground deployment. The presence of cli-tests.jq and comprehensive benchmarks (examples/benches/) suggests ongoing performance tracking and compatibility validation.
🚀Get running
git clone https://github.com/01mf02/jaq.git
cd jaq
cargo build --release
./target/release/jaq --help
Daily commands:
cargo run --bin jaq -- '.foo'
cargo test
cargo bench # using benches in examples/benches/
./bench.sh # shell-based benchmarking against jq
🗺️Map of the codebase
- jaq-core/src/lib.rs: Core library entry point; defines the public API for parsing, compiling, and executing jq filters
- jaq-core/src/compile.rs: Filter compilation pipeline that transforms parsed AST into executable bytecode; essential for understanding execution model
- jaq-core/src/load/mod.rs: Parser and lexer orchestration; handles jq program tokenization and AST construction
- jaq-core/src/data.rs: Core data type definitions and JSON representation; foundational for all filter operations
- jaq-core/src/funs.rs: Built-in function implementations (map, select, group_by, etc.); core library of jq operations
- jaq-all/src/lib.rs: Workspace aggregation layer integrating all modules (core, std, json, formats) into unified library
- Cargo.toml: Workspace configuration defining all member crates and shared dependency versions
🛠️How to make changes
Add a new built-in filter function
- Define the function signature in jaq-core/src/funs.rs in the main match statement handling filter names (jaq-core/src/funs.rs)
- Implement the filter logic using the Value type and return a Box<dyn Iterator<Item=Result<Value>>> (jaq-core/src/funs.rs)
- If needed, add the function name to the lexer keyword recognition in jaq-core/src/load/lex.rs (jaq-core/src/load/lex.rs)
- Add test cases in the function's doc-comment or create integration tests in docs/tests.jq (docs/tests.jq)
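The shape described in these steps, a name-based dispatch that returns a boxed iterator of results, can be sketched in isolation. This is a minimal illustration using only the standard library; `Value`, `Error`, `Results`, `each`, and `call_builtin` are hypothetical stand-ins and not jaq-core's actual types or functions, so check jaq-core/src/funs.rs for the real signatures before copying anything.

```rust
// Hypothetical stand-ins: these are NOT jaq-core's real types.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Num(f64),
    Arr(Vec<Value>),
}

pub type Error = String;

/// A boxed, lazily evaluated stream of outputs, each of which may fail.
pub type Results<'a> = Box<dyn Iterator<Item = Result<Value, Error>> + 'a>;

/// A toy built-in: yields each element of an array,
/// or a single error for any other kind of input.
pub fn each(input: Value) -> Results<'static> {
    match input {
        Value::Arr(items) => Box::new(items.into_iter().map(Ok::<Value, Error>)),
        other => Box::new(std::iter::once(Err(format!(
            "cannot iterate over {:?}",
            other
        )))),
    }
}

/// Dispatch on the filter name (assumed shape of the "main match statement").
/// Returns `None` when the name is not a known built-in.
pub fn call_builtin(name: &str, input: Value) -> Option<Results<'static>> {
    match name {
        "each" => Some(each(input)),
        _ => None,
    }
}
```

Returning an iterator instead of a `Vec` lets a filter produce zero, one, or many outputs without allocating them all up front, and errors travel in-band as `Err` items.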
Add a new data format (e.g., Protobuf)
- Create a new crate jaq-proto with format parsing/serialization code (Cargo.toml)
- Add the crate as a workspace member in the root Cargo.toml (Cargo.toml)
- Implement From<ProtoType> -> Value trait in jaq-all/src/data.rs to convert Protobuf to jaq values (jaq-all/src/data.rs)
- Register the format in jaq-all/src/load.rs with format detection logic (jaq-all/src/load.rs)
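The conversion step can be pictured as an ordinary `From` implementation. The sketch below is self-contained and purely illustrative: `Value` and `ProtoField` are hypothetical types invented for this example, not the types found in jaq-all or in any Protobuf crate.

```rust
/// Hypothetical target value type (NOT jaq's real value type).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Bool(bool),
    Int(i64),
    Str(String),
    Arr(Vec<Value>),
}

/// Hypothetical parsed form of the new format.
#[derive(Debug, Clone, PartialEq)]
pub enum ProtoField {
    Absent,
    Flag(bool),
    Varint(i64),
    Text(String),
    Repeated(Vec<ProtoField>),
}

impl From<ProtoField> for Value {
    fn from(field: ProtoField) -> Self {
        match field {
            ProtoField::Absent => Value::Null,
            ProtoField::Flag(b) => Value::Bool(b),
            ProtoField::Varint(n) => Value::Int(n),
            ProtoField::Text(s) => Value::Str(s),
            // Repeated fields convert element by element, recursively.
            ProtoField::Repeated(items) => {
                Value::Arr(items.into_iter().map(Value::from).collect())
            }
        }
    }
}
```

With this in place, callers can write `Value::from(field)` or `field.into()`, and the rest of the pipeline only ever sees the common value type.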
Optimize a filter's execution performance
- Identify the filter in jaq-core/src/funs.rs where the performance bottleneck occurs (jaq-core/src/funs.rs)
- Consider if the filter can be optimized during compilation in jaq-core/src/compile.rs via pattern matching (jaq-core/src/compile.rs)
- Add fold/reduce optimizations in jaq-core/src/fold.rs if applicable for iterative operations (jaq-core/src/fold.rs)
- Add benchmark case in examples/benches/ and verify improvements with bench.sh (bench.sh)
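One generic example of a compile-time optimization done by pattern matching is constant folding. The sketch below shows the idea on a tiny, hypothetical expression type (`Expr` and `fold_constants` are made up for illustration); whether jaq-core's compiler performs this particular rewrite is not something this document confirms.

```rust
/// Hypothetical expression tree (NOT jaq-core's AST).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Input,                     // the filter input, `.`
    Num(i64),                  // a literal
    Add(Box<Expr>, Box<Expr>), // `l + r`
}

/// Folds `Num + Num` subtrees bottom-up.
/// Anything that depends on the input is left unchanged,
/// and so are sums that would overflow at compile time.
pub fn fold_constants(expr: Expr) -> Expr {
    match expr {
        Expr::Add(l, r) => match (fold_constants(*l), fold_constants(*r)) {
            (Expr::Num(a), Expr::Num(b)) if a.checked_add(b).is_some() => Expr::Num(a + b),
            (l, r) => Expr::Add(Box::new(l), Box::new(r)),
        },
        other => other,
    }
}
```

For instance, `(1 + 2) + .` becomes `3 + .`: the constant part is computed once at compile time instead of once per input.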
Extend the jq language with a new operator or syntax
- Add the token type to jaq-core/src/load/lex.rs and update tokenization logic (jaq-core/src/load/lex.rs)
- Extend the Filter AST in jaq-core/src/filter.rs with a new variant for the operator (jaq-core/src/filter.rs)
- Update the parser in jaq-core/src/load/mod.rs to recognize and construct the new syntax (jaq-core/src/load/mod.rs)
- Implement compilation logic in jaq-core/src/compile.rs to generate bytecode for the new operator (jaq-core/src/compile.rs)
- Document the feature in docs/advanced.dj or docs/corelang.dj and add tests in docs/tests.jq (docs/tests.jq)
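To see why a new operator touches the lexer, the AST, the parser, and the compiler, it can help to look at a miniature pipeline with the same stages. The sketch below is a toy over integers, assuming nothing about jaq-core's real code: `Token`, `Ast`, `lex`, `parse`, `compile`, and `run` are hypothetical names, and the "compiled" form is a closure tree standing in for whatever executable representation the real compiler emits. Here `*` plays the role of the newly added operator.

```rust
// Hypothetical miniature pipeline; none of these names come from jaq-core.
#[derive(Debug, Clone, PartialEq)]
pub enum Token { Dot, Num(i64), Plus, Star }

#[derive(Debug, PartialEq)]
pub enum Ast {
    Input, // `.`
    Num(i64),
    Add(Box<Ast>, Box<Ast>),
    Mul(Box<Ast>, Box<Ast>), // the "new" operator: token + variant + parse rule + compile arm
}

/// Stage 1: lexing.
pub fn lex(src: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        match c {
            ' ' => { chars.next(); }
            '.' => { chars.next(); tokens.push(Token::Dot); }
            '+' => { chars.next(); tokens.push(Token::Plus); }
            '*' => { chars.next(); tokens.push(Token::Star); }
            '0'..='9' => {
                // Small literals only; overflow handling is omitted in this sketch.
                let mut n = 0i64;
                while let Some(d) = chars.peek().and_then(|c| c.to_digit(10)) {
                    n = n * 10 + d as i64;
                    chars.next();
                }
                tokens.push(Token::Num(n));
            }
            other => return Err(format!("unexpected character {:?}", other)),
        }
    }
    Ok(tokens)
}

/// Stage 2: parsing, with `*` binding tighter than `+`.
pub fn parse(tokens: &[Token]) -> Result<Ast, String> {
    let mut pos = 0;
    let ast = parse_sum(tokens, &mut pos)?;
    if pos == tokens.len() { Ok(ast) } else { Err(format!("trailing token at index {}", pos)) }
}

fn parse_sum(t: &[Token], pos: &mut usize) -> Result<Ast, String> {
    let mut lhs = parse_product(t, pos)?;
    while t.get(*pos) == Some(&Token::Plus) {
        *pos += 1;
        lhs = Ast::Add(Box::new(lhs), Box::new(parse_product(t, pos)?));
    }
    Ok(lhs)
}

fn parse_product(t: &[Token], pos: &mut usize) -> Result<Ast, String> {
    let mut lhs = parse_atom(t, pos)?;
    while t.get(*pos) == Some(&Token::Star) {
        *pos += 1;
        lhs = Ast::Mul(Box::new(lhs), Box::new(parse_atom(t, pos)?));
    }
    Ok(lhs)
}

fn parse_atom(t: &[Token], pos: &mut usize) -> Result<Ast, String> {
    let ast = match t.get(*pos) {
        Some(Token::Dot) => Ast::Input,
        Some(Token::Num(n)) => Ast::Num(*n),
        other => return Err(format!("expected `.` or a number, found {:?}", other)),
    };
    *pos += 1;
    Ok(ast)
}

/// Stage 3: "compilation" into a closure tree that can be run many times.
pub fn compile(ast: &Ast) -> Box<dyn Fn(i64) -> i64> {
    match ast {
        Ast::Input => Box::new(|x| x),
        Ast::Num(n) => { let n = *n; Box::new(move |_| n) }
        Ast::Add(l, r) => { let (l, r) = (compile(l), compile(r)); Box::new(move |x| l(x) + r(x)) }
        Ast::Mul(l, r) => { let (l, r) = (compile(l), compile(r)); Box::new(move |x| l(x) * r(x)) }
    }
}

/// Convenience wrapper: lex, parse, compile, then run on one input.
pub fn run(src: &str, input: i64) -> Result<i64, String> {
    Ok(compile(&parse(&lex(src)?)?)(input))
}
```

In this toy, adding `*` required one change per stage: a `Star` token, a `Mul` variant, a `parse_product` rule, and a `Mul` arm in `compile`. That is the same four-place pattern the checklist above describes.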
🔧Why these technologies
- Rust with iterators as core primitives — Enables lazy evaluation, memory efficiency, and zero-copy streaming; type-safe bytecode execution without garbage collection overhead
- Bytecode compilation model (AST → bytecode → interpreter) — Balances fast startup (no JIT delay) with reasonable execution speed; enables compile-time optimizations and error checking
- Multi-format support (JSON, YAML, CBOR, TOML, XML) via jaq-all aggregator — Differentiates from jq; allows single codebase to handle modern data interchange formats without rewriting filters
- Workspace structure (jaq-core + jaq-std + jaq-json + jaq-fmts) with composition — Decouples concerns; allows library consumers to depend on minimal core without dragging in all format support; enables modular reuse
⚖️Trade-offs already made
- Iterator-based streaming execution instead of collecting intermediate arrays
  - Why: Essential for processing multi-GB JSON files without memory explosion; jq itself struggles here
  - Consequence: Some operations (e.g., multi-pass algorithms) require explicit materialization; code is more complex but memory-efficient
- Compiled bytecode instead of direct AST interpretation
  - Why: Measurably faster execution (see benchmarks); compile cost is negligible for typical filter reuse
  - Consequence: Added compilation phase; error messages may be harder to map back to source; bytecode not portable
- Value type abstraction in jaq-all instead of raw JSON
  - Why: Supports arbitrary data types (YAML tags, CBOR types, XML attributes); maintains jq semantic compatibility
  - Consequence: Extra indirection in hot path; requires trait implementations for new types; slightly slower than raw JSON-only clone
- No mutable state / functional-only filter semantics
  - Why: Matches jq; enables safe multi-threaded use and deterministic reproducibility
  - Consequence: Cannot implement some imperative patterns; update operators must return new values instead of mutating
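The first trade-off, streaming through iterators instead of collecting intermediate arrays, is easy to demonstrate with plain Rust iterators. The following is a small sketch using only the standard library; `even_times_ten` and `inputs_pulled_for_first` are hypothetical helper names chosen for this example, and the pipeline merely imitates a jq program like `select(. % 2 == 0) | . * 10`.

```rust
use std::cell::Cell;

/// Lazily keeps even inputs and multiplies them by ten.
/// Nothing is computed until the returned iterator is pulled.
pub fn even_times_ten<'a>(
    inputs: impl Iterator<Item = u64> + 'a,
) -> impl Iterator<Item = u64> + 'a {
    inputs.filter(|n| n % 2 == 0).map(|n| n * 10)
}

/// Requests only the first `k` outputs from an infinite input stream
/// and reports how many inputs actually had to be read to produce them.
pub fn inputs_pulled_for_first(k: usize) -> (Vec<u64>, u64) {
    let pulled = Cell::new(0u64);
    // An unbounded source: collecting it into an array first would never finish.
    let source = (1u64..).inspect(|_| pulled.set(pulled.get() + 1));
    let out: Vec<u64> = even_times_ten(source).take(k).collect();
    (out, pulled.get())
}
```

Asking for the first two outputs reads only the inputs 1 through 4 and yields 20 and 40. An eager design would have to materialize the whole input before filtering, which is impossible for an unbounded stream and costly for a multi-GB file.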
🚫Non-goals (don't propose these)
- Real-time streaming transformation with bidirectional filters (jq is request-response only)
- Interactive debugger or step-through execution (beyond error reporting)
- Drop-in replacement for jq runtime C API (library only, not lib
🪤Traps & gotchas
- MSRV is Rust 1.69+; older toolchains will fail silently (.github/workflows/msrv.yml enforces this)
- djot markup used for docs (not Markdown); requires understanding djot syntax to modify docs/*.dj files
- jaq-fmts requires optional features for each format (may need --features yaml,cbor,toml,xml when building)
- Benchmark comparisons in bench.sh assume jq binary is in PATH; will silently skip jq comparison if not present
- Monorepo member crates have independent versioning; jaq-core and jaq-std are v3.0.0 but jaq-fmts is v0.1.0 (early stage)
🏗️Architecture
💡Concepts to learn
- AST to Bytecode Compilation — jaq parses jq filter syntax into an AST, then compiles to an efficient bytecode format for execution; understanding this two-stage pipeline is essential for debugging parser/compiler bugs and adding new operators
- Trait-Based Value Abstraction (ValT) — jaq-core's ValT trait allows arbitrary custom data types to be processed by jq filters, not just JSON; this is key to jaq's library design and multi-format support
- Iterator-Based Evaluation Model — jaq evaluates filters lazily using Rust iterators rather than materializing all intermediate results; critical for memory efficiency on large datasets and understanding performance characteristics
- Format Abstraction Layer (jaq-fmts) — Decouples input/output format handling (YAML, CBOR, TOML, XML) from core filter logic; allows adding new formats without modifying jaq-core or jaq-std
- Workspace Crate Organization — The seven-crate workspace (jaq-core, jaq-std, jaq-json, jaq-fmts, jaq-all, jaq, jaq-play) enforces separation of concerns; understanding crate boundaries is essential for knowing where to make changes
- Compatibility Testing with jq — jaq validates correctness by comparing output against reference jq via shell tests (docs/shelltest.rs) and benchmark scripts (bench.sh); this regression detection is critical for maintaining the 'correctness' goal
- djot Markup Format — Project documentation is written in djot (docs/*.dj), not Markdown; contributors need to understand djot syntax to write or update docs accurately
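The trait-based value abstraction listed above can be illustrated with a much smaller trait than the real one. In this sketch, `Val`, `Json`, `Bytes`, and `length` are hypothetical names invented for the example; jaq-core's actual `ValT` trait has a different and larger interface, so treat this only as a picture of the idea: write a filter once against a trait, then run it on any value type that implements it.

```rust
/// Hypothetical value trait (NOT jaq-core's `ValT`).
pub trait Val: Sized + Clone {
    /// Builds a value of this type from an integer.
    fn from_int(n: i64) -> Self;
    /// Number of elements or characters, or an error if the value has none.
    fn size(&self) -> Result<usize, String>;
}

/// A JSON-like value type.
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Num(i64),
    Str(String),
    Arr(Vec<Json>),
}

impl Val for Json {
    fn from_int(n: i64) -> Self { Json::Num(n) }
    fn size(&self) -> Result<usize, String> {
        match self {
            Json::Str(s) => Ok(s.chars().count()),
            Json::Arr(a) => Ok(a.len()),
            Json::Num(_) => Err("number has no length".to_string()),
        }
    }
}

/// A second value type, e.g. for a binary format.
#[derive(Debug, Clone, PartialEq)]
pub enum Bytes {
    Blob(Vec<u8>),
    Int(i64),
}

impl Val for Bytes {
    fn from_int(n: i64) -> Self { Bytes::Int(n) }
    fn size(&self) -> Result<usize, String> {
        match self {
            Bytes::Blob(b) => Ok(b.len()),
            Bytes::Int(_) => Err("integer has no length".to_string()),
        }
    }
}

/// A `length`-like filter written once against the trait.
pub fn length<V: Val>(input: &V) -> Result<V, String> {
    input.size().map(|n| V::from_int(n as i64))
}
```

The same `length` function works for `Json` and `Bytes` without knowing either type, which is the property that lets a filter engine stay independent of the data formats built on top of it.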
🔗Related repos
- jqlang/jq: The canonical C implementation of jq; jaq aims for behavioral compatibility while improving speed and safety
- stedolan/jq: Original jq repository (now archived/superseded by jqlang); reference implementation for correctness validation
- mwilliamson/python-jq: Python bindings to jq; alternative language ecosystem but solving same JSON transformation problem
- TomWright/dasel: Multi-format (JSON/YAML/TOML/XML) query tool in Go; direct competitor with similar format support goals
- beerus-cpp/jq: Another jq clone (C++ based) for performance; relevant for understanding alternative implementation approaches
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add comprehensive shelltest cases for jaq CLI argument handling and edge cases
The repo has docs/cli-tests.jq and docs/shelltest.rs infrastructure for testing CLI behavior, but there's no visible dedicated test file for argument parsing, error handling, and format conversion edge cases. Given that jaq supports multiple formats (YAML, CBOR, TOML, XML) and is positioned as a jq drop-in replacement, comprehensive CLI tests would catch regressions in argument handling, format auto-detection, and error messages. This directly supports the 'correctness' focus stated in the repo description.
- [ ] Review existing docs/cli-tests.jq and docs/shelltest.rs to understand the testing framework
- [ ] Create docs/cli-edge-cases.jq with test cases for: invalid format flags, format conversion errors, stdin/file input combinations, and argument parsing edge cases
- [ ] Add tests for format auto-detection with ambiguous inputs across jaq-fmts (YAML, CBOR, TOML, XML)
- [ ] Run tests via GitHub Actions and ensure they pass before submitting PR
Add integration tests for jaq-fmts format converters (YAML, CBOR, TOML, XML)
The jaq-fmts crate exists as a workspace member but there are no visible dedicated integration tests validating format parsing and output. The examples/ directory has cbor-examples.xhtml and ferris.csv/tsv, but no structured tests for round-trip conversions or format-specific edge cases (e.g., YAML anchors, CBOR tags, TOML date handling). This is critical for a tool marketed as supporting multiple formats.
- [ ] Create tests/format_conversions.rs in the jaq-fmts crate to test each format
- [ ] Add test data files in tests/fixtures/ for: valid YAML (with anchors), CBOR (with tags), TOML (with dates), and XML (with namespaces)
- [ ] Test round-trip conversions: JSON → Format → JSON for each format
- [ ] Test error handling for malformed inputs in each format
- [ ] Integrate tests into the CI workflow (likely .github/workflows/test.yml)
Document and add tests for jaq-core library API with real-world examples
The README mentions jaq-core as a library for 'compile and run jq programs inside of Rust programs' but the docs/ directory (intro.dj, stdlib.dj, etc.) focus on CLI usage and jq language features, not the Rust API. Given this is positioned as a key library feature, adding concrete examples and integration tests would help Rust developers understand how to embed jaq. The docs/README.md mentions a Makefile for building docs, suggesting a structured docs system.
- [ ] Review jaq-core crate's public API in Cargo.toml and src/ to identify key entry points (likely compile() and run() functions)
- [ ] Create docs/library-usage.dj with practical Rust examples: compiling a filter, running it on JSON input, handling errors, and streaming results
- [ ] Add tests/lib_integration.rs to jaq-core with runnable examples (parsing filters, executing them, validating output types)
- [ ] Update jaq-all/jaq crate Cargo.toml to ensure doc examples are tested via cargo test --doc
🌿Good first issues
- Add integration tests for YAML/CBOR/TOML/XML round-trip I/O in jaq-fmts/tests/ to match coverage of jaq-json/tests/; currently jaq-fmts lacks dedicated test files
- Expand docs/corelang.dj with examples for lesser-documented builtins (scan, splits, ltrimstr, rtrimstr) and add links to corresponding test cases in docs/tests.jq
- Add performance regression benchmarks to examples/benches/ for functions recently modified (e.g., group_by, sort_by) to catch regressions before release
📝Recent commits
- 9616019: Merge pull request #429 from 01mf02/docs-love (01mf02)
- 3a27237: Update section on security audits. (01mf02)
- 0f9bfa7: Correct link. (01mf02)
- 8cbbe06: Merge pull request #423 from 01mf02/compound-non-key-heads (01mf02)
- 86c4e75: More examples, more description. (01mf02)
- 487103c: Adapt explanation of compound paths. (01mf02)
- 15d4d9d: Remove outdated incompatibility claims. (01mf02)
- 7822c87: Disable ligatures in code. (01mf02)
- edeeabd: Support compound paths with non-key heads. (01mf02)
- 8cdd767: Merge pull request #419 from 01mf02/prepare-3.0 (01mf02)
🔒Security observations
The jaq codebase demonstrates generally good security practices. No critical vulnerabilities were identified in the static analysis. The project uses Rust, which provides memory safety guarantees. The presence of fuzzing infrastructure (jaq-core/fuzz) indicates security-conscious development. Main concerns are typical for data processing tools: ensuring robust input validation across multiple formats (JSON, YAML, CBOR, TOML, XML) and maintaining up-to-date dependencies. No hardcoded credentials, exposed secrets, or obvious injection vulnerabilities were detected in the visible structure. GitHub Actions workflows suggest active CI/CD security practices.
- Low · Release Profile Strip Configuration (Cargo.toml, [profile.release] section). The release profile in Cargo.toml uses 'strip = true' which removes debugging symbols from binaries. While this reduces binary size, it may complicate security incident response and debugging in production environments. Fix: Consider the trade-offs between binary size and debuggability. For security-sensitive deployments, consider keeping debug symbols or providing separate debug symbol files for incident response.
- Low · Workspace Resolver Version (Cargo.toml, resolver = '2'). The codebase uses workspace resolver version 2, which is relatively modern. However, ensure all dependencies are regularly audited and updated to patch known vulnerabilities. Fix: Implement regular dependency audits using 'cargo audit' in CI/CD pipelines and keep dependencies up-to-date with security patches.
- Low · Input Validation Concerns for JSON Processing (jaq-json, jaq-fmts modules). As a jq clone that processes JSON/YAML/CBOR/TOML/XML, the tool handles multiple data formats. Without examining the actual parsing code, there's potential risk for malformed input handling, billion laughs attacks (XML expansion), or other denial-of-service vectors. Fix: Ensure all input parsers have proper limits on: recursion depth, entity expansion (for XML), string lengths, and number sizes. Implement fuzzing tests (already present in jaq-core/fuzz).
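As an illustration of what a recursion-depth limit means in practice, here is a small standalone pre-check for JSON-like text. It is a sketch under stated assumptions: `check_depth` is a hypothetical helper written for this example, it is not taken from jaq-json or jaq-fmts, and it says nothing about how (or whether) those crates limit nesting today.

```rust
/// Returns the maximum bracket nesting depth of JSON-like text,
/// or an error as soon as `max_depth` is exceeded.
/// Brackets inside string literals are treated as data, not structure.
pub fn check_depth(input: &str, max_depth: usize) -> Result<usize, String> {
    let (mut depth, mut deepest) = (0usize, 0usize);
    let (mut in_string, mut escaped) = (false, false);
    for (offset, c) in input.char_indices() {
        if in_string {
            match (escaped, c) {
                (true, _) => escaped = false,        // character after a backslash
                (false, '\\') => escaped = true,     // start of an escape sequence
                (false, '"') => in_string = false,   // closing quote
                _ => {}
            }
            continue;
        }
        match c {
            '"' => in_string = true,
            '[' | '{' => {
                depth += 1;
                if depth > max_depth {
                    return Err(format!(
                        "nesting deeper than {} at byte {}",
                        max_depth, offset
                    ));
                }
                deepest = deepest.max(depth);
            }
            // Unbalanced closers are ignored here; a real parser would reject them.
            ']' | '}' => depth = depth.saturating_sub(1),
            _ => {}
        }
    }
    Ok(deepest)
}
```

Because the scan is a single loop with a counter, it cannot itself overflow the stack, which is the point of checking depth before handing input to a recursive parser.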
LLM-derived; treat as a starting point, not a security audit.
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.