AmrDeveloper/GQL

GitQL is an extensible SQL-like query language and SDK for querying various data sources such as .git files. It supports most SQL features, including grouping, ordering, aggregation, and window functions, and allows customization through user-defined types and functions.

Mixed

Single-maintainer risk — review before adopting

Use as dependency: Mixed (weakest axis)

top contributor handles 99% of recent commits; no tests detected

Fork & modify: Healthy

Has a license, tests, and CI — clean foundation to fork and modify.

Learn from: Healthy

Documented and popular — useful reference codebase to read through.

Deploy as-is: Healthy

No critical CVEs, sane security posture — runnable as-is.

  • Last commit 2w ago
  • 2 active contributors
  • MIT licensed
  • CI configured
  • Small team — 2 contributors active in recent commits
  • Single-maintainer risk — top contributor 99% of recent commits
  • No test directory detected
What would change the summary?
  • Use as dependency: Mixed → Healthy if commit ownership is diversified (top contributor <90%)

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Forkable" badge

Paste into your README — live-updates from the latest cached analysis.

[![RepoPilot: Forkable](https://repopilot.app/api/badge/amrdeveloper/gql?axis=fork)](https://repopilot.app/r/amrdeveloper/gql)

Paste at the top of your README.md — renders inline like a shields.io badge.

Onboarding: AmrDeveloper/GQL

Generated by RepoPilot · 2026-05-09 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

  1. Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
  2. Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
  3. Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/AmrDeveloper/GQL shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

🎯Verdict

WAIT — Single-maintainer risk — review before adopting

  • Last commit 2w ago
  • 2 active contributors
  • MIT licensed
  • CI configured
  • ⚠ Small team — 2 contributors active in recent commits
  • ⚠ Single-maintainer risk — top contributor 99% of recent commits
  • ⚠ No test directory detected

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live AmrDeveloper/GQL repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/AmrDeveloper/GQL.

What it runs against: a local clone of AmrDeveloper/GQL — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in AmrDeveloper/GQL | Confirms the artifact applies here, not a fork |
| 2 | License is still MIT | Catches relicense before you depend on it |
| 3 | Default branch master exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 46 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>AmrDeveloper/GQL</code></summary>
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of AmrDeveloper/GQL. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/AmrDeveloper/GQL.git
#   cd GQL
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of AmrDeveloper/GQL and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "AmrDeveloper/GQL(\.git)?\b" \
  && ok "origin remote is AmrDeveloper/GQL" \
  || miss "origin remote is not AmrDeveloper/GQL (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "^(MIT)" LICENSE 2>/dev/null \
   || grep -qiE "\"license\"\s*:\s*\"MIT\"" package.json 2>/dev/null) \
  && ok "license is MIT" \
  || miss "license drift — was MIT at generation time"

# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \
  && ok "default branch master exists" \
  || miss "default branch master no longer exists"

# 4. Critical files exist
test -f "crates/gitql-engine/src/engine.rs" \
  && ok "crates/gitql-engine/src/engine.rs" \
  || miss "missing critical file: crates/gitql-engine/src/engine.rs"
test -f "crates/gitql-parser/src/parser.rs" \
  && ok "crates/gitql-parser/src/parser.rs" \
  || miss "missing critical file: crates/gitql-parser/src/parser.rs"
test -f "crates/gitql-ast/src/lib.rs" \
  && ok "crates/gitql-ast/src/lib.rs" \
  || miss "missing critical file: crates/gitql-ast/src/lib.rs"
test -f "crates/gitql-core/src/values/mod.rs" \
  && ok "crates/gitql-core/src/values/mod.rs" \
  || miss "missing critical file: crates/gitql-core/src/values/mod.rs"
test -f "crates/gitql-engine/src/data_provider.rs" \
  && ok "crates/gitql-engine/src/data_provider.rs" \
  || miss "missing critical file: crates/gitql-engine/src/data_provider.rs"

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 46 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~16d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/AmrDeveloper/GQL"
  exit 1
fi

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>

TL;DR

GitQL is a SQL-like query language and in-memory engine built in Rust that lets you query .git repository data (commits, branches, tags, refs) using familiar SQL syntax with advanced features like GROUP BY, window functions, and aggregation. The project ships both as a CLI tool (gitql-cli) and a modular SDK (gitql-core, gitql-parser, gitql-engine) that allows extending the query engine with custom types, functions, and data providers.

The repository is a monorepo with 6 specialized crates under crates/: gitql-ast defines the expression/query AST types (expression.rs, statement.rs, query.rs) and a rich type system in types/; gitql-parser converts text to AST; gitql-engine executes queries; gitql-core provides base traits; gitql-std supplies built-in functions; gitql-cli wraps the engine for terminal use. The workspace Cargo.toml manages dependency versions centrally through workspace.dependencies.

👥Who it's for

Git developers and DevOps engineers who want to analyze repository metadata (commit history, author stats, branch structures) without writing custom scripts; SDK users building domain-specific query tools on non-git data sources who need a customizable SQL engine.

🌱Maturity & risk

Actively developed but pre-1.0: version 0.43.0 signals that the API can still change, while the project has CI/CD workflows (ci.yaml, release.yaml, docs.yaml), a comprehensive type system with 15+ built-in types, and cleanly structured crates. The workspace spans 6 interconnected crates with clear separation of concerns. Active maintenance is evident from the versioning and release infrastructure.

Low-to-moderate risk: a single maintainer (AmrDeveloper) concentrates maintenance risk, although Rust's type system mitigates memory-safety issues. Dependencies are well curated (gix 0.80 for git operations, chrono for dates, regex), but pre-1.0 versioning means breaking changes can land in minor versions. The absence of a visible test suite in the file structure is a red flag for regressions.

Active areas of work

Cannot determine exact current work from file list alone, but CHANGELOG.md and RELEASING.md suggest active release cycles. The presence of comprehensive CI workflows (docs.yaml, release.yaml) indicates ongoing maintenance and documentation updates. Version progression (0.42 → 0.43) suggests incremental feature addition or polish.

🚀Get running

git clone https://github.com/AmrDeveloper/GQL.git
cd GQL
cargo build --release
cargo run --bin gitql-cli

Then pipe SQL queries: echo "SELECT * FROM commits" | ./target/release/gitql-cli

Daily commands:

  • Dev mode: cargo build && cargo run --bin gitql-cli
  • Release mode: cargo build --release
  • Run tests: cargo test --workspace
  • Run benchmarks: cargo bench (uses criterion, config in benches/benchmarks.rs)
  • REPL: available via the gitql-cli binary

🗺️Map of the codebase

  • crates/gitql-engine/src/engine.rs — Core query execution engine that orchestrates parsing, evaluation, and result processing—the main entry point for all SQL-like queries on git data
  • crates/gitql-parser/src/parser.rs — Lexer and parser that transforms GitQL query strings into AST—critical for understanding how queries are tokenized and validated
  • crates/gitql-ast/src/lib.rs — Abstract syntax tree definitions for expressions, queries, and statements—foundational data structures used throughout the engine
  • crates/gitql-core/src/values/mod.rs — Runtime value system that represents all data types (integers, text, dates, composites) during query execution
  • crates/gitql-engine/src/data_provider.rs — Trait definition for pluggable data sources (git repos, files, etc.)—enables extensibility of GitQL to new data sources
  • crates/gitql-cli/src/lib.rs — CLI entry point that handles arguments, invokes the engine, and formats output—user-facing interface to the entire system
  • Cargo.toml — Workspace configuration defining all member crates (core, engine, parser, CLI, AST, stdlib) and shared dependencies

🛠️How to make changes

Add a Custom Data Source

  1. Define a new struct implementing the DataProvider trait from engine/src/data_provider.rs (crates/gitql-engine/src/data_provider.rs)
  2. Implement get_schema() to return column names and types for your data source (crates/gitql-core/src/schema.rs)
  3. Implement scan() to return an iterator of rows as Object instances (crates/gitql-core/src/object.rs)
  4. Register your provider in the engine's data source map before executing queries (crates/gitql-engine/src/engine.rs)
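
The four steps above can be sketched in miniature. This is a toy model only: the `Value`, `Row`, `DataProvider`, `FilesProvider`, and `Registry` definitions below are hypothetical stand-ins written for illustration, and only the method names `get_schema` and `scan` are taken from the steps. The real trait in crates/gitql-engine/src/data_provider.rs may have a different shape, so verify it before copying anything.

```rust
// Hypothetical, simplified stand-ins. These are NOT the real gitql-engine or
// gitql-core definitions; they only mirror the shape of the steps above.
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Int(i64),
    Text(String),
}

/// One row of a table (a rough counterpart of an "Object" in the steps).
pub type Row = Vec<Value>;

/// Hypothetical provider trait with the two methods named in the steps.
pub trait DataProvider {
    /// Column names for a table, or None if the table is unknown.
    fn get_schema(&self, table: &str) -> Option<Vec<&'static str>>;
    /// An iterator over all rows of a table.
    fn scan(&self, table: &str) -> Box<dyn Iterator<Item = Row> + '_>;
}

/// Step 1: a struct for the new source, here an in-memory list of files.
pub struct FilesProvider {
    files: Vec<(String, i64)>,
}

impl FilesProvider {
    pub fn new(files: Vec<(String, i64)>) -> Self {
        Self { files }
    }
}

impl DataProvider for FilesProvider {
    // Step 2: describe the columns.
    fn get_schema(&self, table: &str) -> Option<Vec<&'static str>> {
        match table {
            "files" => Some(vec!["path", "size"]),
            _ => None,
        }
    }

    // Step 3: produce rows lazily.
    fn scan(&self, table: &str) -> Box<dyn Iterator<Item = Row> + '_> {
        if table != "files" {
            return Box::new(std::iter::empty::<Row>());
        }
        Box::new(
            self.files
                .iter()
                .map(|(path, size)| vec![Value::Text(path.clone()), Value::Int(*size)]),
        )
    }
}

/// Step 4: a toy registry that maps table names to providers.
#[derive(Default)]
pub struct Registry {
    providers: HashMap<String, Box<dyn DataProvider>>,
}

impl Registry {
    pub fn register(&mut self, table: &str, provider: Box<dyn DataProvider>) {
        self.providers.insert(table.to_string(), provider);
    }

    /// Equivalent of `SELECT * FROM <table>` in this toy model.
    pub fn select_all(&self, table: &str) -> Vec<Row> {
        match self.providers.get(table) {
            Some(provider) => provider.scan(table).collect(),
            None => Vec::new(),
        }
    }
}
```

Returning a boxed iterator from `scan` keeps the provider free to stream rows instead of materializing them, which is one plausible way to limit memory use in an in-memory engine.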

Add a Built-in Function

  1. Create a new function struct implementing the GitQLFunction trait in gitql-std (crates/gitql-std/src/lib.rs)
  2. Define the function signature (parameters, return type) in the function's signature() method (crates/gitql-core/src/signature.rs)
  3. Implement execute() to handle argument evaluation and return a Value result (crates/gitql-core/src/values/mod.rs)
  4. Register the function in the standard library initialization within gitql-std (crates/gitql-std/src/lib.rs)
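
As a rough illustration of the steps above, the sketch below models a function with a signature check, an `execute` method, and a registration step. All names here (`QueryFunction`, `TextLen`, `FunctionRegistry`, the `text_len` function) are hypothetical; the actual trait and registration mechanism in gitql-std and gitql-core are not shown in this document and should be checked in the source.

```rust
// Hypothetical, simplified model of "define, implement, register".
// None of these types are the real gitql-core or gitql-std API.
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Int(i64),
    Text(String),
    Null,
}

/// Hypothetical function trait: a name, a fixed arity, and an executor.
pub trait QueryFunction {
    fn name(&self) -> &'static str;
    fn arity(&self) -> usize;
    fn execute(&self, args: &[Value]) -> Result<Value, String>;
}

/// Example function: number of characters in a text value.
pub struct TextLen;

impl QueryFunction for TextLen {
    fn name(&self) -> &'static str {
        "text_len"
    }

    fn arity(&self) -> usize {
        1
    }

    fn execute(&self, args: &[Value]) -> Result<Value, String> {
        match args {
            [Value::Text(s)] => Ok(Value::Int(s.chars().count() as i64)),
            [Value::Null] => Ok(Value::Null), // NULL in, NULL out
            _ => Err("text_len expects one Text argument".to_string()),
        }
    }
}

/// Toy registry standing in for the standard library initialization.
pub struct FunctionRegistry {
    functions: HashMap<&'static str, Box<dyn QueryFunction>>,
}

impl FunctionRegistry {
    pub fn with_builtins() -> Self {
        let mut registry = Self { functions: HashMap::new() };
        registry.register(Box::new(TextLen));
        registry
    }

    pub fn register(&mut self, function: Box<dyn QueryFunction>) {
        self.functions.insert(function.name(), function);
    }

    /// Looks up a function, checks the argument count, then executes it.
    pub fn call(&self, name: &str, args: &[Value]) -> Result<Value, String> {
        let function = self
            .functions
            .get(name)
            .ok_or_else(|| format!("unknown function: {name}"))?;
        if args.len() != function.arity() {
            return Err(format!(
                "{name} expects {} argument(s), got {}",
                function.arity(),
                args.len()
            ));
        }
        function.execute(args)
    }
}
```

Checking arity in the registry rather than in each function keeps the per-function code focused on type handling, at the cost of not supporting variadic functions in this toy version.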

Add Support for a New Query Clause

  1. Add a new AST node variant to represent the clause in gitql-ast/src/query.rs (crates/gitql-ast/src/query.rs)
  2. Update the parser in gitql-parser to recognize and parse the new syntax (crates/gitql-parser/src/parser.rs)
  3. Create a new module in gitql-engine/src/ (e.g., engine_newfeature.rs) with execution logic (crates/gitql-engine/src/engine.rs)
  4. Integrate the new executor into engine.rs execute() pipeline in the correct order (crates/gitql-engine/src/engine.rs)
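
The AST, parser, and executor changes described above follow a common pattern that a small, self-contained example can show. The sketch assumes a hypothetical `OFFSET` clause added next to `LIMIT`; the `Clause` enum, `parse_clauses`, and `execute` are illustrative names and are much simpler than anything in gitql-ast, gitql-parser, or gitql-engine.

```rust
// Toy illustration of adding a clause end to end. Hypothetical names only.

/// Step 1: the AST gains a new variant for the clause.
#[derive(Debug, Clone, PartialEq)]
pub enum Clause {
    Limit(usize),
    Offset(usize), // the "new" clause in this sketch
}

/// Step 2: the parser recognizes the new keyword.
/// Keywords are lowercased first, so matching is case-insensitive.
pub fn parse_clauses(input: &str) -> Result<Vec<Clause>, String> {
    let tokens: Vec<&str> = input.split_whitespace().collect();
    let mut clauses = Vec::new();
    let mut i = 0;
    while i < tokens.len() {
        let keyword = tokens[i].to_ascii_lowercase();
        let arg = tokens
            .get(i + 1)
            .ok_or_else(|| format!("{keyword} needs a number"))?;
        let n: usize = arg
            .parse()
            .map_err(|_| format!("invalid number: {arg}"))?;
        match keyword.as_str() {
            "limit" => clauses.push(Clause::Limit(n)),
            "offset" => clauses.push(Clause::Offset(n)),
            other => return Err(format!("unknown clause: {other}")),
        }
        i += 2;
    }
    Ok(clauses)
}

/// Steps 3 and 4: execution logic, applied in a fixed pipeline order.
/// OFFSET runs before LIMIT regardless of the order in the query text.
pub fn execute(rows: Vec<i64>, clauses: &[Clause]) -> Vec<i64> {
    let offset = clauses
        .iter()
        .find_map(|c| match c {
            Clause::Offset(n) => Some(*n),
            _ => None,
        })
        .unwrap_or(0);
    let limit = clauses
        .iter()
        .find_map(|c| match c {
            Clause::Limit(n) => Some(*n),
            _ => None,
        })
        .unwrap_or(usize::MAX);
    rows.into_iter().skip(offset).take(limit).collect()
}
```

The point of the fixed order in `execute` is the same as in step 4: the position of a clause in the execution pipeline, not in the query text, determines the result.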

Add a New Output Format

  1. Create a new printer struct in crates/gitql-cli/src/printer/yourformat_printer.rs (crates/gitql-cli/src/printer/mod.rs)
  2. Implement the Printer trait with print() method to serialize rows (crates/gitql-cli/src/printer/mod.rs)
  3. Register the new printer in the dispatcher based on CLI arguments (crates/gitql-cli/src/lib.rs)
  4. Add argument parsing support (e.g., --format=yourformat) in arguments.rs (crates/gitql-cli/src/arguments.rs)
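
A printer of the kind described above can be approximated as follows. The `Printer` trait, `CsvPrinter`, and `printer_for` dispatcher are hypothetical names chosen for this sketch; the real trait in crates/gitql-cli/src/printer/mod.rs may take different arguments and write to a stream rather than return a string.

```rust
// Hypothetical printer sketch. Not the real gitql-cli Printer trait.

/// Assumed trait shape: serialize a header and rows into a string.
pub trait Printer {
    fn print(&self, header: &[&str], rows: &[Vec<String>]) -> String;
}

pub struct CsvPrinter;

/// Quotes a field when it contains a comma, quote, or line break,
/// doubling embedded quotes as CSV conventionally requires.
fn csv_field(field: &str) -> String {
    if field.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r')) {
        format!("\"{}\"", field.replace('"', "\"\""))
    } else {
        field.to_string()
    }
}

impl Printer for CsvPrinter {
    fn print(&self, header: &[&str], rows: &[Vec<String>]) -> String {
        let mut out = String::new();
        let head: Vec<String> = header.iter().map(|h| csv_field(h)).collect();
        out.push_str(&head.join(","));
        out.push('\n');
        for row in rows {
            let cells: Vec<String> = row.iter().map(|c| csv_field(c)).collect();
            out.push_str(&cells.join(","));
            out.push('\n');
        }
        out
    }
}

/// Toy dispatcher keyed on a format name, standing in for the CLI
/// argument handling mentioned in steps 3 and 4.
pub fn printer_for(format: &str) -> Option<Box<dyn Printer>> {
    match format {
        "csv" => Some(Box::new(CsvPrinter)),
        _ => None,
    }
}
```

Escaping is handled in one helper so that every cell goes through the same path, which is the main safeguard against malformed output when commit messages contain commas or quotes.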

🔧Why these technologies

  • Rust — Memory-safe, compiled performance critical for in-memory query engine processing large git repositories
  • gix (git library) — Pure Rust git implementation providing low-level access to .git file structures without shelling out to git CLI
  • Trait-based design (DataProvider, GitQLFunction) — Enables extensibility for custom data sources and user-defined functions without modifying core engine
  • dyn-clone for dynamic dispatch — Allows heterogeneous Value types and function overloads to coexist at runtime in collections

⚖️Trade-offs already made

  • In-memory query execution

    • Why: Simpler implementation, no disk I/O overhead for small-to-medium git repos
    • Consequence: Memory usage scales with dataset size; not suitable for massive repositories; no streaming/pipelined execution
  • Single-threaded execution pipeline

    • Why: Simplifies data flow and eliminates synchronization complexity
    • Consequence: Cannot parallelize across cores; slower on multi-core systems; good enough for typical git queries (<1min latency)
  • Dynamic typing at runtime via Value trait

    • Why: Mimics SQL's flexibility where types are determined at query time, supports user-defined types
    • Consequence: Runtime type errors instead of compile-time safety; overhead of dynamic dispatch on every operation
  • AST-based interpretation (no compilation to bytecode)

    • Why: Faster development cycle, simpler debugging, direct mapping to query semantics
    • Consequence: Slower execution than compiled bytecode; repeated traversal of same query tree in loops
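
Two of the trade-offs above, runtime typing and AST interpretation, are easier to see in a tiny tree-walking evaluator. The `Value`, `Expr`, `eval`, and `eval_all` definitions below are hypothetical and far smaller than the real engine; they only show why type errors surface per row at run time and why the same tree is traversed once per row.

```rust
// Minimal tree-walking interpreter sketch. Hypothetical types, not gitql's.

#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Int(i64),
    Text(String),
}

#[derive(Debug, Clone)]
pub enum Expr {
    Literal(Value),
    Column(usize),
    Add(Box<Expr>, Box<Expr>),
}

/// Evaluates an expression against one row by walking the tree.
/// Type mismatches are only discovered here, at run time.
pub fn eval(expr: &Expr, row: &[Value]) -> Result<Value, String> {
    match expr {
        Expr::Literal(v) => Ok(v.clone()),
        Expr::Column(i) => row
            .get(*i)
            .cloned()
            .ok_or_else(|| format!("no column {i}")),
        Expr::Add(left, right) => match (eval(left, row)?, eval(right, row)?) {
            (Value::Int(a), Value::Int(b)) => Ok(Value::Int(a + b)),
            (a, b) => Err(format!("cannot add {a:?} and {b:?}")),
        },
    }
}

/// The same tree is re-traversed for every row, which is the repeated
/// traversal cost listed as a consequence of AST interpretation.
pub fn eval_all(expr: &Expr, rows: &[Vec<Value>]) -> Vec<Result<Value, String>> {
    rows.iter().map(|row| eval(expr, row)).collect()
}
```

A pre-execution type check, such as the format checker mentioned elsewhere in this document, is the usual way to move errors like the one in the `Add` arm ahead of execution.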

🚫Non-goals (don't propose these)

  • Real-time query execution on live-changing repositories
  • Network-distributed query processing across multiple machines
  • ACID transaction support or write operations (read-only queries)
  • Query optimization via

🪤Traps & gotchas

  • No hidden env vars are required: the tool works on any .git directory.
  • gix requires the max-performance feature enabled in Cargo.toml for git operations; stripping it breaks blob diffing.
  • Parser case-insensitivity: all SQL keywords are lowercased before parsing, which can be surprising when working directly with the AST.
  • Type system flexibility: composite types and dynamic types in the type system mean runtime type errors are possible despite Rust's compile-time checks; validation happens in format_checker.rs.

💡Concepts to learn

  • Abstract Syntax Tree (AST) — GitQL's entire architecture pivots on AST: parser converts SQL text to Expression/Statement enums (expression.rs, statement.rs), then engine evaluates the tree recursively; understanding AST structure is prerequisite for adding operators or language features
  • Type Coercion & Format Checking — gitql-ast/src/format_checker.rs validates that expressions produce correct types before execution (e.g., SUM only works on numeric columns); this is how a dynamically-typed query language achieves type safety in Rust
  • Trait-based Extensibility (Rust Patterns) — SDK allows custom types and functions via trait implementations; dyn-clone enables runtime polymorphism for user-defined types without knowing their concrete type at compile time
  • Window Functions — Listed as supported feature but undocumented; window functions (ROW_NUMBER, RANK, LAG/LEAD over PARTITION BY) allow advanced analytics on row sequences without explicit GROUP BY
  • Interval Types (Temporal Arithmetic) — gitql-ast/src/interval.rs implements SQL INTERVAL literals (INTERVAL '1 year 2 days'); native support for temporal math is rare in query engines and useful for git analysis (e.g., commits in last N months)
  • Git Object Graph (via gix) — GitQL queries .git files directly using gix library instead of spawning git CLI; understanding refs, commits, branches as git objects in the object database is implicit to the data provider design
  • Variadic Functions & Function Overloading — gitql-ast/src/types/varargs.rs enables functions with variable argument counts; allows single function name like COUNT(*) or SUM(col) without explicit overload signatures
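
Since the Window Functions concept above is described as undocumented, a small model of what `ROW_NUMBER() OVER (PARTITION BY author ORDER BY timestamp)` computes may help. The `Commit` struct and `row_number_by_author` function are hypothetical and do not reflect how gitql-engine implements windows; they only show the semantics.

```rust
// Semantics sketch for ROW_NUMBER over a partition. Hypothetical names.
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Commit {
    pub author: String,
    pub timestamp: i64,
}

/// Returns, for each input row, its 1-based position within its author's
/// partition when that partition is ordered by timestamp.
/// The output is aligned with the input order: no rows are collapsed,
/// which is the key difference from GROUP BY.
pub fn row_number_by_author(commits: &[Commit]) -> Vec<usize> {
    // Group row indices by the partition key.
    let mut partitions: HashMap<&str, Vec<usize>> = HashMap::new();
    for (index, commit) in commits.iter().enumerate() {
        partitions
            .entry(commit.author.as_str())
            .or_default()
            .push(index);
    }

    let mut numbers = vec![0; commits.len()];
    for indices in partitions.values_mut() {
        // Stable sort, so ties keep their original relative order.
        indices.sort_by_key(|&i| commits[i].timestamp);
        for (rank, &i) in indices.iter().enumerate() {
            numbers[i] = rank + 1;
        }
    }
    numbers
}
```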

Related repos

  • stevehoover/1st-CLaaS — Alternative SQL-like DSL for git analysis, demonstrates domain-specific language design for version control data
  • libgit2/libgit2 — Lower-level git bindings; GitQL's data provider abstracts over gix (git2 alternative)
  • jqlang/jq — Query language precedent for composable, functional approach to data extraction (though jq targets JSON)
  • datafuselabs/databend — Another Rust SQL query engine emphasizing extensibility and custom types; architectural inspiration for pluggable data providers
  • gitpython/GitPython — Pythonic git interaction library; GitQL serves as compiled SQL alternative for scripting git data analysis

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add comprehensive integration tests for SQL query execution across all data sources

The repo has a test infrastructure (benches/benchmarks.rs exists) but lacks visible integration tests for the query engine. With gitql-engine, gitql-parser, and gitql-ast crates handling SQL parsing and execution, there should be integration tests covering: SELECT/WHERE/GROUP BY/ORDER BY operations, window functions, user-defined types, and various git data sources. This ensures correctness as new features are added and prevents regressions.

  • [ ] Create crates/gitql-engine/tests/ directory with integration test files
  • [ ] Add tests for basic SELECT queries against git repositories
  • [ ] Add tests for GROUP BY and aggregation functions
  • [ ] Add tests for window functions (which are mentioned as supported)
  • [ ] Add tests for JOIN operations and complex WHERE conditions
  • [ ] Update the CI workflow (ci.yaml) to run these tests

Add missing type system tests for gitql-ast types module

The gitql-ast crate has extensive type definitions (crates/gitql-ast/src/types/ with 14+ type files: integer, float, text, date, datetime, time, array, composite, variant, etc.) but there are no visible unit tests validating type coercion, conversion, validation, and operations. This is critical for a type-safe query language to ensure type safety guarantees.

  • [ ] Create crates/gitql-ast/src/types/tests.rs or crates/gitql-ast/tests/types/
  • [ ] Add tests for type coercion (e.g., integer to float, text to date)
  • [ ] Add tests for Optional and Variant type handling
  • [ ] Add tests for Array and Composite type operations
  • [ ] Add tests for Interval and Range type arithmetic
  • [ ] Add tests for the interval.rs format checker validation

Implement structured diagnostics with source span information for parse errors

The crate has diagnostic_reporter.rs in gitql-cli but the parser errors likely lack precise source location tracking. Given the complexity of SQL parsing (gitql-parser crate), users need clear error messages showing exactly where in their query the error occurred with line/column numbers and visual indicators, similar to Rust's compiler diagnostics.

  • [ ] Update gitql-ast/src/lib.rs and expression.rs to track source span (line, column, length) in AST nodes
  • [ ] Enhance error types in gitql-parser to include Span information
  • [ ] Extend crates/gitql-cli/src/diagnostic_reporter.rs to display colored source context with error pointers
  • [ ] Add tests in crates/gitql-parser/tests/ validating error messages include correct spans
  • [ ] Document the error reporting format in CONTRIBUTING.md

🌿Good first issues

  • Add comprehensive unit tests for crates/gitql-ast/src/types/ — each type module (integer.rs, text.rs, date.rs, etc.) lacks visible test coverage despite being core to correctness
  • Document the custom type system by adding examples in README.md showing how to create a user-defined type via SDK (the composite.rs and dynamic.rs types are powerful but undocumented)
  • Add SQL examples for window functions (OVER, PARTITION BY, ROW_NUMBER) to README.md samples section — these are listed as supported but no examples shown

📝Recent commits

  • 3a76cfe — Migrate to the latest rand version (AmrDeveloper)
  • d9db510 — Updte GitQL & GitQL SDK Versions (AmrDeveloper)
  • a8ff934 — Support != operator between Raw expressions (AmrDeveloper)
  • ffe3825 — Support = operator between Raw expressions (AmrDeveloper)
  • 80cd9ca — Update gix to 0.80.0 (AmrDeveloper)
  • ab3ba7b — Fix out of index panic when groups length is zero (AmrDeveloper)
  • 117c3d3 — Migrate ti gix 0.78.0 (AmrDeveloper)
  • 04b0b3e — Handling optional commit author and committer (AmrDeveloper)
  • e77b943 — Migrate to gix 0.77.0 (AmrDeveloper)
  • 51be999 — Revise setup documentation (#148) (muzimuzhi)

🔒Security observations

  • Low · Rust 2024 Edition Requires a Recent Toolchain — Cargo.toml (line: edition = "2024"). The Cargo.toml specifies edition = "2024". This is a valid edition, stabilized in Rust 1.85 (the valid editions are 2015, 2018, 2021, and 2024), so it is not a security vulnerability. The only practical effect is that the project fails to compile on toolchains older than 1.85. Fix: No change to the edition is needed; build with Rust 1.85 or newer.
  • Medium · Potential Command Injection via Git Query Language — crates/gitql-parser, crates/gitql-engine, crates/gitql-cli/src/arguments.rs. GitQL is a SQL-like query language that parses and executes queries on git repositories. While the codebase appears to use the 'gix' crate for git operations, there is a risk of command injection or query injection if user input is not properly validated and sanitized before being parsed and executed. The parser (gitql-parser crate) needs careful review for injection vulnerabilities. Fix: Implement strict input validation and sanitization for all user-provided queries. Use parameterized/prepared statement-like patterns where applicable. Conduct thorough security testing with fuzzing and injection attack payloads. Add input length limits and complexity limits to prevent DoS attacks.
  • Medium · Dependency Version Flexibility Without Pinning — Cargo.toml (workspace.dependencies section). The Cargo.toml declares workspace dependencies with default caret requirements (e.g., gix = "0.80.0", which Cargo treats as ^0.80.0 and which therefore admits compatible 0.80.x releases but not 0.81). While this is common practice, updates within that range, including updates to transitive dependencies, can still pull in vulnerable versions. No lockfile security auditing is evident in the CI/CD configuration shown. Fix: Run 'cargo audit' regularly in CI/CD pipelines to detect known vulnerabilities. Consider using 'cargo-deny' for stricter dependency governance. Implement Dependabot or similar automated dependency scanning. Review Cargo.lock regularly for security updates.
  • Medium · Unsafe Rust Usage Not Evident — crates/gitql-core/src, crates/gitql-engine (inferred). The file structure does not show explicit security-critical operations, but the use of the 'gix' crate for git operations and file parsing could involve unsafe code. Without reviewing the actual implementation, potential memory safety issues from unsafe blocks cannot be ruled out. Fix: Audit all unsafe code blocks in the codebase. Minimize unsafe code usage. Add inline documentation explaining why unsafe is necessary. Use tools like 'cargo-clippy' and 'miri' for additional safety checks. Consider using safe alternatives where possible.
  • Medium · File System Access Without Explicit Security Boundaries — crates/gitql-cli/src/arguments.rs, crates/gitql-core/src. The CLI tool (gitql-cli) accepts arbitrary repository paths and performs git operations on them. There is potential for path traversal or unauthorized access to sensitive files if input validation is insufficient. Fix: Implement strict path validation using canonicalization. Restrict access to only git repositories within allowed directories. Validate that paths resolve to expected locations. Consider implementing a sandboxing mechanism or privilege separation if processing untrusted repositories.
  • Low · Missing Security Policy — Repository root (missing SECURITY.md). The repository structure shows CONTRIBUTING.md and CODE_OF_CONDUCT.md but no SECURITY.md file for reporting security vulnerabilities. This makes it difficult for security researchers to responsibly disclose vulnerabilities. Fix: Create a SECURITY.md file with instructions for responsible vulnerability disclosure, including a private reporting mechanism and expected response timeline. Reference it in the README.
  • Low · Output Format Injection Risk — crates/gitql-cli/src/printer/csv_printer.rs, crates/gitql-cli/src/printer/json_printer.rs, crates/gitql-cli/src/printer/yaml_printer.rs. The CLI supports multiple output formats (CSV, JSON, YAML, Table) via gitql-cli/src/printer modules. Improper escaping or formatting could lead to injection attacks depending on how output is serialized. Fix: Ensure all output serializers use well-established, maintained

LLM-derived; treat as a starting point, not a security audit.
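
The path canonicalization recommendation in the file system access item above could look roughly like this. It is a generic standard-library sketch under the assumption that repositories should live under one allowed root; the function name `resolve_within` is hypothetical and nothing here reflects how gitql-cli currently handles paths.

```rust
// Generic path-containment check using only the standard library.
// Hypothetical helper, not part of gitql-cli.
use std::io;
use std::path::{Path, PathBuf};

/// Resolves `candidate` relative to `root` and returns the canonical path
/// only if it stays inside `root`. Symlinks and `..` components are
/// resolved by `canonicalize`, so they cannot be used to escape.
/// Returns an error if either path does not exist.
pub fn resolve_within(root: &Path, candidate: &Path) -> io::Result<Option<PathBuf>> {
    let root = root.canonicalize()?;
    let resolved = root.join(candidate).canonicalize()?;
    if resolved.starts_with(&root) {
        Ok(Some(resolved))
    } else {
        Ok(None)
    }
}
```

Note that `Path::join` replaces the base when `candidate` is absolute, so an absolute path outside the root is rejected by the same `starts_with` check.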


Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.