filhodanuvem/gitql

Item: filhodanuvem/gitql
Rating: 5
Author: RepoPilot

💊 A git query language

Healthy

Healthy across all four use cases


Use as dependency: Healthy

Permissive license, no critical CVEs, actively maintained — safe to depend on.

Fork & modify: Healthy

Has a license, tests, and CI — clean foundation to fork and modify.

Learn from: Healthy

Documented and popular — useful reference codebase to read through.

Deploy as-is: Healthy

No critical CVEs, sane security posture — runnable as-is.

✓12 active contributors
✓Distributed ownership (top contributor 46% of recent commits)
✓MIT licensed


✓CI configured
✓Tests present
⚠Stale — last commit 2y ago

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Healthy" badge

Paste into your README — live-updates from the latest cached analysis.


[![RepoPilot: Healthy](https://repopilot.app/api/badge/filhodanuvem/gitql)](https://repopilot.app/r/filhodanuvem/gitql)

Paste at the top of your README.md — renders inline like a shields.io badge.

▸Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/filhodanuvem/gitql on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: filhodanuvem/gitql

Generated by RepoPilot · 2026-05-09 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/filhodanuvem/gitql shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

🎯Verdict

GO — Healthy across all four use cases

12 active contributors
Distributed ownership (top contributor 46% of recent commits)
MIT licensed
CI configured
Tests present
⚠ Stale — last commit 2y ago

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

✅Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live filhodanuvem/gitql repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/filhodanuvem/gitql.

What it runs against: a local clone of filhodanuvem/gitql — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in filhodanuvem/gitql | Confirms the artifact applies here, not a fork |
| 2 | License is still MIT | Catches relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 600 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>filhodanuvem/gitql</code></summary>

#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of filhodanuvem/gitql. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/filhodanuvem/gitql.git
#   cd gitql
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of filhodanuvem/gitql and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "filhodanuvem/gitql(\.git)?\b" \
  && ok "origin remote is filhodanuvem/gitql" \
  || miss "origin remote is not filhodanuvem/gitql (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "^(MIT)" LICENSE 2>/dev/null \
   || grep -qiE "\"license\"\s*:\s*\"MIT\"" package.json 2>/dev/null) \
  && ok "license is MIT" \
  || miss "license drift — was MIT at generation time"

# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
  && ok "default branch main exists" \
  || miss "default branch main no longer exists"

# 4. Critical files exist
test -f "main.go" \
  && ok "main.go" \
  || miss "missing critical file: main.go"
test -f "lexical/lexical.go" \
  && ok "lexical/lexical.go" \
  || miss "missing critical file: lexical/lexical.go"
test -f "parser/parser.go" \
  && ok "parser/parser.go" \
  || miss "missing critical file: parser/parser.go"
test -f "runtime/runtime.go" \
  && ok "runtime/runtime.go" \
  || miss "missing critical file: runtime/runtime.go"
test -f "semantical/semantical.go" \
  && ok "semantical/semantical.go" \
  || miss "missing critical file: semantical/semantical.go"

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 600 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~570d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/filhodanuvem/gitql"
  exit 1
fi

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>

⚡TL;DR

Gitql is a SQL-like query language and interpreter for Git repositories. It lets users query commit history, authors, dates, and messages with SELECT statements instead of git log flags. It tokenizes each query, builds an AST, performs semantic validation, and executes against live Git data via go-git/v5 without requiring a separate database. Key capability: select hash, author, message from commits where author = 'cloudson' limit 3 returns the filtered commit data in tabular format.

The project builds into a single binary: lexical/ (tokenizer state machine), parser/ (SQL AST builder), semantical/ (type checking), runtime/ (Git execution via go-git, using the visitor pattern for query execution), and utilities/ (helpers). main.go ties these together with a CLI built on urfave/cli, and autocomplete.go adds the readline REPL. No separate packages: everything lives in the repo root or its subdirectories.

👥Who it's for

Git power users and DevOps engineers who want to analyze repository metadata (commit patterns, author activity, code change history) using familiar SQL syntax rather than learning complex git log options and grep chains. Also appeals to developers building tooling around Git analytics.

🌱Maturity & risk

Experimental but functional. The README explicitly warns that this is the author's first Go project and advises against using it as a code guideline. CI/CD is set up (.github/workflows/ci.yml, tag.yml) with Go 1.20 support and a goreleaser config. The provided metadata shows no public issue backlog or recent commits, which suggests low active maintenance. Usable in production for read-only queries, but not heavily battle-tested.

Single-maintainer project (filhodanuvem/gitql) with no recent activity metrics in the provided data. The dependency surface is moderate (~20 transitive deps via go-git and CLI libs), and go-git itself is well-maintained. Main risk: SQL parser/interpreter complexity in a self-taught codebase could hide edge cases, and semantic validation (semantical/) is minimal. Breaking changes are unlikely since the tool is read-only and stable, but feature requests may languish.

Active areas of work

Not visible from provided metadata (no recent commit hashes, PR list, or milestones shown). Repo appears stable/dormant rather than actively developed. CI workflows are in place, suggesting maintenance mode rather than active feature work.

🚀Get running

git clone https://github.com/filhodanuvem/gitql.git
cd gitql
go build .
./gitql "select hash, author, message from commits limit 3"

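Daily commands: go run . "your query", or go build . && ./gitql "your query" (go run . compiles every file in the root package, including autocomplete.go, which go run main.go would skip). REPL mode: run ./gitql with no arguments (uses readline for history). Queries default to a 10-row limit; see the README for examples.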

🗺️Map of the codebase

main.go — Entry point that initializes the CLI and orchestrates the query execution pipeline.
lexical/lexical.go — Tokenizes raw query strings into lexemes; foundational to parsing any GitQL query.
parser/parser.go — Converts token stream into AST; core compiler stage that all queries must pass through.
runtime/runtime.go — Executes AST against git repository; bridges parsed queries to actual git operations.
semantical/semantical.go — Validates AST semantic correctness before execution; prevents invalid queries from running.
runtime/commits.go — Implements commit filtering and querying logic; handles the most common data source in GitQL.

🧩Components & responsibilities

Lexical Analyzer (lexical/lexical.go, lexemes.go, states.go) — Converts raw query string into stream of recognized tokens (keywords, identifiers, operators).
- Failure mode: Unrecognized token → parse error before AST construction
Parser (parser/parser.go, ast.go) — Builds Abstract Syntax Tree from token stream using recursive descent; enforces grammar.
- Failure mode: Syntax error (missing clause, wrong token order) → parse exception
Semantic Validator (semantical/semantical.go, visitor.go) — Walks AST to verify table names, column existence, type compatibility, function signatures.
- Failure mode: Invalid column reference, type mismatch → semantic error before execution
Runtime Executor (runtime/runtime.go, visitor.go, commits.go) — Evaluates validated AST against git repository; applies WHERE filters, ORDER BY, formats results.
- Failure mode: I/O error reading git objects → runtime exception; queries on nonexistent repo fail
Git Data Provider (runtime/commits.go, reference.go, remotes.go, go-git v5.6.1) — Fetches commits, branches, tags, remotes from .git directory using go-git library.
- Failure mode: Corrupted git repo, missing .git → go-git panics or returns empty results
CLI Interface — Parses command-line arguments, routes to runtime, formats and displays output.
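
To make the Lexical Analyzer's role and its failure mode concrete, here is a minimal sketch of a hand-written tokenizer. The `TokenKind`, `Token`, and `Lex` names are hypothetical and do not come from gitql; the real lexer in lexical/ is organized differently. The sketch only shows the general technique: scan rune by rune, emit tokens, and return an error on an unrecognized character before any AST is built.

```go
package main

import (
	"fmt"
	"unicode"
)

// TokenKind is a hypothetical token category; gitql's real lexemes differ.
type TokenKind int

const (
	Ident TokenKind = iota // keywords and identifiers, e.g. select, hash
	Number
	String // single-quoted literal, stored without the quotes
	Symbol // single-character operators and punctuation
)

// Token pairs a kind with the text it covers.
type Token struct {
	Kind TokenKind
	Text string
}

// Lex scans the input one rune at a time and picks a scanning state
// from the first rune of each token.
func Lex(input string) ([]Token, error) {
	var tokens []Token
	runes := []rune(input)
	i := 0
	for i < len(runes) {
		r := runes[i]
		switch {
		case unicode.IsSpace(r):
			i++
		case unicode.IsLetter(r) || r == '_':
			start := i
			for i < len(runes) && (unicode.IsLetter(runes[i]) || unicode.IsDigit(runes[i]) || runes[i] == '_') {
				i++
			}
			tokens = append(tokens, Token{Ident, string(runes[start:i])})
		case unicode.IsDigit(r):
			start := i
			for i < len(runes) && unicode.IsDigit(runes[i]) {
				i++
			}
			tokens = append(tokens, Token{Number, string(runes[start:i])})
		case r == '\'':
			start := i + 1
			i++
			for i < len(runes) && runes[i] != '\'' {
				i++
			}
			if i == len(runes) {
				return nil, fmt.Errorf("unterminated string starting at offset %d", start-1)
			}
			tokens = append(tokens, Token{String, string(runes[start:i])})
			i++ // skip the closing quote
		case r == ',' || r == '=' || r == '*' || r == '(' || r == ')' || r == '<' || r == '>':
			tokens = append(tokens, Token{Symbol, string(r)})
			i++
		default:
			// Failure mode: an unrecognized token stops the pipeline here.
			return nil, fmt.Errorf("unrecognized character %q at offset %d", r, i)
		}
	}
	return tokens, nil
}

func main() {
	tokens, err := Lex("select hash, author from commits where author = 'cloudson' limit 3")
	if err != nil {
		fmt.Println("lex error:", err)
		return
	}
	for _, t := range tokens {
		fmt.Printf("%d %q\n", t.Kind, t.Text)
	}
}
```

A parser built on top of this would consume the token slice and reject wrong token order, which corresponds to the Parser failure mode listed above.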

🛠️How to make changes

Add a new queryable table/data source

Define the new data structure and retrieval method in runtime/ (e.g., runtime/branches.go)
Register the table in semantical/visitor.go to validate column access
Add execution logic in runtime/visitor.go to handle the new table in FROM clauses
Write integration tests in test/select.bats to verify the new table works
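
The registration step above can be pictured as a lookup from table name to allowed columns. The sketch below is an assumption about the general shape of such a check, not gitql's actual code: the `tables` map, the column names, and `ValidateSelect` are all hypothetical, and the real registration in semantical/visitor.go may look quite different.

```go
package main

import "fmt"

// tables is a hypothetical registry of queryable tables and their columns.
// The column names are illustrative, not taken from gitql's source.
var tables = map[string][]string{
	"commits":  {"hash", "author", "message", "date"},
	"branches": {"name", "hash"}, // the newly added table in this sketch
}

// ValidateSelect rejects unknown tables and unknown columns before execution.
// A "*" column is accepted for any known table.
func ValidateSelect(table string, columns []string) error {
	known, ok := tables[table]
	if !ok {
		return fmt.Errorf("unknown table %q", table)
	}
	allowed := make(map[string]bool, len(known))
	for _, c := range known {
		allowed[c] = true
	}
	for _, c := range columns {
		if c == "*" {
			continue
		}
		if !allowed[c] {
			return fmt.Errorf("table %q has no column %q", table, c)
		}
	}
	return nil
}

func main() {
	fmt.Println(ValidateSelect("branches", []string{"name", "hash"}))
	fmt.Println(ValidateSelect("branches", []string{"author"}))
	fmt.Println(ValidateSelect("tags", []string{"name"}))
}
```

Keeping the table definition in one place like this means a forgotten registration surfaces as a semantic error rather than a runtime failure.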

Add a new SQL clause (e.g., GROUP BY, HAVING)

Add lexeme tokens for the new keyword in lexical/lexemes.go
Extend the parser to recognize and build AST nodes for the clause in parser/parser.go
Define AST node types in parser/ast.go to represent the clause structure
Add semantic validation logic in semantical/visitor.go
Implement execution logic in runtime/visitor.go to apply the clause

Add a new filter/comparison operator

Define the operator token in lexical/lexemes.go
Update the parser to recognize the operator in expressions in parser/parser.go
Implement the comparison logic in runtime/visitor.go where WHERE clause evaluation occurs
Add test cases in test/select.bats demonstrating the operator
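
As a rough illustration of the comparison-logic step, the sketch below dispatches on an operator value when evaluating a WHERE condition. The `Operator` type, its constants, and the `contains` operator are hypothetical and chosen only for the example; gitql's actual operator set and evaluation code in runtime/visitor.go are not reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// Operator is a hypothetical comparison operator produced by the parser.
type Operator string

const (
	OpEqual    Operator = "="
	OpNotEqual Operator = "!="
	OpContains Operator = "contains" // the newly added operator in this sketch
)

// Compare evaluates one WHERE condition on string operands.
// Unknown operators return an error instead of silently evaluating to false.
func Compare(op Operator, left, right string) (bool, error) {
	switch op {
	case OpEqual:
		return left == right, nil
	case OpNotEqual:
		return left != right, nil
	case OpContains:
		return strings.Contains(left, right), nil
	default:
		return false, fmt.Errorf("unsupported operator %q", op)
	}
}

func main() {
	match, err := Compare(OpContains, "fix lexer state bug", "lexer")
	fmt.Println(match, err)
}
```

Returning an error from the default branch makes a missing case visible the first time a new token reaches the runtime without matching evaluation logic.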

🔧Why these technologies

Go 1.20+ — Compiled language with excellent CLI tooling and git library ecosystem; strong performance for text processing.
go-git (v5.6.1) — Pure Go git implementation; enables querying git repositories without requiring git binary on system.
chzyer/readline — Provides interactive REPL-like experience with history and line editing for shell-like interface.
olekukonko/tablewriter — Formats query results as aligned ASCII tables suitable for terminal output.
urfave/cli/v2 — Standardized Go CLI framework for flag parsing and command routing.

⚖️Trade-offs already made

Compiler architecture: Lexer → Parser → Semantic Validator → Runtime Executor
- Why: Clear separation of concerns enables maintainability and debugging at each stage; type safety via AST.
- Consequence: Multi-stage pipeline has overhead but catches errors early; not designed for real-time streaming.
Pure Go implementation with go-git instead of shelling to git binary
- Why: Eliminates external dependency; enables cross-platform binary distribution.
- Consequence: Limited to go-git's capabilities; may lag behind git CLI features; all operations run in-process rather than in a separate git subprocess.
Terminal table output with tablewriter instead of JSON/streaming
- Why: Human-friendly display for interactive CLI use.
- Consequence: Not optimized for piping to other tools; large result sets consume memory before display.

🚫Non-goals (don't propose these)

Real-time query streaming or lazy evaluation
Multi-threaded parallel query execution across multiple repos
Full SQL compliance (subset of SQL tailored to git data model)
Query optimization or query planner
Authentication or permission checking for private repos
Persistent caching or indexing of git history
Network/remote repository querying without local clone

🪤Traps & gotchas

No explicit config: gitql operates on the current working directory as a Git repo and will fail silently if that directory is not a git repo.
Lexical/parser coupling: tokens defined in lexical/tokens.go must be registered in the lexical/lexemes.go state machine or queries won't tokenize; it is easy to add a keyword and forget the state entry.
Runtime assumes a valid AST: semantic validation is minimal, so a malformed AST from the parser could panic in runtime/visitor.go.
REPL readline history: lives in ~/.gitql_history by default (inferred from readline usage); not configurable.
Order of operations: the semantic pass (semantical/) is separate from the runtime (runtime/), so changes to AST structure require updates in both places.
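
One way to guard against the first trap is to check for a repository before running a query. The sketch below is not part of gitql; `FindRepoRoot` and `ErrNotARepo` are hypothetical names, and the check simply walks up the directory tree looking for a `.git` entry using the Go standard library.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// ErrNotARepo is a hypothetical sentinel error for this sketch.
var ErrNotARepo = errors.New("not inside a git repository")

// FindRepoRoot walks upward from dir until it finds a .git entry
// (a directory, or a file in the case of worktrees and submodules).
// Any stat error, including permission errors, is treated as "not found here".
func FindRepoRoot(dir string) (string, error) {
	dir, err := filepath.Abs(dir)
	if err != nil {
		return "", err
	}
	for {
		if _, err := os.Stat(filepath.Join(dir, ".git")); err == nil {
			return dir, nil
		}
		parent := filepath.Dir(dir)
		if parent == dir {
			// Reached the filesystem root without finding .git.
			return "", ErrNotARepo
		}
		dir = parent
	}
}

func main() {
	cwd, err := os.Getwd()
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	root, err := FindRepoRoot(cwd)
	if err != nil {
		// Report the problem loudly instead of failing silently.
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println("querying repository at", root)
}
```

A wrapper script or a future patch could run a check like this first, so a query issued outside a repository produces an explicit error message.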

🏗️Architecture

💡Concepts to learn

Lexical Analysis / Tokenization — Gitql's lexical/ package implements a finite state machine to break SQL strings into tokens; understanding state transitions in lexical/states.go is essential to extending the query language
Abstract Syntax Tree (AST) — Parser builds an AST (defined in parser/ast.go) that represents query structure; runtime/visitor.go walks this tree to execute queries, so AST design directly impacts what queries are possible
Visitor Pattern — Gitql uses the Visitor pattern in runtime/visitor.go to traverse the AST and execute queries without modifying AST node definitions; critical for separating structure from execution logic
Semantic Analysis — semantical/semantical.go validates that a syntactically correct query makes sense (columns exist, types match) before runtime; prevents invalid queries from crashing the interpreter
Finite State Machine (FSM) — Gitql's lexer is an FSM that transitions between states (e.g., IN_STRING, IN_NUMBER) in lexical/states.go; understanding state transitions is necessary to fix or extend tokenization
Plumbing vs Porcelain (Git terminology) — Gitql uses go-git library which wraps Git plumbing (low-level object/ref operations); knowing this distinction helps debug commits.go and remotes.go when repository access behaves unexpectedly
Query Compilation vs Interpretation — Gitql is a compiler/interpreter (README emphasizes this distinction over SQLite); lexical→parser→semantical→runtime pipeline is compilation; understanding this justifies the architecture choice over using a DB
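
The Visitor pattern described above can be sketched in a few lines. The types below (`Visitor`, `Node`, `SelectNode`, `LimitNode`, `Checker`, `Describer`) are hypothetical and much simpler than the AST in parser/ast.go; the point is only that two different passes walk the same nodes without the node definitions changing.

```go
package main

import (
	"errors"
	"fmt"
)

// Visitor lists one method per node type; each pass implements all of them.
type Visitor interface {
	VisitSelect(n *SelectNode) error
	VisitLimit(n *LimitNode) error
}

// Node is anything in the AST that can be visited.
type Node interface {
	Accept(v Visitor) error
}

// SelectNode is a simplified, hypothetical SELECT node.
type SelectNode struct {
	Fields []string
	Table  string
	Limit  *LimitNode
}

func (n *SelectNode) Accept(v Visitor) error { return v.VisitSelect(n) }

// LimitNode is a simplified, hypothetical LIMIT node.
type LimitNode struct{ Rows int }

func (n *LimitNode) Accept(v Visitor) error { return v.VisitLimit(n) }

// Checker plays the role of a semantic pass.
type Checker struct{}

func (c Checker) VisitSelect(n *SelectNode) error {
	if len(n.Fields) == 0 {
		return errors.New("select needs at least one field")
	}
	if n.Limit != nil {
		return n.Limit.Accept(c)
	}
	return nil
}

func (c Checker) VisitLimit(n *LimitNode) error {
	if n.Rows < 0 {
		return fmt.Errorf("limit must not be negative, got %d", n.Rows)
	}
	return nil
}

// Describer plays the role of an execution pass; here it only records steps.
type Describer struct{ Steps []string }

func (d *Describer) VisitSelect(n *SelectNode) error {
	d.Steps = append(d.Steps, fmt.Sprintf("scan %s for %v", n.Table, n.Fields))
	if n.Limit != nil {
		return n.Limit.Accept(d)
	}
	return nil
}

func (d *Describer) VisitLimit(n *LimitNode) error {
	d.Steps = append(d.Steps, fmt.Sprintf("stop after %d rows", n.Rows))
	return nil
}

func main() {
	var query Node = &SelectNode{
		Fields: []string{"hash", "author"},
		Table:  "commits",
		Limit:  &LimitNode{Rows: 3},
	}
	if err := query.Accept(Checker{}); err != nil {
		fmt.Println("semantic error:", err)
		return
	}
	d := &Describer{}
	if err := query.Accept(d); err != nil {
		fmt.Println("runtime error:", err)
		return
	}
	for _, step := range d.Steps {
		fmt.Println(step)
	}
}
```

The cost of this design is that adding a new node type means adding a method to every visitor, which matches the note elsewhere in this document that AST changes require updates in both semantical/ and runtime/.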

dinedal/textql — Direct inspiration for gitql per README; same paradigm (SQL against non-SQL sources without intermediate DB)
go-git/go-git — The Git library gitql depends on for all repository access; understanding its API is critical for extending data sources
github/gitignore — Companion resource; users often query commits to analyze gitignore changes or excluded files
cli/cli — Alternative Git query tool (GitHub CLI) with similar goal of replacing complex git flags; competitive/related ecosystem
src-d/go-git — Historical predecessor to go-git/go-git; gitql could have used this but now depends on modern go-git/v5

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add comprehensive test coverage for semantical/semantical.go

The semantical package validates AST nodes and likely contains complex logic for type checking and semantic analysis of GitQL queries. The semantical_test.go file exists but given the complexity of a query language, there are likely edge cases around invalid column references, type mismatches, and unsupported table operations that lack test coverage. This would improve reliability before queries reach runtime.

[ ] Review semantical/semantical.go to identify untested code paths (invalid column names, type errors, unsupported operations)
[ ] Add test cases in semantical/semantical_test.go for edge cases like querying non-existent tables, using invalid field names, and type coercion errors
[ ] Run coverage tools to verify new tests achieve >80% coverage for the semantical package
[ ] Document the test scenarios in comments for future maintainers

Add integration tests for complex multi-table queries in test/select.bats

The test directory uses BATS (Bash Automated Testing System) for end-to-end testing. The select.bats file likely tests basic SELECT queries, but GitQL supports JOIN operations and complex filtering across git tables (commits, references, remotes in runtime/). There are probably no tests for multi-table queries or advanced WHERE clause combinations that span multiple git objects.

[ ] Review existing test/select.bats to understand current query test patterns
[ ] Add BATS test cases for JOIN queries combining commits with references
[ ] Add test cases for complex WHERE clauses filtering across commit author, date ranges, and ref names
[ ] Add test cases for aggregate operations or GROUP BY if supported by the language (check tables.md)
[ ] Document the new test scenarios with clear descriptions of what GitQL behavior they validate

Add GitHub Actions workflow for Go dependency vulnerability scanning

The repo has ci.yml and tag.yml workflows but lacks automated scanning for vulnerable dependencies. With 24 transitive dependencies (some from 2018-2020), there is a risk of using packages with known CVEs. Dependabot is configured (.github/dependabot.yml exists), but no scanning workflow exists to block CI on high-severity vulnerabilities.

[ ] Create .github/workflows/security.yml with a job that runs 'go list -json -m all | nancy sleuth' or 'go list -m all' with nancy for dependency scanning
[ ] Alternatively, use gosec for Go security scanning and configure it to check for crypto/unsafe patterns in runtime/ and parser/ packages
[ ] Configure the workflow to fail the build on high-severity vulnerabilities
[ ] Add a README section documenting how contributors can run security checks locally before submitting PRs
[ ] Consider updating outdated dependencies (chzyer/readline from 2016, some transitive deps from 2018) as a follow-up

🌿Good first issues

Add tests for runtime/remotes.go (the file exists, but no remotes_test.go or .gitmodules is visible): cover git remote querying with tests.
Expand semantical/semantical_test.go; currently no test cases for invalid column names, type mismatches in WHERE clauses, or invalid table references—add negative test cases.
Add time-based functions to lexical/tokens.go and runtime/visitor.go (e.g., where date > NOW() - 30 days)—currently only literal date comparisons work; extend to relative date parsing.
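
For the relative-date idea in the last item, the core computation is small. The sketch below is an assumption about how such a filter could be evaluated, not gitql code: `WithinLastDays` and the `2006-01-02` date layout are hypothetical choices, and wiring a `NOW()` token through the lexer and parser is left out.

```go
package main

import (
	"fmt"
	"time"
)

// layout is an assumed date format for this sketch (YYYY-MM-DD).
const layout = "2006-01-02"

// WithinLastDays reports whether commitDate falls strictly after
// the cutoff "today minus days". Both dates use the layout above.
func WithinLastDays(commitDate, today string, days int) (bool, error) {
	commit, err := time.Parse(layout, commitDate)
	if err != nil {
		return false, fmt.Errorf("invalid commit date %q: %w", commitDate, err)
	}
	now, err := time.Parse(layout, today)
	if err != nil {
		return false, fmt.Errorf("invalid reference date %q: %w", today, err)
	}
	cutoff := now.AddDate(0, 0, -days)
	return commit.After(cutoff), nil
}

func main() {
	today := time.Now().Format(layout)
	recent, err := WithinLastDays("2024-03-20", today, 30)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("committed in the last 30 days:", recent)
}
```

Passing the reference date in as a parameter, rather than calling `time.Now()` inside the function, keeps the comparison deterministic and easy to cover in tests.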

⭐Top contributors


@dependabot[bot] — 46 commits
@filhodanuvem — 23 commits
Claudson Oliveira — 16 commits
@sesam — 3 commits
@shadowspore — 3 commits

📝Recent commits