xataio/pgroll
PostgreSQL zero-downtime migrations made easy
Healthy across the board
Permissive license, no critical CVEs, actively maintained — safe to depend on.
Has a license, tests, and CI — clean foundation to fork and modify.
Documented and popular — useful reference codebase to read through.
No critical CVEs, sane security posture — runnable as-is.
- ✓Last commit 3mo ago
- ✓8 active contributors
- ✓Distributed ownership (top contributor 41% of recent commits)
- ✓Apache-2.0 licensed
- ✓CI configured
- ✓Tests present
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Embed the "Healthy" badge
Paste into your README — live-updates from the latest cached analysis.
Paste at the top of your README.md — renders inline like a shields.io badge and links to https://repopilot.app/r/xataio/pgroll.
Preview social card (1200×630)
This card auto-renders when someone shares https://repopilot.app/r/xataio/pgroll on X, Slack, or LinkedIn.
Onboarding doc
Onboarding: xataio/pgroll
Generated by RepoPilot · 2026-05-09 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in "Verify before trusting" below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
- Treat the "AI · unverified" sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/xataio/pgroll shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across the board
- Last commit 3mo ago
- 8 active contributors
- Distributed ownership (top contributor 41% of recent commits)
- Apache-2.0 licensed
- CI configured
- Tests present
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live xataio/pgroll
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/xataio/pgroll.
What it runs against: a local clone of xataio/pgroll — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in xataio/pgroll | Confirms the artifact applies here, not a fork |
| 2 | License is still Apache-2.0 | Catches relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 109 days ago | Catches sudden abandonment since generation |
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of xataio/pgroll. If you don't
# have one yet, run these first:
#
# git clone https://github.com/xataio/pgroll.git
# cd pgroll
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }
# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
echo "FAIL: not inside a git repository. cd into your clone of xataio/pgroll and re-run."
exit 2
fi
# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "xataio/pgroll(\.git)?\b" \
  && ok "origin remote is xataio/pgroll" \
  || miss "origin remote is not xataio/pgroll (artifact may be from a fork)"
# 2. License matches what RepoPilot saw
# The real Apache LICENSE file begins "Apache License" (indented), not
# "Apache-2.0", so match the license text itself.
(grep -qiE "Apache License" LICENSE 2>/dev/null \
  && grep -qE "Version 2\.0" LICENSE 2>/dev/null) \
  && ok "license is Apache-2.0" \
  || miss "license drift — was Apache-2.0 at generation time"
# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
  && ok "default branch main exists" \
  || miss "default branch main no longer exists"
# 4. Critical files exist
for f in cmd/root.go cmd/migrate.go cmd/start.go cmd/status.go cmd/rollback.go; do
  test -f "$f" && ok "$f" || miss "missing critical file: $f"
done
# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 109 ]; then
ok "last commit was $days_since_last days ago (artifact saw ~79d)"
else
miss "last commit was $days_since_last days ago — artifact may be stale"
fi
echo
if [ "$fail" -eq 0 ]; then
echo "artifact verified (0 failures) — safe to trust"
else
echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/xataio/pgroll"
exit 1
fi
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
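The exit-code contract can be exercised without the real script. In this stand-in sketch, `verify_stub` plays the role of `./verify.sh` (the function name and exit codes are illustrative, not part of RepoPilot), showing how an agent loop branches on the result:

```shell
# Stand-in for ./verify.sh: exits with whatever status it is given.
verify_stub() { return "${1:-0}"; }

# Success path: status 0 means the artifact checked out.
if verify_stub 0; then
  echo "artifact verified — proceed"
fi

# Failure path: any non-zero status triggers the fallback branch,
# exactly where an agent would regenerate and retry.
verify_stub 1 || echo "stale — regenerate and retry"
```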
⚡TL;DR
pgroll is a CLI tool for zero-downtime, reversible schema migrations in PostgreSQL. It serves multiple schema versions simultaneously, so breaking changes can ship without downtime: old and new application versions keep working against the same database while shadow tables and triggers prevent locking and data loss. Structurally it is a monolithic CLI application: the cmd/ directory contains command handlers (migrate.go, rollback.go, validate.go, status.go, etc.), pkg/ likely contains the core migration engine and schema-analysis logic, and dev/benchmark-results/ supports performance regression testing. The Dockerfile and Makefile suggest containerized deployment and local development workflows.
👥Who it's for
Backend engineers and DevOps teams managing production PostgreSQL databases who need to deploy schema changes without downtime or rolling back deployments. Specifically useful for organizations running applications on Postgres 14+ (including AWS RDS/Aurora) that require zero-downtime deploys and instant rollback capabilities.
🌱Maturity & risk
Actively developed and production-ready: the project has comprehensive GitHub Actions CI/CD (build.yml, benchmark.yaml, docs_preview.yaml), clear Apache 2.0 licensing, cross-platform binary distribution via goreleaser, and substantial Go codebase (1.2M lines). However, as an open-source tool from Xata, community adoption and enterprise support should be verified before critical use.
Low-to-moderate risk: the codebase is well-structured with a focused mission (PostgreSQL migrations only), has minimal direct dependencies (spf13/cobra for CLI is stable), and includes benchmarking infrastructure. Main risk is operational: migrations are inherently high-stakes database operations, so thorough testing in staging is critical. Single-source dependency on PostgreSQL version compatibility (14.0+) requires careful version pinning.
Active areas of work
Active maintenance indicated by recent CI workflow updates (.github/workflows structure), benchmarking pipeline setup, and documentation preview jobs. The presence of cmd/analyze.go, cmd/convert.go, and cmd/pull.go suggests ongoing feature expansion around migration analysis and schema synchronization. No specific issue backlog visible in provided data, but golangci.yml suggests code quality is actively managed.
🚀Get running
git clone https://github.com/xataio/pgroll.git
cd pgroll
make
The Makefile is the primary build tool. Requires Go 1.24+ based on go.mod patterns. For CLI usage, run ./pgroll --help to explore the commands implemented in the cmd/ directory.
Daily commands:
make build
./pgroll migrate --help
Likely requires a running PostgreSQL database accessible via connection string (--postgres-url or env var). See cmd/root.go and cmd/flags/flags.go for environment variable and flag definitions. Local testing probably uses docker-compose or similar (see Dockerfile).
🗺️Map of the codebase
- cmd/root.go — Entry point for the CLI application; all pgroll commands are registered and executed through this root command
- cmd/migrate.go — Core migration execution command that orchestrates schema changes; implements the primary happy-path workflow
- cmd/start.go — Initiates a new migration workflow; establishes the versioning and state-management foundation for all migrations
- cmd/status.go — Queries current migration state and validates schema consistency; critical for observability and debugging
- cmd/rollback.go — Implements the zero-downtime rollback mechanism; essential for understanding pgroll's core safety guarantees
- Makefile — Build and test orchestration; defines compilation targets and development workflows for the entire project
- .goreleaser.yaml — Release automation and artifact-publishing configuration; manages distribution of pgroll binaries across platforms
🛠️How to make changes
Add a new CLI command
- Create a new file in the cmd/ directory following the naming pattern, e.g. cmd/newcommand.go
- Define a new cobra Command struct with a RunE function that implements the command logic (cmd/newcommand.go)
- Add command flags to cmd/flags/flags.go if the command needs database connection or other options
- Register the command in cmd/root.go by adding rootCmd.AddCommand(newCommandCmd) in the init function
- Create documentation in docs/cli/newcommand.mdx with usage examples and flag descriptions
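A hypothetical scaffold for the first two steps is sketched below. The command name (hello), the file location, and the variable names are illustrative, not from the repo; check the existing files in cmd/ for the project's actual conventions before copying this pattern:

```shell
# Write a minimal cobra command file into a scratch directory.
sketch_dir="$(mktemp -d)"
cat > "$sketch_dir/hello.go" <<'EOF'
package cmd

import (
	"fmt"

	"github.com/spf13/cobra"
)

// helloCmd would be registered in cmd/root.go via rootCmd.AddCommand(helloCmd).
var helloCmd = &cobra.Command{
	Use:   "hello",
	Short: "Illustrative example command",
	RunE: func(cmd *cobra.Command, args []string) error {
		fmt.Println("hello from pgroll")
		return nil
	},
}
EOF
# Confirm the scaffold landed on disk.
grep -c "cobra.Command" "$sketch_dir/hello.go"
```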
Add a new migration operation type
- Define the operation struct and logic in a new file under the operations layer, referenced by the create/validate commands (cmd/create.go)
- Add validation rules for the operation in cmd/validate.go
- Implement SQL generation and execution in the operation's Execute method (cmd/migrate.go)
- Create documentation in docs/operations/operationname.mdx with examples and compatibility notes
Modify error handling or add error types
- Define new error types or error-handling logic in cmd/errors.go
- Use the error types in command implementations (e.g., cmd/migrate.go, cmd/start.go)
- Update error output formatting to provide clear guidance to users on resolution (cmd/errors.go)
Add a new build or release target
- Define new build targets or dependencies in the Makefile
- For release artifacts, add configuration to .goreleaser.yaml for new platforms or formats
- Update CI/CD workflows in .github/workflows/build.yml if new build steps are needed
🔧Why these technologies
- Go/Cobra — Statically compiled binary enables zero-dependency CLI distribution; Cobra provides battle-tested command framework with subcommand routing
- PostgreSQL — pgroll is specifically engineered for PostgreSQL's advanced features (triggers, views, constraints) to enable safe zero-downtime migrations
- File-based migration definitions (YAML/JSON) — Declarative operations enable validation, analysis, and safe rollback planning before execution; human-readable format supports version control workflows
- Docker Compose (dev environment) — Provides isolated local PostgreSQL instance for safe testing without affecting production or host databases
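As a sketch of what such a declarative migration file can look like — the operation and field names below follow pgroll's documented add_column shape, but verify them against docs/operations/ before relying on this:

```json
{
  "name": "02_add_email_column",
  "operations": [
    {
      "add_column": {
        "table": "users",
        "column": {
          "name": "email",
          "type": "text",
          "nullable": true
        }
      }
    }
  ]
}
```

Because the file states intent rather than raw DDL, the tool can validate it, plan the shadow-table work, and derive the rollback automatically.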
⚖️Trade-offs already made
- Semantic operations vs. raw SQL
  - Why: Raw SQL is more flexible but less safe; pgroll trades flexibility for analysis capability and automatic rollback support
  - Consequence: Some complex migrations may require raw_sql operations or workarounds; in exchange, schema analysis is more robust
- File-based state vs. centralized server
  - Why: File-based state (in a PostgreSQL migrations table) avoids an external service dependency; a simpler operational model for users
  - Consequence: Concurrent migrations must be manually serialized; no built-in distributed locking mechanism
- Single-binary CLI vs. client-server architecture
  - Why: A single binary reduces operational overhead and enables easy CI/CD integration; no service to manage
  - Consequence: All logic runs during the migration; no async job processing for long-running operations
🚫Non-goals (don't propose these)
- Real-time schema synchronization across multiple PostgreSQL instances
- Support for non-PostgreSQL databases
- GUI or web interface for migration management
- Authentication and access control (relies on PostgreSQL's native auth)
- Automatic detection and suggestion of safe migrations from SQL statements
- Transaction management across multiple databases
🪤Traps & gotchas
- The Postgres version constraint (14.0+) is a hard requirement — migrations against older versions will fail, possibly silently, if the version is not validated upfront.
- The connection string is critical and likely required as a --postgres-url flag or DATABASE_URL env var (check cmd/flags/flags.go).
- Shadow-table and trigger naming conventions may conflict with an existing schema — pgroll likely uses a prefix (check cmd/create.go for schema naming patterns).
- The migration file format and versioning scheme must be learned from cmd/init.go or the examples.
- Testing migrations locally requires a full PostgreSQL instance with write access.
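The version gotcha can be turned into an explicit preflight gate. This is a hedged sketch: in a real check, the version string would come from something like `psql -Atc 'SHOW server_version'`; here it is passed in as an argument so the logic is self-contained:

```shell
# Gate a migration on the server's major version being >= 14.
check_pg_version() {
  local server_version="$1" major
  major="${server_version%%.*}"   # "16.2" -> "16"
  if [ "$major" -ge 14 ]; then
    echo "ok: Postgres $server_version meets the 14.0+ requirement"
  else
    echo "FAIL: pgroll requires Postgres 14+, found $server_version"
  fi
}

check_pg_version "16.2"
check_pg_version "13.9"
```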
🏗️Architecture
💡Concepts to learn
- Shadow tables — pgroll's core technique—new schema version lives in shadow tables alongside old schema, allowing dual-version compatibility without downtime; understanding shadow table lifecycle is essential to understanding how migrations work
- Trigger-based schema multiplexing — pgroll uses database triggers to keep old and new schema versions in sync during migration; critical for understanding how breaking changes are handled and why both versions stay consistent
- Reversible migrations — pgroll allows instant rollback by maintaining bidirectional migration state; understanding how down-migrations are tracked separately from up-migrations prevents data loss in production rollbacks
- Lock-free schema changes — PostgreSQL ALTER TABLE operations can lock tables; pgroll avoids this by using shadow tables and backfilling, so you must understand when and why traditional DDL is avoided
- Application-level schema versioning — pgroll allows applications to request old or new schema versions simultaneously, requiring coordination between client app and database; affects how migration windows and client rollout are planned
- Column backfilling strategies — When adding new columns with defaults or computed values, pgroll must backfill existing rows without blocking; understanding backfill chunking and timing is critical for large table migrations
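The chunking idea behind backfilling can be illustrated without a database: instead of one table-wide UPDATE that holds locks, the work is split into bounded batches keyed by id range. The table and column names below are hypothetical, and this only prints the statements such a loop would issue:

```shell
# Emit batched UPDATE statements covering ids 1..total in chunks.
gen_backfill() {
  local total="$1" chunk="$2" start=1 end
  while [ "$start" -le "$total" ]; do
    end=$((start + chunk - 1))
    [ "$end" -gt "$total" ] && end="$total"   # clamp the final batch
    echo "UPDATE users SET email_new = email WHERE id BETWEEN $start AND $end;"
    start=$((end + 1))
  done
}

gen_backfill 10000 2500
```

Each statement touches a bounded row count, so locks are short-lived and progress survives interruption, which is the property that matters on large tables.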
🔗Related repos
- flyway/flyway — Leading database migration tool for SQL/Java; pgroll targets the same problem space but with a zero-downtime focus for PostgreSQL specifically
- pressly/goose — Lightweight Go-based database migration library; a simpler alternative to pgroll, but lacks zero-downtime shadow-table orchestration
- golang-migrate/migrate — General-purpose database migration tool supporting multiple databases; pgroll is PostgreSQL-focused with advanced schema versioning
- xataio/xata — Parent organization's main product (the Xata serverless database); pgroll is the migration component extracted as an open-source utility for PostgreSQL users
- jackc/pgx — Pure Go PostgreSQL driver; pgroll likely uses this or similar for database connections and query execution
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add missing CLI documentation files for rollback, status, and validate commands
The docs/cli directory has documentation for baseline, complete, convert, create, init, latest, migrate, and pull commands, but is missing .mdx files for rollback.mdx, status.mdx, and validate.mdx commands that clearly exist in cmd/ directory (cmd/rollback.go, cmd/status.go, cmd/validate.go). This creates a documentation gap for users trying to understand these important operations.
- [ ] Review cmd/rollback.go, cmd/status.go, and cmd/validate.go to understand command purpose and flags
- [ ] Create docs/cli/rollback.mdx following the pattern of existing docs/cli/*.mdx files (e.g., docs/cli/migrate.mdx)
- [ ] Create docs/cli/status.mdx with examples and flag documentation
- [ ] Create docs/cli/validate.mdx with examples and flag documentation
- [ ] Update docs/cli/README.md to include links to the three new command documentation pages
Add comprehensive unit tests for cmd/errors.go error handling
The cmd/errors.go file handles error formatting and reporting for the CLI, but there are no corresponding test files visible in the cmd directory. Error handling is critical for user experience, and testing various error scenarios (validation errors, database errors, migration failures) would improve reliability and prevent regressions.
- [ ] Create cmd/errors_test.go with test cases for each error type and formatting function in cmd/errors.go
- [ ] Add tests for error message formatting, wrapping, and user-facing output
- [ ] Test edge cases like empty errors, deeply nested errors, and context-specific error messages
- [ ] Ensure test coverage reaches at least 80% for cmd/errors.go using 'go test -cover ./cmd'
Add GitHub Actions workflow for linting documentation and checking broken links in docs/
The repo has workflows for build, benchmarks, and publish, but no workflow to validate the markdown/MDX documentation in docs/cli/. With growing CLI documentation, a workflow could catch broken internal links, syntax errors in .mdx files, and ensure consistency before merging documentation PRs. This is especially valuable given the structured docs pattern with cross-references.
- [ ] Create .github/workflows/docs-lint.yaml workflow that runs on PRs touching docs/ files
- [ ] Add markdown linting step using a tool like markdownlint-cli or similar for .mdx files
- [ ] Add a broken link checker step (e.g., using markdown-link-check) to validate internal references between docs/cli/*.mdx files
- [ ] Configure the workflow to fail if linting or link checks fail, preventing documentation regressions
- [ ] Test workflow locally or on a draft PR to ensure it catches issues
🌿Good first issues
- Add validation for PostgreSQL version constraint (14.0+) in cmd/validate.go or cmd/init.go and return user-friendly error message instead of cryptic SQL errors
- Write integration tests for cmd/rollback.go to verify that rolling back a migration with multiple schema changes correctly reverts all shadow tables and triggers (likely missing coverage for complex rollback scenarios)
- Add --dry-run flag to cmd/migrate.go and cmd/rollback.go to show planned operations without applying them (useful for previewing migrations in high-stakes environments, likely a commonly requested feature)
⭐Top contributors
- @andrew-farries — 41 commits
- @kvch — 33 commits
- @dependabot[bot] — 21 commits
- @0xflotus — 1 commit
- @nbarbettini — 1 commit
📝Recent commits
- 0a64836 — Eliminate multiple connection leaks (#1022) (kvch)
- 6b3d4a3 — Bump golang.org/x/crypto from 0.37.0 to 0.45.0 (#1002) (dependabot[bot])
- aee9741 — Fix create_index operation column ordering (#1009) (andrew-farries)
- 1a3880f — Add --expect-one flag to migrate command (#1005) (andrew-farries)
- 07bb247 — Bump golang.org/x/tools from 0.37.0 to 0.38.0 (#986) (dependabot[bot])
- f237c88 — chore: fix small typo in orms.mdx (#992) (0xflotus)
- d436c48 — Correct formatting and links in orms.mdx (#990) (nbarbettini)
- 82cb94e — Add and update badges on README.md (#991) (andrew-farries)
- dab432e — Run CI tests against Postgres 18 (#983) (andrew-farries)
- dc397a2 — Improve error message on binary/state mismatch (#982) (andrew-farries)
🔒Security observations
The pgroll codebase shows a moderate security posture. Critical concerns include hardcoded credentials in docker-compose.yml and exposed database ports in the development configuration. The Docker image setup is minimal but lacks binary verification. Dependency management appears standard for a Go project but would benefit from explicit security-scanning integration. The most significant risks are development configuration artifacts that could accidentally propagate to production. Recommended priority: (1) remove or externalize hardcoded credentials, (2) bind the database to localhost only, (3) add security scanning to CI/CD, (4) verify SQL query safety.
- High · Hardcoded database credentials in docker-compose.yml — docker-compose.yml (the lines with POSTGRES_USER and POSTGRES_PASSWORD environment variables) contains hardcoded PostgreSQL credentials (POSTGRES_USER=postgres, POSTGRES_PASSWORD=postgres). This is a development configuration that should never reach production and violates the principle of secret management. Fix: use environment variables or a secrets-management system; replace hardcoded credentials with references like ${POSTGRES_PASSWORD} loaded from .env files (which should be in .gitignore) or CI/CD secret management.
- High · No binary verification in the Docker image — Dockerfile. 'FROM scratch' is correct for a minimal attack surface, but the pgroll binary is copied without any signature verification, hash verification, or build-provenance checks. Fix: verify the binary with checksums or signatures, consider a multi-stage build that checks integrity before copying into the final image, and document the build provenance.
- Medium · Exposed PostgreSQL port in development — docker-compose.yml (ports section for the db service) exposes port 5432 on all interfaces (0.0.0.0:5432:5432). Common in development, but dangerous if accidentally reused in production, exposing the database to network attacks. Fix: for development, bind to localhost only (127.0.0.1:5432:5432); for production, use proper network isolation and never expose the database directly. Document that this is development-only configuration.
- Medium · Missing health-check timeout configuration — docker-compose.yml (healthcheck interval field). The 24h interval means container health is effectively unmonitored; combined with 'restart: always', failures may take a day to detect. Fix: set a reasonable interval for production use (e.g., interval: 30s); the 24h value appears to be a placeholder and should be reviewed for the actual use case.
- Low · Dependency version pinning not fully specified — dev/go.mod (all dependency declarations) specifies patch versions (e.g., v1.8.1, v2.4.5), but unexpected minor-version updates could still introduce vulnerabilities. Fix: use explicit version pinning, update dependencies regularly, and run dependency scanning (e.g., go list -m all, Dependabot) in CI/CD to detect vulnerable dependencies.
- Low · No SAST/security-scanning configuration visible — .golangci.yml. The linter config exists, but there is no evidence of gosec (the Go security scanner) or similar tools for detecting hardcoded credentials, SQL-injection patterns, or crypto weaknesses. Fix: add gosec to .golangci.yml and include security scanning in the GitHub Actions workflows (verify whether .github/workflows/build.yml already includes security checks).
- Low · SQL-injection risk, no visible parameterization — cmd/*.go files (particularly those handling SQL queries). The project builds SQL for migrations (cmd/analyze.go, cmd/create.go, etc.); without reading the source, common injection vulnerabilities in query building and schema analysis should be verified. Fix: ensure all database queries use parameterized statements, validate migration-definition input, and add security-focused unit tests for SQL query construction.
LLM-derived; treat as a starting point, not a security audit.
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.