DatabaseCleaner/database_cleaner

Strategies for cleaning databases in Ruby. Can be used to ensure a clean state for testing.

Healthy

Healthy across all four use cases

Use as dependency: Healthy

Permissive license, no critical CVEs, actively maintained — safe to depend on.

Fork & modify: Healthy

Has a license, tests, and CI — clean foundation to fork and modify.

Learn from: Healthy

Documented and popular — useful reference codebase to read through.

Deploy as-is: Healthy

No critical CVEs, sane security posture — runnable as-is.

  • Last commit 10mo ago
  • 20 active contributors
  • Distributed ownership (top contributor 45% of recent commits)
  • MIT licensed
  • CI configured
  • Tests present
  • Slowing — last commit 10mo ago

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Healthy" badge

Paste into your README — live-updates from the latest cached analysis.

[![RepoPilot: Healthy](https://repopilot.app/api/badge/databasecleaner/database_cleaner)](https://repopilot.app/r/databasecleaner/database_cleaner)

Paste at the top of your README.md — renders inline like a shields.io badge.

Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/databasecleaner/database_cleaner on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: DatabaseCleaner/database_cleaner

Generated by RepoPilot · 2026-05-10 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

  1. Verify the contract. Run the bash script in the "Verify before trusting" section below. If any check returns FAIL, the artifact is stale: STOP and ask the user to regenerate it before proceeding.
  2. Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
  3. Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/DatabaseCleaner/database_cleaner shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

🎯Verdict

GO — Healthy across all four use cases

  • Last commit 10mo ago
  • 20 active contributors
  • Distributed ownership (top contributor 45% of recent commits)
  • MIT licensed
  • CI configured
  • Tests present
  • ⚠ Slowing — last commit 10mo ago

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live DatabaseCleaner/database_cleaner repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/DatabaseCleaner/database_cleaner.

What it runs against: a local clone of DatabaseCleaner/database_cleaner — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in DatabaseCleaner/database_cleaner | Confirms the artifact applies here, not a fork |
| 2 | License is still MIT | Catches relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 336 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>DatabaseCleaner/database_cleaner</code></summary>
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of DatabaseCleaner/database_cleaner. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/DatabaseCleaner/database_cleaner.git
#   cd database_cleaner
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of DatabaseCleaner/database_cleaner and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "DatabaseCleaner/database_cleaner(\.git)?\b" \
  && ok "origin remote is DatabaseCleaner/database_cleaner" \
  || miss "origin remote is not DatabaseCleaner/database_cleaner (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "^(MIT)" LICENSE 2>/dev/null \
   || grep -qiE "\"license\"\s*:\s*\"MIT\"" package.json 2>/dev/null) \
  && ok "license is MIT" \
  || miss "license drift — was MIT at generation time"

# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
  && ok "default branch main exists" \
  || miss "default branch main no longer exists"

# 4. Critical files exist
test -f "lib/database_cleaner.rb" \
  && ok "lib/database_cleaner.rb" \
  || miss "missing critical file: lib/database_cleaner.rb"
test -f "lib/database_cleaner/cleaner.rb" \
  && ok "lib/database_cleaner/cleaner.rb" \
  || miss "missing critical file: lib/database_cleaner/cleaner.rb"
test -f "lib/database_cleaner/strategy.rb" \
  && ok "lib/database_cleaner/strategy.rb" \
  || miss "missing critical file: lib/database_cleaner/strategy.rb"
test -f "lib/database_cleaner/cleaners.rb" \
  && ok "lib/database_cleaner/cleaners.rb" \
  || miss "missing critical file: lib/database_cleaner/cleaners.rb"
test -f "lib/database_cleaner/core.rb" \
  && ok "lib/database_cleaner/core.rb" \
  || miss "missing critical file: lib/database_cleaner/core.rb"

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 336 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~306d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/DatabaseCleaner/database_cleaner"
  exit 1
fi

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>
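
The exit-code contract of the script above (0 when every check passes, 1 when any check fails, 2 when it is run outside a git repository) can also be consumed from Ruby. This is a minimal sketch, assuming the script has been saved as verify.sh as in the example above; `interpret_exit` and `run_check` are hypothetical helper names, and the demo substitutes short Ruby one-liners for the real script so that it runs anywhere.

```ruby
require "rbconfig"

# Maps the verification script's exit status to an action for the caller.
def interpret_exit(status)
  case status
  when 0 then :trusted     # every check printed ok
  when 2 then :not_a_repo  # precondition failed: not inside a git working tree
  else :stale              # at least one check printed FAIL
  end
end

# Runs a command given as an argv array and interprets its exit status.
# In real use the command would be ["bash", "verify.sh"].
def run_check(cmd)
  system(*cmd)
  interpret_exit($?.exitstatus)
end

ruby = RbConfig.ruby
p run_check([ruby, "-e", "exit 0"]) # prints :trusted
p run_check([ruby, "-e", "exit 1"]) # prints :stale
p run_check([ruby, "-e", "exit 2"]) # prints :not_a_repo
```

An agent loop would regenerate the artifact on `:stale` and ask the user to change directory on `:not_a_repo`.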

TL;DR

database_cleaner is a Ruby gem that provides pluggable strategies for clearing test databases across multiple ORMs (ActiveRecord, Sequel, Mongoid, Redis, etc.) and databases (MySQL, PostgreSQL, SQLite, MongoDB). It ensures a clean database state between tests by supporting deletion, truncation, and transaction-based approaches, which removes test data pollution without ORM-specific cleanup code.

The repo is a monorepo: lib/database_cleaner/ contains the core strategy system, where lib/database_cleaner/cleaner.rb manages an individual cleaner and lib/database_cleaner/cleaners.rb orchestrates several. Two gemspecs, database_cleaner-core.gemspec and database_cleaner.gemspec, split the core from the full gem. examples/ provides runnable Cucumber scenarios for ActiveRecord and Redis. The Safeguard (lib/database_cleaner/safeguard.rb) prevents accidental data loss in production.

👥Who it's for

Ruby on Rails and general Ruby developers writing test suites who need to ensure database isolation between tests without manually crafting cleanup logic for each ORM they use. Test engineers building CI/CD pipelines need this to guarantee repeatable, isolated test runs across multiple databases.

🌱Maturity & risk

Mature and production-ready, though commit activity has slowed (last commit about 10 months ago). The project has CI via GitHub Actions (.github/workflows/ci.yml), Cucumber feature coverage (features/cleaning.feature, features/cleaning_multiple_dbs.feature), semantic versioning, and an established ecosystem of separately maintained adapter gems.

Low risk overall, but the modular gem split (database_cleaner-core is separate, adapters are separate repos) means core updates require coordinated releases across adapter repos. Single entry point (lib/database_cleaner.rb) acts as a facade, making breaking changes in core affect all adapters. No obvious deprecated code visible, but History.rdoc should be checked for past breaking changes when upgrading.

Active areas of work

No specific open work visible in file list, but the repo maintains Cucumber examples (examples/features/example.feature, example_multiple_db.feature, example_multiple_orm.feature) showing active support for multi-database and multi-ORM scenarios. CI pipeline is active (.github/workflows/ci.yml). The separate adapter gems (redis, active_record, sequel, mongoid, mongo, data_mapper) are maintained in parallel repos.

🚀Get running

git clone https://github.com/DatabaseCleaner/database_cleaner.git && cd database_cleaner && bundle install && bundle exec rake && bundle exec cucumber

Daily commands: bundle exec rspec (for unit tests, implied by .rspec config) or bundle exec cucumber (for integration tests in features/ and examples/features/). Use Guardfile (bundle exec guard) for continuous test running during development.

🗺️Map of the codebase

  • lib/database_cleaner.rb — Main entry point that exports the public API and initializes all core cleaning infrastructure.
  • lib/database_cleaner/cleaner.rb — Core Cleaner class that manages database-specific cleaning strategies and orchestrates the cleaning lifecycle.
  • lib/database_cleaner/strategy.rb — Base Strategy class that all database-specific cleaning implementations inherit from; defines the interface contract.
  • lib/database_cleaner/cleaners.rb — Manages multiple cleaner instances across different databases and ORM configurations.
  • lib/database_cleaner/core.rb — Core module providing low-level initialization, configuration, and state management for all cleaners.
  • database_cleaner.gemspec — Gem manifest defining dependencies, version, and which strategy adapters are available.

🧩Components & responsibilities

  • DatabaseCleaner module (Ruby module) — Public API; exposes start, clean, and clean_with (for example clean_with(:truncation)); delegates to Core singleton
    • Failure mode: If strategy cannot be loaded, raises NoMethodError or LoadError
  • Cleaners (Ruby class) — Registry of Cleaner instances indexed by database name; coordinates multi-DB cleaning
    • Failure mode: If database not configured, returns NullStrategy silently or raises if strict mode enabled
  • Cleaner (Ruby class) — Wraps a single database connection and strategy; manages lifecycle (start, clean)
    • Failure mode: If strategy.start fails, cleaning will
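
The delegation from the public module to a registry of cleaners can be sketched with Ruby's standard Forwardable module. This is an illustrative model only, not the gem's source: the `FacadeSketch` namespace and the `calls` log are hypothetical names, and the registry here records calls instead of touching a database.

```ruby
require "forwardable"

# Illustrative model only; names under FacadeSketch are hypothetical.
module FacadeSketch
  # Stand-in registry that records which lifecycle calls it received.
  class Cleaners
    attr_reader :calls

    def initialize
      @calls = []
    end

    def start
      @calls << :start
    end

    def clean
      @calls << :clean
    end
  end

  class << self
    extend Forwardable

    # Module-level start/clean are forwarded to the memoized registry.
    def_delegators :cleaners, :start, :clean

    def cleaners
      @cleaners ||= Cleaners.new
    end
  end
end

FacadeSketch.start
FacadeSketch.clean
p FacadeSketch.cleaners.calls # prints [:start, :clean]
```

Callers only see the module-level methods, so the registry behind them can change without affecting test code.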

🛠️How to make changes

Add a new database adapter strategy

  1. Create a new strategy class inheriting from Strategy in lib/database_cleaner/strategies/[orm_name]/[strategy_name].rb (lib/database_cleaner/strategy.rb)
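  2. Implement clean with the ORM-specific logic (it is the only mandatory method) and override start if the strategy needs setup such as opening a transaction; the base class supplies cleaning, which wraps a block in start and clean (lib/database_cleaner/strategy.rb)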
  3. Register strategy in the cleaner loader by adding to Cleaners.load_strategy(db_type, strategy) (lib/database_cleaner/cleaners.rb)
  4. Add test cases using shared_examples from lib/database_cleaner/spec/shared_examples.rb (lib/database_cleaner/spec/shared_examples.rb)
  5. Update ADAPTERS.md documentation with new adapter name and strategy name (ADAPTERS.md)
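
The shape of a custom strategy can be sketched without the gem installed. The example below assumes the base class exposes an optional `start`, a mandatory `clean`, and a block-taking `cleaning`; the `StrategySketch::Strategy` class is a stand-in written for this example, and `InMemoryTruncation` is a hypothetical strategy that clears a Ruby Hash instead of a database. Check lib/database_cleaner/strategy.rb for the real contract before relying on it.

```ruby
# Illustrative model only; names under StrategySketch are hypothetical.
module StrategySketch
  # Stand-in for the base class (assumed interface, not the gem's source).
  class Strategy
    # Optional hook, e.g. to open a transaction.
    def start; end

    # Subclasses must provide the actual cleaning procedure.
    def clean
      raise NotImplementedError
    end

    # Runs the block between start and clean; clean runs even if the block raises.
    def cleaning
      start
      yield
    ensure
      clean
    end
  end

  # Hypothetical strategy that "truncates" every table in an in-memory store.
  class InMemoryTruncation < Strategy
    def initialize(store)
      @store = store
    end

    def clean
      @store.each_value(&:clear)
    end
  end
end

store = { users: [], posts: [] }
strategy = StrategySketch::InMemoryTruncation.new(store)

strategy.cleaning do
  store[:users] << "alice"
  store[:posts] << "hello"
end

puts store.values.all?(&:empty?) # prints true
```

Because `clean` sits in an `ensure` clause, a failing test body still leaves the store empty for the next test.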

Configure cleaning for a specific test framework

  1. Create or update framework integration in lib/database_cleaner/[framework].rb (e.g., spec.rb for RSpec) (lib/database_cleaner/spec.rb)
  2. Use Core.cleaners to access Cleaners instance and call start/clean at appropriate hooks (lib/database_cleaner/core.rb)
  3. Ensure Safeguard.check_url is called to prevent production data loss (lib/database_cleaner/safeguard.rb)
  4. Add example usage in examples/features/step_definitions/ for user reference (examples/features/step_definitions/activerecord_steps.rb)
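
The hook wiring in step 2 can be illustrated with a toy test runner. This is a sketch under stated assumptions: `MiniRunner` and `RecordingCleaner` are hypothetical classes invented for this example, standing in for a real framework's before/after hooks and for a real cleaner.

```ruby
# Illustrative model only; names under HookSketch are hypothetical.
module HookSketch
  # Toy test runner with before/after hooks.
  class MiniRunner
    def initialize
      @before = []
      @after = []
    end

    def before(&block)
      @before << block
    end

    def after(&block)
      @after << block
    end

    # After hooks run even when the example raises.
    def run
      @before.each(&:call)
      yield
    ensure
      @after.each(&:call)
    end
  end

  # Stand-in cleaner that records lifecycle events instead of cleaning.
  class RecordingCleaner
    attr_reader :events

    def initialize
      @events = []
    end

    def start
      @events << :start
    end

    def clean
      @events << :clean
    end
  end
end

cleaner = HookSketch::RecordingCleaner.new
runner = HookSketch::MiniRunner.new
runner.before { cleaner.start }
runner.after { cleaner.clean }

runner.run { cleaner.events << :example }

begin
  runner.run { raise "boom" }
rescue RuntimeError
  cleaner.events << :failed
end

p cleaner.events # prints [:start, :example, :clean, :start, :clean, :failed]
```

The second run shows why the after hook matters: the cleaner still runs when an example fails, so the failure does not leak data into later tests.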

Support multiple databases in a single test run

  1. Use Cleaners#[] to obtain a separate Cleaner per ORM and database, for example DatabaseCleaner[:active_record, db: :secondary_db] (lib/database_cleaner/cleaners.rb)
  2. Call start/clean on each Cleaner instance, or call DatabaseCleaner.start and DatabaseCleaner.clean to apply the operation to every registered cleaner (lib/database_cleaner/cleaner.rb)
  3. Reference features/cleaning_multiple_dbs.feature for Cucumber examples and features/step_definitions for implementation (features/cleaning_multiple_dbs.feature)
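
A registry that hands out one cleaner per ORM and database can be modelled in a few lines. This is a sketch, not the gem's implementation: `RegistrySketch`, its `Cleaner` struct, and the `db:` keyword with a `:default` fallback are assumptions made for illustration.

```ruby
# Illustrative model only; names under RegistrySketch are hypothetical.
module RegistrySketch
  Cleaner = Struct.new(:orm, :db)

  # Lazily creates and caches one cleaner per [orm, db] pair.
  class Cleaners
    def initialize
      @cleaners = {}
    end

    def [](orm, db: :default)
      @cleaners[[orm, db]] ||= Cleaner.new(orm, db)
    end

    def size
      @cleaners.size
    end
  end
end

cleaners = RegistrySketch::Cleaners.new
primary = cleaners[:active_record]
secondary = cleaners[:active_record, db: :secondary]
again = cleaners[:active_record, db: :secondary]

puts secondary.equal?(again)   # prints true
puts primary.equal?(secondary) # prints false
puts cleaners.size             # prints 2
```

Keying on the pair rather than on the ORM alone is what lets two databases behind the same ORM be cleaned with different strategies.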

🔧Why these technologies

  • Ruby gem architecture — Modular design allows users to install only adapters they need; core gem is lightweight
  • Strategy pattern (abstract Strategy base class) — Decouples ORM/database-specific logic from orchestration; simplifies adding new databases
  • Singleton registry (Core module) — Global cleaner access without passing instances through call stack; thread-safe configuration
  • Cucumber and RSpec integration hooks — Automatic cleanup without manual setup; reduces boilerplate in test suites

⚖️Trade-offs already made

  • Lazy-load strategy adapters instead of requiring all at startup

    • Why: Reduces startup time and dependency footprint when only one ORM is used
    • Consequence: First clean call has overhead; requires require guard in adapter gems
  • Global singleton registry (DatabaseCleaner module-level state)

    • Why: Simpler API and compatibility with existing test code
    • Consequence: Makes multi-threaded test runners harder to configure; requires explicit per-thread setup
  • Support transaction-based cleaning (rollback) vs. truncation strategies

    • Why: Transactions are faster for most ORMs; truncation is more thorough but slower
    • Consequence: Users must choose strategy; wrong choice can cause test flakiness or performance issues
  • Safeguard checks (prevent production data loss)

    • Why: Critical safety feature; avoids catastrophic mistakes
    • Consequence: Adds validation overhead; can block cleaning if environment detection fails

🚫Non-goals (don't propose these)

  • Does not implement database-specific adapters—those live in separate gems (database_cleaner-active_record, database_cleaner-sequel, etc.)
  • Not a real-time cache invalidation system—strategies are designed for test isolation, not live application cleanup
  • Does not handle database schema migrations or versioning; only clears data
  • Does not provide transaction management for production code; intended for test environments only

🪤Traps & gotchas

  • The core gem delegates to ORM-specific adapter gems (database_cleaner-active_record, etc.), so the code under lib/ does nothing useful until an adapter is loaded. This is by design but catches new users.
  • The Safeguard inspects RACK_ENV (blocking when it is 'production') and DATABASE_URL, with database_cleaner_allow_production as the override, so local tests with production-looking configs might be blocked.
  • Multiple-database support requires explicit setup through lib/database_cleaner/cleaners.rb; naive code only cleans the default DB.
  • The Cucumber hook in lib/database_cleaner/cucumber.rb requires env.rb setup (see features/support/env.rb), or cleaning won't run automatically.
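
The production guard described above can be sketched as a small check over an environment hash. This is a simplified model, not the code in lib/database_cleaner/safeguard.rb: `SafeguardSketch` and `UnsafeEnvironment` are hypothetical names, the uppercase override variable `DATABASE_CLEANER_ALLOW_PRODUCTION` is an assumption to confirm against the source, and the DATABASE_URL check is omitted.

```ruby
# Illustrative model only; names under SafeguardSketch are hypothetical.
module SafeguardSketch
  class UnsafeEnvironment < StandardError; end

  # env is any Hash-like object; pass ENV in real use.
  def self.check!(env)
    # Assumed override variable name; confirm against safeguard.rb.
    return if env["DATABASE_CLEANER_ALLOW_PRODUCTION"]

    if env["RACK_ENV"] == "production"
      raise UnsafeEnvironment, "RACK_ENV is production; refusing to clean"
    end
  end
end

SafeguardSketch.check!({ "RACK_ENV" => "test" }) # passes silently

begin
  SafeguardSketch.check!({ "RACK_ENV" => "production" })
rescue SafeguardSketch::UnsafeEnvironment => e
  puts "blocked: #{e.message}"
end
```

Taking the environment as a parameter keeps the check easy to test without mutating the process environment.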

🏗️Architecture

💡Concepts to learn

  • Strategy Pattern — core/lib/database_cleaner/cleaner.rb uses strategy objects (deletion, truncation, transaction) swapped at runtime; understanding this pattern is key to adding new cleaning modes
  • Adapter Pattern — Each ORM (ActiveRecord, Redis, Mongoid) gets a separate adapter gem implementing a common interface; new contributors must understand how adapters plug into the core
  • NullObject Pattern — lib/database_cleaner/null_strategy.rb implements no-op cleaner; prevents nil checks throughout codebase and gracefully handles missing adapters
  • Facade Pattern — lib/database_cleaner.rb acts as single simplified entry point hiding complexity of lib/database_cleaner/cleaners.rb and strategy registry
  • Transaction Isolation Levels — One cleaning strategy uses database transactions to roll back test data; understanding isolation levels (SERIALIZABLE, READ_COMMITTED) affects which strategy works for your use case
  • Cucumber Before/After Hooks — lib/database_cleaner/cucumber.rb registers Before/After hooks to automate cleanup; Gherkin scenario isolation depends on understanding Cucumber's hook execution order
  • ORM Abstraction — The core strategy layer is ORM-agnostic but adapters (active_record, sequel, mongoid) adapt their ORM's API to common cleaning interface; understanding how each ORM differs (AR delete_all vs Sequel delete) is critical

Related repos

  • DatabaseCleaner/database_cleaner-active_record — Official ActiveRecord adapter gem; most commonly used by Rails developers and implements the core strategy interface for deletion/truncation/transaction modes
  • DatabaseCleaner/database_cleaner-redis — Official Redis adapter gem; example of how to adapt non-SQL datastore into database_cleaner strategy pattern
  • thoughtbot/factory_bot — Complementary test tool for building test data; often used with database_cleaner to create+populate data then clean between tests
  • cucumber/cucumber-ruby — Test framework that database_cleaner integrates with via lib/database_cleaner/cucumber.rb for automatic before/after hooks
  • rspec/rspec — Unit test framework for Ruby; database_cleaner works with RSpec via lib/database_cleaner/spec.rb for test isolation

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add comprehensive adapter integration tests in CI workflow

The repo supports multiple database adapters (ActiveRecord, DataMapper, Sequel, Redis) as mentioned in ADAPTERS.md, but .github/workflows/ci.yml likely only tests a subset. New contributors should add matrix strategy testing for each adapter to catch adapter-specific regressions early.

  • [ ] Review current .github/workflows/ci.yml to identify which adapters are tested
  • [ ] Add matrix strategy in ci.yml to test against ActiveRecord, DataMapper, Sequel, and Redis adapters separately
  • [ ] Reference existing adapter helpers in spec/support/active_record_helper.rb, spec/support/data_mapper_helper.rb, spec/support/sequel_helper.rb to set up test databases
  • [ ] Verify docker-compose.yml has services for all databases and update if needed
  • [ ] Document the new matrix setup in CONTRIBUTE.markdown

Create missing unit tests for lib/database_cleaner/safeguard.rb

The safeguard.rb module exists but has only minimal coverage in spec/database_cleaner/safeguard_spec.rb. This is a critical safety feature that deserves comprehensive tests for edge cases and failure scenarios.

  • [ ] Examine lib/database_cleaner/safeguard.rb to understand its full API and edge cases
  • [ ] Review existing spec/database_cleaner/safeguard_spec.rb to identify coverage gaps
  • [ ] Add tests for: invalid configurations, missing database connections, rollback failures, and concurrent access scenarios
  • [ ] Add tests verifying safeguard behavior with each database adapter (use helpers from spec/support/)
  • [ ] Ensure tests validate both positive (safe operations) and negative (prevented dangerous operations) cases

Document and test multiple database cleaning strategies in features/cleaning_multiple_dbs.feature

The repository has features/cleaning_multiple_dbs.feature but step definitions in features/step_definitions/database_cleaner_steps.rb may be incomplete. New contributors should expand scenarios to cover all supported strategies (truncation, deletion, transaction) with multiple databases.

  • [ ] Review features/cleaning_multiple_dbs.feature for all scenario coverage
  • [ ] Add missing scenarios for each strategy type (truncation, deletion, transaction) applied to multiple databases simultaneously
  • [ ] Ensure corresponding step definitions exist in features/step_definitions/database_cleaner_steps.rb; add missing steps
  • [ ] Verify scenarios use the database configurations from examples/config/database.yml.example
  • [ ] Update features/support/env.rb if needed to support full multi-database test setup

🌿Good first issues

  • Add integration test coverage for edge case of cleaning a database that is offline or unreachable; currently features/ and examples/ don't show graceful degradation handling: features/cleaning.feature, lib/database_cleaner/cleaner.rb
  • Create an example scenario in examples/features/ demonstrating cleaning with ActiveRecord + Redis + a third ORM simultaneously; only example_multiple_orm.feature shows AR + one other: examples/features/example_multiple_orm.feature, examples/features/step_definitions/
  • Document (in CONTRIBUTE.markdown or new ADAPTER_DEVELOPMENT.md) the exact interface a new adapter gem must implement; currently only inferred from adapter repos: CONTRIBUTE.markdown, ADAPTERS.md

📝Recent commits

  • f50ccb2 — [DX] Improve CONTRIBUTE.md instructions (#721) (arielj)
  • f4780bc — 3.4 quoted locations start with single quotes (#719) (pat)
  • 1f5946a — Update main branch references (#718) (foobear)
  • 23fb7a7 — Bump version to 2.1.0 (etagwerker)
  • 892dd1b — Add details about v2.1.0 (etagwerker)
  • cbc8408 — Skip conventional sqlite:// URLs from safeguards (#715) (timriley)
  • f458536 — Update for trilogy (#707) (petergoldstein)
  • b38eec7 — Add Ruby 3.3 to CI matrix (#710) (m-nakamura145)
  • 0bb7cd0 — Point CI badge to GitHub Actions (#706) (petergoldstein)
  • 64e2ef7 — Merge pull request #699 from DatabaseCleaner/feature/relax-dependency (etagwerker)

🔒Security observations

The DatabaseCleaner repository shows a generally secure posture as a testing utility library with minimal direct security exposure. The main concerns are: (1) Redis exposed without authentication in docker-compose.yml, though bound to localhost; (2) lack of a security policy; (3) presence of example configuration files that could reveal database patterns. The codebase focuses on database cleaning strategies and does not appear to contain obvious SQL injection, XSS, or hardcoded credentials. However, a complete dependency audit could not be performed due to incomplete dependency information. Regular security maintenance and dependency monitoring are recommended.

  • Medium · Redis Service Exposed on Localhost Without Authentication — docker-compose.yml - redis service configuration. The docker-compose.yml configuration exposes Redis on 127.0.0.1:6379 without requiring authentication. While bound to localhost, this could be a risk if the Docker host is accessible to other users or services. Redis server is started without the --requirepass flag, meaning no password authentication is enforced. Fix: Enable Redis authentication by adding '--requirepass <strong_password>' to the redis-server command and configure clients to authenticate. Additionally, consider removing the ports binding if Redis only needs to be accessed by containers within the same network.
  • Low · Missing Security Policy Documentation — Repository root. The repository does not appear to have a SECURITY.md or security policy file visible in the file structure. This makes it unclear how security vulnerabilities should be reported responsibly. Fix: Create a SECURITY.md file documenting the process for reporting security vulnerabilities, expected response times, and any bug bounty information if applicable.
  • Low · Example Configuration Files in Repository — examples/config/ and spec/support/ directories. The repository contains example configuration files (examples/config/database.yml.example, examples/config/redis.yml, spec/support/example.database.yml) which may inadvertently reveal database connection patterns that could inform attacks. Fix: Ensure example configuration files do not contain real credentials and include comments warning developers about security risks. Document that these are examples only and should never be used in production.
  • Low · Dependencies Information Incomplete — Gemfile and database_cleaner-core.gemspec. No Gemfile or package dependency information was provided in the security analysis context. It's unclear if the project has outdated or vulnerable gem dependencies. Fix: Regularly audit gem dependencies using tools like 'bundle audit' and 'bundler-audit'. Keep all dependencies updated and monitor security advisories. Implement automated dependency scanning in CI/CD pipeline.
  • Low · Sample Database Configuration Included — spec/support/sample.config.yml, examples/config/database.yml.example. The file spec/support/sample.config.yml and examples/config/database.yml.example are included in the repository. While these appear to be examples, they may contain patterns or placeholders that could be misused. Fix: Ensure these files contain only sanitized, non-production data. Add clear warnings in comments and documentation that these should never be used as templates for production configurations.

LLM-derived; treat as a starting point, not a security audit.


Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.
