RepoPilot

g1879/DrissionPage

Python based web automation tool. Powerful and elegant.

Mixed signals — read the receipts

Weakest axis — Use as dependency: Concerns (non-standard license (Other); no tests detected…)

Fork & modify — Healthy

Has a license, tests, and CI — clean foundation to fork and modify.

Learn from — Healthy

Documented and popular — useful reference codebase to read through.

Deploy as-is — Healthy

No critical CVEs, sane security posture — runnable as-is.

  • Last commit 4d ago
  • 17 active contributors
  • Other licensed
  • Concentrated ownership — top contributor handles 78% of recent commits
  • Non-standard license (Other) — review terms
  • No CI workflows detected
  • No test directory detected
What would change the summary?
  • Use as dependency: Concerns → Mixed if the license terms are clarified

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Forkable" badge

Paste into your README — the badge updates live from the latest cached analysis.

Variant:
RepoPilot: Forkable
[![RepoPilot: Forkable](https://repopilot.app/api/badge/g1879/drissionpage?axis=fork)](https://repopilot.app/r/g1879/drissionpage)

Paste at the top of your README.md — renders inline like a shields.io badge.

Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/g1879/drissionpage on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: g1879/DrissionPage

Generated by RepoPilot · 2026-05-07 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

  1. Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
  2. Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
  3. Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/g1879/DrissionPage shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

🎯Verdict

WAIT — Mixed signals — read the receipts

  • Last commit 4d ago
  • 17 active contributors
  • Other licensed
  • ⚠ Concentrated ownership — top contributor handles 78% of recent commits
  • ⚠ Non-standard license (Other) — review terms
  • ⚠ No CI workflows detected
  • ⚠ No test directory detected

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live g1879/DrissionPage repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/g1879/DrissionPage.

What it runs against: a local clone of g1879/DrissionPage — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in g1879/DrissionPage | Confirms the artifact applies here, not a fork |
| 2 | License is still Other | Catches relicense before you depend on it |
| 3 | Default branch master exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 34 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>g1879/DrissionPage</code></summary>
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of g1879/DrissionPage. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/g1879/DrissionPage.git
#   cd DrissionPage
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of g1879/DrissionPage and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "g1879/DrissionPage(\.git)?\b" \
  && ok "origin remote is g1879/DrissionPage" \
  || miss "origin remote is not g1879/DrissionPage (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "^(Other)" LICENSE 2>/dev/null \
   || grep -qiE "\"license\"\s*:\s*\"Other\"" package.json 2>/dev/null) \
  && ok "license is Other" \
  || miss "license drift — was Other at generation time"

# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \
  && ok "default branch master exists" \
  || miss "default branch master no longer exists"

# 4. Critical files exist
test -f "DrissionPage/__init__.py" \
  && ok "DrissionPage/__init__.py" \
  || miss "missing critical file: DrissionPage/__init__.py"
test -f "DrissionPage/_pages/chromium_page.py" \
  && ok "DrissionPage/_pages/chromium_page.py" \
  || miss "missing critical file: DrissionPage/_pages/chromium_page.py"
test -f "DrissionPage/_pages/session_page.py" \
  && ok "DrissionPage/_pages/session_page.py" \
  || miss "missing critical file: DrissionPage/_pages/session_page.py"
test -f "DrissionPage/_base/chromium.py" \
  && ok "DrissionPage/_base/chromium.py" \
  || miss "missing critical file: DrissionPage/_base/chromium.py"
test -f "DrissionPage/_configs/options_manage.py" \
  && ok "DrissionPage/_configs/options_manage.py" \
  || miss "missing critical file: DrissionPage/_configs/options_manage.py"

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 34 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~4d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/g1879/DrissionPage"
  exit 1
fi

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>

TL;DR

DrissionPage is a Python web automation framework that unifies browser control (Chromium-based) with HTTP request handling, eliminating the need to choose between Selenium's convenience and requests' efficiency. It features a custom-built engine (no WebDriver dependency), cross-iframe element traversal, simultaneous multi-tab control, and integrated download utilities with lxml-based parsing.

Layout: a monolithic package under DrissionPage/ with logical domain separation: _base/ (core browser/driver abstraction), _pages/ (ChromiumPage, ChromiumTab, MixTab implementations), _elements/ (element wrappers for Chromium and requests), _configs/ (options management via INI), _functions/ (utilities for locators, cookies, text parsing, browser control).

👥Who it's for

Python developers and QA engineers who need web scraping, automation testing, or data extraction without the overhead of WebDriver version management; particularly those working with Chromium browsers, electron apps, or sites requiring both JavaScript execution and HTTP-level control.

🌱Maturity & risk

Actively maintained single-author project with 859KB of Python code and comprehensive file structure suggesting production readiness. However, maintainer indicates heavy workload (requests donations), and no visible CI/test data in provided metadata. Likely stable for common use cases but represents moderate-to-high single-maintainer risk.

Single-maintainer project (g1879) with no visible test suite or CI pipeline in file list. Dependency on DownloadKit≥2.0.7, websocket-client, and tldextract adds maintenance surface. No evidence of issue triage velocity or backwards compatibility policy. Chinese-language primary documentation (drissionpage.cn) may limit English-speaking contributor pool.

Active areas of work

No specific PR, milestone, or commit data provided in metadata. Based on structure, likely ongoing work on frame handling (chromium_frame.py), multi-tab management (chromium_tab.py, mix_tab.py), and element querying optimizations. Chinese community engagement via WeChat group suggests active user feedback loop.

🚀Get running

Check README for instructions.

Daily commands: No traditional dev server. For development: python -m DrissionPage._functions.cli for CLI tools. For interactive use: instantiate ChromiumPage() or SessionPage() in Python REPL. No Makefile visible; relies on direct package imports.
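The interactive workflow described above can be sketched as follows. This is a hedged example, not repo code: it assumes `pip install DrissionPage`, and the browser half additionally needs a local Chromium. The import is guarded so the snippet degrades cleanly when the package is absent.

```python
# Minimal sketch of REPL-style use; import guarded for environments
# where DrissionPage is not installed.
try:
    from DrissionPage import SessionPage, ChromiumPage
    have_drissionpage = True
except ImportError:
    have_drissionpage = False

if have_drissionpage:
    page = SessionPage()                    # requests-backed: no browser process
    # page.get('https://example.com')       # would fetch and parse the page (lxml)
    # print(page.title)
    # browser = ChromiumPage()              # launches/attaches to a local Chromium
    # browser.get('https://example.com')    # full JS execution, same element API
```

The network-touching calls are left commented out; uncomment them in a live session.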

🗺️Map of the codebase

  • DrissionPage/__init__.py — Main entry point exposing public API; defines all exported classes and functions that users import
  • DrissionPage/_pages/chromium_page.py — Core browser automation page class implementing Chromium control and web element interaction
  • DrissionPage/_pages/session_page.py — HTTP session-based page class for requests-style data collection, complements browser automation
  • DrissionPage/_base/chromium.py — Low-level Chromium DevTools Protocol (CDP) communication layer; foundation for all browser control
  • DrissionPage/_configs/options_manage.py — Configuration management system that loads and manages browser/session options from INI files
  • DrissionPage/_units/selector.py — Element selection engine supporting CSS, XPath, and custom locator strategies across page types
  • DrissionPage/_elements/chromium_element.py — Chromium element wrapper providing unified interface for interacting with DOM elements

🛠️How to make changes

Add a custom element locator strategy

  1. Define the new locator type as an enum value in the relevant enums (DrissionPage/common.py)
  2. Add parsing logic to handle the new locator format (DrissionPage/_functions/locator.py)
  3. Implement selection logic in the Selector class for both ChromiumElement and SessionElement contexts (DrissionPage/_units/selector.py)
  4. Add corresponding method to element classes to expose the new selector type (DrissionPage/_elements/chromium_element.py)
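The steps above hinge on locator parsing. As a hypothetical stand-in (the `parse_locator` helper and its strategy names are ours, not DrissionPage's actual API), step 2 might normalise a simplified locator string into a (strategy, expression) pair before dispatch:

```python
# Hypothetical sketch of simplified-locator normalisation; strategy names
# and shorthand rules are illustrative, not DrissionPage's real grammar.
def parse_locator(loc: str) -> tuple[str, str]:
    """Map a shorthand locator to a (strategy, expression) pair."""
    if loc.startswith(('/', '(')):
        return ('xpath', loc)           # raw XPath passes through
    if loc.startswith(('#', '.')):
        return ('css', loc)             # id / class shorthand
    if loc.startswith('tag:'):
        return ('css', loc[4:])         # bare tag name
    return ('text', loc)                # fall back to text matching

assert parse_locator('#login') == ('css', '#login')
assert parse_locator('tag:div') == ('css', 'div')
```

A new strategy would add one more branch here plus the matching selection logic in the Selector class.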

Add a new browser action or interaction

  1. Implement the action logic in DrissionPage/_units/actions.py or create a new unit file if complex (DrissionPage/_units/actions.py)
  2. Expose the action via a public method on ChromiumPage or ChromiumElement (DrissionPage/_pages/chromium_page.py)
  3. Add type hints and documentation via corresponding .pyi stub file (DrissionPage/_pages/chromium_page.pyi)

Add a new configuration option

  1. Add the option to the default configs.ini file with documentation (DrissionPage/_configs/configs.ini)
  2. Add property/setter to ChromiumOptions or SessionOptions class (DrissionPage/_configs/chromium_options.py)
  3. Load and apply the option in the OptionsManager initialization (DrissionPage/_configs/options_manage.py)
  4. Update the .pyi stub to reflect the new configuration interface (DrissionPage/_configs/chromium_options.pyi)
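The INI-default-plus-programmatic-override pattern those steps extend can be illustrated with the standard library alone (section and option names below are hypothetical stand-ins, not the repo's actual configs.ini contents):

```python
# Stdlib sketch of "INI provides the default, code overrides it".
import configparser
import io

ini = io.StringIO("[chromium_options]\nheadless = true\n")
cfg = configparser.ConfigParser()
cfg.read_file(ini)

headless_default = cfg.getboolean('chromium_options', 'headless')  # True, from INI
overrides = {'headless': False}                                    # set in code
effective = overrides.get('headless', headless_default)            # code wins
print(effective)  # False
```

A real option would follow the same precedence: persisted INI value as the baseline, property setter on the options class as the override.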

Extend element capabilities for new page type

  1. Create new element wrapper class inheriting from base pattern in DrissionPage/_elements/ (DrissionPage/_elements/chromium_element.py)
  2. Implement find/get methods using the appropriate underlying mechanism (CDP, lxml, etc.) (DrissionPage/_units/selector.py)
  3. Expose element class via the page class get_element/get_elements methods (DrissionPage/_pages/chromium_page.py)

🔧Why these technologies

  • Chromium DevTools Protocol (CDP) — Provides low-level browser control without Selenium/WebDriver overhead; enables debugging, JS execution, and network interception
  • Requests library — Lightweight HTTP client for efficient headless data collection without full browser overhead
  • lxml + cssselect — Fast XML/HTML parsing and CSS selector evaluation for session-based pages; avoids rendering cost
  • Click (CLI) — Provides command-line interface for library functions and utilities
  • DownloadKit — Companion download-management library by the same author; handles file downloads initiated during automation

⚖️Trade-offs already made

  • Separate ChromiumPage and SessionPage classes instead of unified interface

    • Why: ChromiumPage requires heavyweight browser process; SessionPage is lightweight. Hard to abstract both to single interface without performance loss.
    • Consequence: Users must choose page type upfront, but get optimal performance. MixTab bridges for hybrid scenarios.
  • Custom WebDriver implementation vs Selenium

    • Why: Selenium adds abstraction overhead; CDP protocol is faster and more flexible for this tool's use cases
    • Consequence: Less community ecosystem support but faster, simpler codebase; not compatible with non-Chromium browsers
  • Configuration via .ini file + Python code

    • Why: INI provides persistence and default values; Python code allows programmatic override
    • Consequence: Two sources of truth, but enables both convenience defaults and programmatic flexibility
  • Eager element finding vs lazy evaluation

    • Why: Elements are found immediately to fail fast; lazy evaluation would defer errors to action time
    • Consequence: More DOM queries but clearer debugging; better for dynamic content detection
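The last trade-off can be made concrete with a toy illustration (this is our sketch, not DrissionPage code): eager lookup surfaces a missing element at find time, lazy lookup defers the error to action time.

```python
# Toy contrast of eager vs lazy element finding over a dict "DOM".
class EagerFinder:
    def __init__(self, dom):
        self._dom = dom

    def find(self, key):
        if key not in self._dom:
            raise LookupError(f'element {key!r} not found')  # fails fast
        return self._dom[key]

class LazyFinder:
    def __init__(self, dom):
        self._dom = dom

    def find(self, key):
        return lambda: self._dom[key]   # error deferred until the action runs

dom = {'button': '<button>'}
assert EagerFinder(dom).find('button') == '<button>'
deferred = LazyFinder(dom).find('missing')   # no error yet
# deferred() would raise KeyError only when the action executes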

🚫Non-goals (don't propose these)

  • Multi-browser support (Firefox, Safari): Chromium-only by design for consistency
  • Real-time collaborative automation: Single-client tool, not distributed
  • Mobile native automation: Web-browser focused, not Android/iOS apps
  • JavaScript test framework integration: Standalone tool, not test-runner plugin
  • Concurrent tab automation within single page object: Sequential tab switching model

🪤Traps & gotchas

No explicit test framework or pytest config visible; testing approach unknown. INI config file (DrissionPage/_configs/configs.ini) must exist and be writable for persistent settings. The WebSocket connection to Chromium requires a live browser instance; automation will fail silently if the browser crashes without restart logic. Type hints (.pyi files) are manually maintained—easy to desync if .py files change. No visible version pinning in dependencies; DownloadKit≥2.0.7 is a loose constraint.
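The .pyi-desync trap is easy to scan for mechanically. A small stdlib check (our suggestion, not an existing repo tool) that lists package .py files with no sibling stub:

```python
# Scan a package directory for .py files whose manually maintained .pyi
# stub is missing — a cheap guard against stub/source desync.
from pathlib import Path

def missing_stubs(root: str) -> list[str]:
    """Return paths of .py files under root that have no matching .pyi."""
    return sorted(
        p.as_posix() for p in Path(root).rglob('*.py')
        if not p.with_suffix('.pyi').exists()
    )
```

From a clone root, `missing_stubs('DrissionPage')` would print any files to re-stub.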

🏗️Architecture

💡Concepts to learn

  • WebSocket Protocol for DevTools — DrissionPage communicates with Chromium via WebSocket (websocket-client dependency) instead of WebDriver HTTP; this is the architectural core enabling speed and feature parity
  • Shadow DOM / Shadow Root Traversal — README explicitly claims ability to handle non-open shadow-root elements; this requires special DevTools protocol handling not in standard WebDriver
  • Cross-Origin IFrame Navigation — Core feature: 'can search across iframes without switching'; requires protocol-level element discovery, not DOM-level queries
  • POM (Page Object Model) — DrissionPage explicitly supports POM pattern for test automation via encapsulated page classes (chromium_page.py, mix_tab.py)
  • Dual-Mode Request Handling — Unique to DrissionPage: ability to switch between browser-executed requests (JavaScript) and direct HTTP requests (requests library) on the same session
  • lxml XPath / CSS Selectors — DrissionPage uses lxml (not html.parser or BeautifulSoup) for 'several orders of magnitude' parsing speed improvement; cssselect bridges CSS to XPath
  • SeleniumHQ/selenium — Direct competitor for browser automation; DrissionPage positioned as faster, WebDriver-free alternative
  • psf/requests — Complementary HTTP library; DrissionPage bridges requests and browser automation in unified API
  • scrapy/scrapy — Competing web scraping framework; Scrapy for data extraction at scale, DrissionPage for JavaScript-heavy sites
  • microsoft/playwright-python — Modern browser automation alternative; supports multiple engines but heavier than DrissionPage's Chromium-only approach
  • g1879/DownloadKit — Companion library by same author; provides download management utilities integrated into DrissionPage
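The first concept — WebSocket transport instead of WebDriver HTTP — comes down to the DevTools protocol's JSON framing. A conceptual sketch (the `cdp_command` helper is ours; `Page.navigate` is a real CDP method name, but this is not DrissionPage's internal code):

```python
# Each CDP command is a JSON object with a monotonically increasing id,
# a method name, and params; the browser's reply echoes the same id.
import itertools
import json

_ids = itertools.count(1)

def cdp_command(method: str, **params) -> str:
    """Serialise one DevTools-protocol command frame."""
    return json.dumps({'id': next(_ids), 'method': method, 'params': params})

frame = json.loads(cdp_command('Page.navigate', url='https://example.com'))
print(frame['id'], frame['method'])  # 1 Page.navigate
```

Matching replies to requests by id is what lets a single WebSocket multiplex many in-flight commands — the speed advantage over WebDriver's one-HTTP-request-per-action model.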

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add comprehensive unit tests for DrissionPage/_functions/locator.py

The locator module is critical for element selection across both Chromium and session-based pages, but there are no visible test files in the repo structure. Adding unit tests would improve reliability of the core locator functionality and serve as documentation for contributors. This is a high-impact module used throughout the codebase (_elements, _pages modules depend on it).

  • [ ] Create tests/test_locator.py with test cases for locator parsing, XPath generation, and CSS selector handling
  • [ ] Add test cases for edge cases: invalid locators, special characters, complex selectors
  • [ ] Test locator compatibility between Chromium and session-based pages
  • [ ] Ensure tests cover the public API in DrissionPage/_functions/locator.pyi

Add GitHub Actions workflow for automated testing across Python versions

The repo supports Python 3.6+, Windows/Linux/Mac, and multiple browser types, but no CI workflow is visible in .github/. Adding a matrix-based GitHub Actions workflow would catch regressions early, validate cross-platform compatibility, and reduce manual testing burden for maintainers during PR reviews.

  • [ ] Create .github/workflows/tests.yml with matrix strategy for Python 3.6, 3.8, 3.10, 3.11, 3.12
  • [ ] Add OS matrix: ubuntu-latest, windows-latest, macos-latest
  • [ ] Configure dependency installation from requirements (requests, lxml, cssselect, DownloadKit, websocket-client, click, tldextract, psutil)
  • [ ] Add pytest execution step pointing to a tests/ directory (currently missing)
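A minimal workflow sketch for this checklist (hypothetical file — nothing like it exists in the repo yet; note that current `actions/setup-python` runners no longer provide Python 3.6, so the matrix below starts at 3.8, which would need confirming against the repo's support policy):

```yaml
# Hypothetical .github/workflows/tests.yml sketch
name: tests
on: [push, pull_request]
jobs:
  test:
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
        python: ['3.8', '3.10', '3.12']
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python }}
      - run: pip install requests lxml cssselect DownloadKit websocket-client click tldextract psutil pytest
      - run: pytest tests/
```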

Create integration tests for DrissionPage/_pages/mix_tab.py

The MixTab class uniquely combines browser automation and session-based requests in a single interface. This hybrid approach is a core differentiator of DrissionPage but lacks visible test coverage. Adding integration tests would validate the synchronization between chromium and session components and ensure data consistency when switching modes.

  • [ ] Create tests/test_mix_tab.py for session-to-chromium and chromium-to-session transitions
  • [ ] Test cookie/header synchronization between ChromiumTab and SessionPage components
  • [ ] Add test cases for mixed workflows: fetch with session, render with chromium, switch back to session
  • [ ] Verify that mix_tab.py properties and methods work correctly with both _base/chromium.py and _pages/session_page.py

🌿Good first issues

  • Add pytest test suite for DrissionPage/_elements/session_element.py — currently no test coverage visible for requests-based element handling: Identifies untested code path and establishes testing baseline
  • Document locator.py syntax in code examples with docstrings — the 'simplified locator syntax' is core to UX but lacks inline examples: Reduces onboarding friction and supports auto-generated docs
  • Add integration test for cross-iframe element traversal in DrissionPage/_pages/chromium_page.py — major differentiator vs Selenium but untested: Validates a key selling point and prevents regressions


📝Recent commits

  • f1caf7f — Merge pull request #641 from hamflx/master (g1879)
  • b5b510f — Merge pull request #661 from DullJZ/master (g1879)
  • d218075 — Merge pull request #665 from M4rque2/master (g1879)
  • 40c6daa — Merge pull request #593 from Xeonacid/patch-1 (g1879)
  • 0595703 — Merge pull request #629 from vmalik25/pdf-download-fix (g1879)
  • 11f5256 — Merge pull request #658 from Flikify/master (g1879)
  • 2b87ed6 — Merge pull request #635 from luojiaaoo/patch-1 (g1879)
  • 072a492 — Merge pull request #604 from xiyuan-cdp/master (g1879)
  • 8d77172 — Update README.md (g1879)
  • 21ce527 — Update README.md (g1879)

🔒Security observations

Failed to generate security analysis.

LLM-derived; treat as a starting point, not a security audit.


Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.
