
ytdl-org/youtube-dl

Command-line program to download videos from YouTube.com and other video sites

WAIT

Solo project — review before adopting

  • Last commit 2mo ago
  • Unlicense licensed
  • CI configured
  • Tests present
  • Solo or near-solo (1 contributor visible)

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Onboarding doc

Onboarding: ytdl-org/youtube-dl

Generated by RepoPilot · 2026-05-05 · Source

TL;DR

youtube-dl is a Python command-line tool that extracts and downloads video/audio streams from YouTube and 1000+ other video platforms. It solves the problem of programmatically retrieving media from sites that don't expose direct download links by reverse-engineering each site's HTML/JS/API to locate stream URLs, handling DRM-free formats, and muxing video/audio with ffmpeg. The core logic lives in youtube_dl/: YoutubeDL.py is the main engine class, youtube_dl/extractor/ contains one file per supported site (e.g. youtube.py, vimeo.py), and youtube_dl/postprocessor/ handles ffmpeg muxing/conversion. The bin/youtube-dl entry point is a thin Python launcher script; devscripts/ holds release automation and code generation utilities.

Who it's for

End users and developers who need to archive, process, or offline-view video content from dozens of platforms. Contributors are typically Python developers who reverse-engineer specific video site APIs to write or maintain site-specific 'extractor' plugins under youtube_dl/extractor/.

Maturity & risk

Extremely mature: one of the most-starred Python CLI tools on GitHub, with a history stretching back to 2008. CI runs via GitHub Actions (.github/workflows/ci.yml), and a substantial AUTHORS file reflects broad community contribution over the project's history, even though only one contributor is visible in recent activity. Maintenance velocity in the ytdl-org repository has dropped since 2021, and the yt-dlp fork is now its more actively developed successor.

The primary risk is site breakage: YouTube and other platforms frequently change their frontends/APIs, and with reduced maintainer activity here, fixes can lag by weeks or months. The project has minimal external Python dependencies (it ships its own pure-Python AES implementation in youtube_dl/aes.py, with devscripts/generate_aes_testdata.py producing test data for it, and bundles its other utilities), but it relies on optional external binaries (ffmpeg, ffprobe, AtomicParsley). The single-maintainer bottleneck is real: every entry in the recent commit list in this document comes from the same contributor.

Active areas of work

Based on visible repo data, recent activity includes CI maintenance (.github/workflows/ci.yml) and issue template updates (.github/ISSUE_TEMPLATE/), while the recent commit list is dominated by YouTube extractor fixes. There is no active milestone visible, and the primary development energy in this ecosystem has shifted to the yt-dlp fork. Devscripts like make_lazy_extractors.py show that import performance has been an area of work, though the visible data does not confirm that it is a current focus.

Get running

git clone https://github.com/ytdl-org/youtube-dl.git && cd youtube-dl && pip install -e . && youtube-dl --version

Or run directly without install:

python -m youtube_dl 'https://www.youtube.com/watch?v=dQw4w9WgXcQ'

Daily commands:

Development run (no install):

python -m youtube_dl [OPTIONS] URL

Run tests:

python -m pytest test/

or

make test

Build standalone binary:

make youtube-dl
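
The same engine can also be driven from Python instead of the command line. The sketch below assumes the youtube_dl package is importable (for example after the editable install shown earlier). build_options and fetch_metadata are illustrative helper names, not part of the project, and the option values are only one plausible configuration.

```python
import os


def build_options(output_dir):
    """Return a params dict for youtube_dl.YoutubeDL (illustrative defaults)."""
    return {
        # Prefer separate best video + audio, falling back to a single file.
        'format': 'bestvideo+bestaudio/best',
        # Output template: fields are filled in from the extracted info dict.
        'outtmpl': os.path.join(output_dir, '%(title)s-%(id)s.%(ext)s'),
        'noplaylist': True,
        'quiet': True,
    }


def fetch_metadata(url, options):
    """Extract metadata only, without downloading. Needs network access."""
    import youtube_dl  # imported here so this sketch loads without the package

    with youtube_dl.YoutubeDL(options) as ydl:
        return ydl.extract_info(url, download=False)
```

Calling fetch_metadata(url, build_options('downloads')) should return the info dict for the URL; merging separate video and audio streams additionally requires ffmpeg on the PATH.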

Map of the codebase

  • youtube_dl/__init__.py — Main entry point and orchestrator: parses CLI options, initializes YoutubeDL, and drives the entire download pipeline.
  • youtube_dl/YoutubeDL.py — Core engine class that manages extraction, format selection, downloading, post-processing, and all cross-cutting concerns like logging and retries.
  • youtube_dl/extractor/common.py — Base InfoExtractor class that all 1000+ site extractors inherit from; defines the extraction contract, helpers, and shared utilities.
  • youtube_dl/downloader/common.py — Base FileDownloader class that all protocol-specific downloaders extend, defining the download loop, progress reporting, and retry logic.
  • youtube_dl/postprocessor/common.py — Base PostProcessor class that all post-processors inherit from, defining the hook system used after download completes.
  • youtube_dl/utils.py — Massive shared utility module (~4000 lines) providing HTTP helpers, date parsing, sanitization, JS interpretation stubs, and dozens of other cross-cutting functions used everywhere.
  • youtube_dl/extractor/extractors.py — Central registry that imports and lists every extractor class, used by YoutubeDL to match URLs to the correct extractor.

How to make changes

Add support for a new video site

  1. Create a new extractor file named after the site, subclassing InfoExtractor. Implement _VALID_URL regex, _TEST dict, and _real_extract() returning a standardized info dict with url, title, formats, etc. (youtube_dl/extractor/mysite.py)
  2. Import your new extractor class in the central import list, maintaining alphabetical order. (youtube_dl/extractor/extractors.py)
  3. No separate registration should be needed: the package init builds _ALL_CLASSES from every imported name ending in IE, and gen_extractors() instantiates from that list. Check that your class name ends in IE. (youtube_dl/extractor/__init__.py)
  4. Write a _TEST dict in your extractor or add a test entry to verify the URL matches and basic info is extracted correctly. (test/test_download.py)
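
The shape of an extractor can be sketched without the real package. In the example below, InfoExtractorStandIn is a deliberately simplified stand-in for the real InfoExtractor base class, and the site mysite.example, its URL layout, and the returned values are all hypothetical. A real extractor would subclass InfoExtractor from youtube_dl/extractor/common.py and fetch pages through its helpers.

```python
import re


class InfoExtractorStandIn(object):
    """Simplified stand-in for youtube_dl.extractor.common.InfoExtractor."""

    _VALID_URL = None

    @classmethod
    def suitable(cls, url):
        # An extractor claims a URL when its _VALID_URL regex matches.
        return re.match(cls._VALID_URL, url) is not None

    def _match_id(self, url):
        return re.match(self._VALID_URL, url).group('id')

    def extract(self, url):
        return self._real_extract(url)


class MySiteIE(InfoExtractorStandIn):
    # Hypothetical site; the named group "id" captures the video identifier.
    _VALID_URL = r'https?://(?:www\.)?mysite\.example/watch/(?P<id>[0-9a-z]+)'
    _TEST = {
        'url': 'https://mysite.example/watch/abc123',
        'info_dict': {'id': 'abc123', 'ext': 'mp4', 'title': 'Video abc123'},
    }

    def _real_extract(self, url):
        video_id = self._match_id(url)
        # A real extractor would fetch and parse the page here, e.g. with
        # self._download_webpage(url, video_id).
        return {
            'id': video_id,
            'title': 'Video %s' % video_id,
            'url': 'https://cdn.mysite.example/%s.mp4' % video_id,
            'ext': 'mp4',
        }
```

The _TEST dict doubles as documentation: it records one URL the extractor must accept and the metadata it is expected to produce for that URL.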

Add a new post-processor

  1. Create a new file subclassing PostProcessor. Implement run(self, information) which returns (files_to_delete, information). Use self._downloader to log messages. (youtube_dl/postprocessor/mypp.py)
  2. Import and export your new PostProcessor class alongside the existing ones. (youtube_dl/postprocessor/__init__.py)
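  3. Add a CLI option for the new post-processor in the argument parser (youtube_dl/options.py) and append a matching entry, keyed by the class name without the PP suffix, to the postprocessors list built from the parsed options. (youtube_dl/__init__.py)
  4. Check how YoutubeDL turns each postprocessors entry into an instance and registers it with add_post_processor(); extra code there should only be needed if your post-processor requires special handling. (youtube_dl/YoutubeDL.py)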
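
The run() contract from step 1 can be sketched as follows. PostProcessorStandIn is a simplified stand-in for the real PostProcessor base class, and TitleUppercasePP is a hypothetical post-processor invented for illustration.

```python
class PostProcessorStandIn(object):
    """Simplified stand-in for youtube_dl.postprocessor.common.PostProcessor."""

    def __init__(self, downloader=None):
        self._downloader = downloader

    def run(self, information):
        # Default behaviour: delete nothing, pass the info dict through.
        return [], information


class TitleUppercasePP(PostProcessorStandIn):
    """Hypothetical post-processor that upper-cases the title field."""

    def run(self, information):
        updated = dict(information)  # avoid mutating the caller's dict
        updated['title'] = updated.get('title', '').upper()
        if self._downloader is not None:
            # Log through the owning YoutubeDL object when one is attached.
            self._downloader.to_screen('[titleupper] Updated title')
        # First element: files to delete after processing (none here).
        return [], updated
```

Returning the list of files to delete, rather than deleting them directly, leaves the decision to the engine, which can honour options that keep intermediate files.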

Add a new protocol downloader

  1. Create a new downloader file subclassing FileDownloader. Implement real_download(self, filename, info_dict) with the download loop and progress reporting via self._hook_progress(). (youtube_dl/downloader/myprotocol.py)
  2. Import your downloader in the package init so it is available to the selection code. (youtube_dl/downloader/__init__.py)
  3. Add it to the get_suitable_downloader() selection logic in the same file, which maps protocols to downloader classes. (youtube_dl/downloader/__init__.py)
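
A minimal sketch of the download loop and progress reporting described in step 1. FileDownloaderStandIn is a simplified stand-in for the real FileDownloader, InMemoryFD is a hypothetical downloader, and the payload key is an assumption made so the example runs without a network; a real downloader would read from the URL in info_dict.

```python
class FileDownloaderStandIn(object):
    """Simplified stand-in for youtube_dl.downloader.common.FileDownloader."""

    def __init__(self):
        self._progress_hooks = []

    def add_progress_hook(self, hook):
        self._progress_hooks.append(hook)

    def _hook_progress(self, status):
        for hook in self._progress_hooks:
            hook(status)


class InMemoryFD(FileDownloaderStandIn):
    """Hypothetical downloader that writes bytes from info_dict in chunks."""

    CHUNK_SIZE = 4

    def real_download(self, filename, info_dict):
        payload = info_dict['payload']  # hypothetical key, not a real field
        total = len(payload)
        downloaded = 0
        with open(filename, 'wb') as out:
            for start in range(0, total, self.CHUNK_SIZE):
                chunk = payload[start:start + self.CHUNK_SIZE]
                out.write(chunk)
                downloaded += len(chunk)
                self._hook_progress({
                    'status': 'downloading',
                    'filename': filename,
                    'downloaded_bytes': downloaded,
                    'total_bytes': total,
                })
        self._hook_progress({
            'status': 'finished',
            'filename': filename,
            'total_bytes': total,
        })
        return True
```

Progress hooks receive plain dicts, so callers can attach any callable (for example a list's append method) to observe the download without subclassing.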

Add a new CLI option

  1. Add the new optparse option to the appropriate option group in the main argument parser function (parseOpts). (youtube_dl/options.py)
  2. Map the parsed option value into the ydl_opts dictionary that is passed to the YoutubeDL constructor. (youtube_dl/__init__.py)
  3. Add the option key to params handling inside YoutubeDL, accessing it via self.params.get('your_option') where the behavior should change. (youtube_dl/YoutubeDL.py)
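
The three steps above can be sketched end to end with the standard library alone. The option name --my-retries, the helper names, and EngineStandIn are hypothetical; only the optparse calls are real APIs.

```python
import optparse


def parse_opts(argv):
    """Step 1: declare the option in an option group and parse argv."""
    parser = optparse.OptionParser()
    group = optparse.OptionGroup(parser, 'Download Options')
    group.add_option(
        '--my-retries', dest='my_retries', metavar='N', type='int', default=3,
        help='Hypothetical option: number of retries (default %default)')
    parser.add_option_group(group)
    return parser.parse_args(argv)


def build_ydl_opts(opts):
    """Step 2: map parsed values into the dict handed to the engine."""
    return {'my_retries': opts.my_retries}


class EngineStandIn(object):
    """Stand-in for YoutubeDL: reads options from self.params."""

    def __init__(self, params=None):
        self.params = params or {}

    def retries(self):
        # Step 3: look the value up where the behaviour should change.
        return self.params.get('my_retries', 3)
```

Reading the value with params.get() and a default keeps the engine usable by embedders who construct the params dict by hand and omit the new key.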

Why these technologies

  • Python 2/3 compatible code — Maximizes portability across Linux, macOS, and Windows without requiring users to manage Python versions; the codebase predates widespread Python 3 adoption.
  • Pure-Python AES (aes.py) — Avoids hard dependency on PyCrypto/cryptography for HLS decryption, keeping the tool installable with zero native dependencies.
  • Custom JavaScript interpreter (jsinterp.py) — YouTube obfuscates its signature algorithm in JS; rather than shipping a full JS engine like Node.js, a lightweight interpreter handles only the constructs YouTube actually uses.
  • optparse (not argparse) — Historical compatibility with Python 2.6 which lacks argparse; changing would break Python 2 support.
  • FFmpeg as external post-processor — Handling every audio/video codec in pure Python is impractical; delegating to FFmpeg gives full format support while keeping the Python codebase focused on extraction.

Trade-offs already made

  • Monolithic YoutubeDL god-object
    • Why: Simplifies passing shared state (params, logger, cache, cookies) across extraction/download/post-processing without complex dependency injection.
    • Consequence: Hard to unit-test in isolation; adding features risks unintended side effects across the pipeline.

Traps & gotchas

  1. Python 2/3 compatibility is strictly maintained: avoid any syntax or stdlib usage that breaks Python 2.6.
  2. ffmpeg must be installed separately for merging video+audio formats (common for YouTube 1080p+).
  3. The lazy extractor file (youtube_dl/extractor/lazy_extractors.py) is generated. Do not edit it manually; regenerate it via python devscripts/make_lazy_extractors.py.
  4. Site extractors must pass the test suite in test/test_download.py, which makes real network requests, so use --no-check-certificate or mock carefully in CI.
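
To make the lazy extractor idea concrete, here is a rough sketch of deferred loading. It is an illustration of the general technique, not the generated file's actual contents. The stub class carries only the URL regex and imports the real module when first instantiated. For the sake of a runnable example, the standard library class fractions.Fraction plays the role of the "real extractor", and the demo:// URL scheme is made up.

```python
import importlib
import re


class LazyLoadStub(object):
    """Stub that knows a URL pattern and where the real class lives."""

    _module = None     # dotted path of the module holding the real class
    _VALID_URL = None  # regex copied from the real class at generation time

    @classmethod
    def suitable(cls, url):
        # URL matching needs only the regex, so nothing is imported here.
        return re.match(cls._VALID_URL, url) is not None

    def __new__(cls, *args, **kwargs):
        # Import on first use and hand back an instance of the real class.
        module = importlib.import_module(cls._module)
        real_cls = getattr(module, cls.__name__)
        return real_cls(*args, **kwargs)


class Fraction(LazyLoadStub):
    # Stand-in target: the class of the same name in the stdlib module.
    _module = 'fractions'
    _VALID_URL = r'demo://fraction/'
```

Because the stub duplicates data from the real class, a hand edit to either side can make them drift apart, which is why the file is regenerated rather than edited.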

Concepts to learn

  • HLS (HTTP Live Streaming) segment decryption — Many sites serve video as AES-128 encrypted .ts segments via M3U8 playlists — youtube-dl's aes.py and HLS extractor logic must decrypt each segment, making this central to supporting most modern streaming sites
  • DASH (Dynamic Adaptive Streaming over HTTP) — YouTube and others serve video/audio as separate DASH streams that must be detected, selected by quality, and muxed — understanding MPD manifest parsing is essential for the YouTube extractor
  • n-sig / nsig JavaScript signature deciphering — YouTube obfuscates stream URLs with a JavaScript-computed signature that youtube-dl must extract and evaluate using a Python JS interpreter shim — the most frequently broken and fixed part of the YouTube extractor
  • InfoExtractor plugin pattern — The entire multi-site support relies on a self-registering plugin system where each extractor declares a _VALID_URL regex — understanding this pattern is mandatory for adding or debugging any site
  • Output template string DSL — youtube-dl implements its own %(field)s-based filename templating DSL parsed in YoutubeDL.py — contributors must understand it to handle edge cases in filename sanitization across OSes
  • Lazy module loading via code generation — With 1000+ extractors, importing all at startup is slow — devscripts/make_lazy_extractors.py generates a module that defers imports until a URL is matched, a non-obvious performance pattern new contributors must not break
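
As a small illustration of the output template concept listed above, the sketch below expands %(field)s placeholders with Python's mapping-based string formatting. The sanitize rule and the NA placeholder for missing fields are simplified assumptions for this example; the real logic in YoutubeDL.py handles many more cases.

```python
import collections
import re


def sanitize(value):
    """Replace characters that are unsafe in filenames (simplified rule)."""
    return re.sub(r'[\\/:*?"<>|]', '_', str(value))


def render_filename(template, info):
    """Expand %(field)s placeholders using sanitized info-dict values."""
    safe = dict((key, sanitize(val)) for key, val in info.items())
    # Missing fields fall back to a placeholder instead of raising KeyError.
    fields = collections.defaultdict(lambda: 'NA', safe)
    return template % fields
```

For example, the template '%(title)s-%(id)s.%(ext)s' applied to a title containing a slash produces a filename with the slash replaced, so the title cannot introduce an unintended directory.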

Related repos

  • yt-dlp/yt-dlp — Active fork of youtube-dl with faster extractor updates, additional features (SponsorBlock, chapters), and more frequent maintenance — the de facto successor
  • streamlink/streamlink — Alternative stream extractor focused on live streams and piping to media players rather than downloading to file
  • animelover1984/youtube-dl — Community fork with patches not yet merged upstream, useful for seeing pending extractor fixes
  • mikf/gallery-dl — Companion tool using similar extractor plugin architecture but focused on image galleries rather than video
  • rg3/youtube-dl — Original predecessor repo by the original author before the project moved to the ytdl-org organization

PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add unit tests for devscripts/utils.py helper functions

The devscripts/utils.py file contains utility functions used across build/dev scripts (e.g. read_version, write_file, etc.), but there are no corresponding unit tests for it in the test/ directory. Adding tests here would catch regressions in the release pipeline and developer tooling, which is critical since these scripts drive versioning, changelog generation, and GitHub releases.

  • [ ] Read devscripts/utils.py and enumerate all public functions (e.g. read_version, write_file, rmtree, etc.)
  • [ ] Create test/test_devscripts_utils.py mirroring the structure of existing test files like test/helper.py
  • [ ] Write unit tests for each function, using temporary directories/files where file I/O is involved
  • [ ] Ensure the new test file is discovered by the existing CI workflow in .github/workflows/ci.yml (check how test discovery is configured)
  • [ ] Run the full test suite locally with python -m pytest test/ and confirm all pass

Add a GitHub Actions workflow for automated binary build and release artifact upload

The repo has a CI workflow at .github/workflows/ci.yml for testing, but there is no workflow for building the self-contained youtube-dl binary (via py2exe / PyInstaller) and uploading release artifacts automatically. The devscripts/release.sh and devscripts/wine-py2exe.sh scripts exist for this purpose but are only run manually. Automating this as a GitHub Actions workflow on tag pushes would make releases reproducible and less error-prone.

  • [ ] Examine devscripts/release.sh and devscripts/wine-py2exe.sh to understand the build steps and dependencies required
  • [ ] Create .github/workflows/release.yml triggered on 'push' to tags matching 'v*'
  • [ ] Add a job that installs Python, the zip command-line utility, and required dependencies, then runs make youtube-dl to produce the standalone binary
  • [ ] Add an upload-artifact or gh-release step using actions/upload-release-asset to attach the binary to the GitHub Release
  • [ ] Reference devscripts/create-github-release.py to understand how release notes are structured and integrate that into the workflow
  • [ ] Test the workflow on a draft tag in a fork before opening the PR

Add a missing .github/ISSUE_TEMPLATE_tmpl/6_question.md template to match the deployed templates

The file structure shows that .github/ISSUE_TEMPLATE/ contains 6 templates (including 6_question.md), and .github/ISSUE_TEMPLATE_tmpl/ contains the source templates used to generate them — but ISSUE_TEMPLATE_tmpl only has 5 files (1 through 5), missing 6_question.md. The devscripts/make_issue_template.py script presumably generates ISSUE_TEMPLATE from ISSUE_TEMPLATE_tmpl, so the missing source template means the question template cannot be regenerated or updated via that script, breaking the templating pipeline.

  • [ ] Read devscripts/make_issue_template.py to understand how it reads from .github/ISSUE_TEMPLATE_tmpl/ and generates .github/ISSUE_TEMPLATE/
  • [ ] Read the existing .github/ISSUE_TEMPLATE/6_question.md to understand its current content and structure
  • [ ] Create .github/ISSUE_TEMPLATE_tmpl/6_question.md with the appropriate template source syntax consistent with the other _tmpl files (e.g. 5_feature_request.md)
  • [ ] Run python devscripts/make_issue_template.py and verify it regenerates .github/ISSUE_TEMPLATE/6_question.md correctly
  • [ ] Confirm the generated file matches (or intentionally improves upon) the existing 6_question.md and commit both files

Good first issues

  1. Several extractor files in youtube_dl/extractor/ lack corresponding test cases in test/test_download.py. Pick a small site extractor (e.g. a niche news site) and add a _TEST dict with a real URL and expected metadata fields.
  2. devscripts/make_supportedsites.py generates the supported sites list, but the output formatting could be improved to include extractor feature flags (live support, geo-restriction handling), a visible gap vs. yt-dlp's feature matrix.
  3. The devscripts/cli_to_api.py helper script lacks inline documentation explaining the param mapping between CLI flags and the YoutubeDL options dict. Adding docstrings would directly help embedders.

Recent commits

  • 956b8c5 — [YouTube] Bug-fix for c1f5c3274a (dirkf)
  • d5f5611 — [core] Re-work format_note display in format list with abbreviated codec name (dirkf)
  • d0283f5 — [YouTube] Revert forcing player JS by default (dirkf)
  • 6315f4b — [utils] Support additional codecs and dynamic_range (dirkf)
  • aeb1254 — [YouTube] Fix playlist thumbnail extraction (dirkf)
  • 25890f2 — [YouTube] Improve detection of geo-restriction (dirkf)
  • d65882a — [YouTube] Improve mark_watched() (dirkf)
  • 39378f7 — [YouTube] Fix incorrect chapter extraction (dirkf)
  • 6f5d4c3 — [YouTube] Improve targeting of pre-roll wait (dirkf)
  • 5d445f8 — [YouTube] Re-work client selection (dirkf)

Security observations

  • High · Insecure Binary Download via HTTP/HTTPS without Integrity Verification — README.md (installation instructions). The README instructs users to download the youtube-dl binary directly using curl or wget from yt-dl.org without any checksum or signature verification. This exposes users to man-in-the-middle attacks or supply chain compromises where a malicious binary could be served instead of the legitimate one. Fix: Provide SHA256 checksums or GPG signatures alongside the download instructions and instruct users to verify them before executing. Consider using a package manager with built-in integrity checks.
  • High · Arbitrary Code Execution via Shell Commands in devscripts — devscripts/release.sh, devscripts/buildserver.py, devscripts/wine-py2exe.sh. Several devscripts (release.sh, wine-py2exe.sh, posix-locale.sh, run_tests.sh) execute shell commands, potentially with user-supplied or externally fetched data. If any inputs are not properly sanitized, this could lead to command injection. The buildserver.py and other scripts may invoke subprocesses or shell commands unsafely. Fix: Audit all subprocess and shell invocations to ensure inputs are properly sanitized or use subprocess with argument lists (avoiding shell=True in Python). Avoid passing unsanitized external data to shell commands.
  • High · Potential Command Injection via URL/Filename Inputs — youtube_dl/ (core downloader and postprocessors). youtube-dl processes user-supplied URLs and filenames which are passed to external programs (e.g., ffmpeg, avconv, rtmpdump). If filenames or URLs are improperly escaped before being passed to subprocess calls with shell=True, this could allow command injection attacks. The output template system allows user-controlled strings to be used in file paths. Fix: Ensure all external process invocations use subprocess with a list of arguments rather than shell=True. Sanitize and validate filenames and URLs before use. Restrict dangerous characters in output templates.
  • Medium · Unverified SSL/TLS Certificate Handling — youtube_dl/ (network request handling). youtube-dl provides a --no-check-certificate option that disables SSL certificate verification. While this is user-initiated, it significantly weakens transport security and could expose users to MITM attacks. Additionally, the default configuration may not enforce strict certificate pinning for all extractor requests. Fix: Keep SSL verification enabled by default, clearly warn users of the risks when using --no-check-certificate, and consider removing or deprecating this flag. Implement certificate pinning for critical endpoints.
  • Medium · Hardcoded API Keys and Credentials Risk in Extractors — youtube_dl/extractor/ (various extractor files). The large number of site extractors (youtube_dl/extractor/) commonly contain hardcoded API keys, client IDs, client secrets, and other credentials used to interact with video platform APIs. These credentials, if compromised or rotated by the platform, could expose the application or users to unauthorized access or service disruption. Fix: Move API credentials to configuration files or environment variables. Implement a secrets management strategy. Regularly rotate credentials and monitor for unauthorized use.
  • Medium · Insecure Temporary File Handling — youtube_dl/ (file download logic). During download operations, youtube-dl creates temporary files using potentially predictable naming patterns. On multi-user systems, this could lead to symlink attacks or race conditions (TOCTOU) where an attacker replaces the temporary file before it is moved to its final destination. Fix: Use Python's tempfile module with secure defaults (mkstemp) to create temporary files with unpredictable names and proper permissions. Ensure atomic file operations when moving temporary files to their final destinations.
  • Medium · GitHub Actions CI Pipeline Without OIDC or Pinned Actions — .github/workflows/ci.yml. The CI workflow in .github/workflows/ci.yml may use GitHub Actions without pinning action versions to specific commit SHAs. Unpinned actions (using branch or tag references like @main or @v1) can be compromised if the referenced repository is taken over, leading to supply chain attacks in the build pipeline. Fix: Pin all GitHub Actions to specific commit SHAs instead of mutable tags or branch names. Use tools like Dependabot or Renovate to keep pinned versions updated
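
The advice above about argument lists versus shell=True can be demonstrated in isolation. In this sketch the Python interpreter stands in for an external tool such as ffmpeg, and echo_argument is a hypothetical helper. It says nothing about how youtube-dl currently invokes its external programs.

```python
import subprocess
import sys


def echo_argument(value):
    """Run a child process with `value` as a single argv entry (no shell)."""
    cmd = [
        sys.executable, '-c',
        'import sys; sys.stdout.write(sys.argv[1])',
        value,
    ]
    # With a list and the default shell=False, no shell parses the string,
    # so metacharacters such as ; and $( ) are passed through literally.
    return subprocess.check_output(cmd).decode('utf-8')
```

A filename like 'clip; echo injected $(whoami).mp4' reaches the child process unchanged as one argument, whereas interpolating it into a shell command string would let the shell interpret it.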

LLM-derived; treat as a starting point, not a security audit.

Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.
