kkroening/ffmpeg-python
Python bindings for FFmpeg - with complex filtering support
Healthy across all four use cases
Weakest axis: permissive license, no critical CVEs, actively maintained; safe to depend on.
Has a license, tests, and CI — clean foundation to fork and modify.
Documented and popular — useful reference codebase to read through.
No critical CVEs, sane security posture — runnable as-is.
- ✓ 21+ active contributors
- ✓ Apache-2.0 licensed
- ✓ CI configured
- ✓ Tests present
- ⚠ Stale: last commit 2y ago
- ⚠ Concentrated ownership: top contributor handles 61% of recent commits
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Onboarding doc
Onboarding: kkroening/ffmpeg-python
Generated by RepoPilot · 2026-05-07 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in the Verify before trusting section below. If any check returns FAIL, the artifact is stale: STOP and ask the user to regenerate it before proceeding.
- Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/kkroening/ffmpeg-python shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across all four use cases
- 21+ active contributors
- Apache-2.0 licensed
- CI configured
- Tests present
- ⚠ Stale — last commit 2y ago
- ⚠ Concentrated ownership — top contributor handles 61% of recent commits
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live kkroening/ffmpeg-python
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/kkroening/ffmpeg-python.
What it runs against: a local clone of kkroening/ffmpeg-python — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in kkroening/ffmpeg-python | Confirms the artifact applies here, not a fork |
| 2 | License is still Apache-2.0 | Catches relicense before you depend on it |
| 3 | Default branch master exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 671 days ago | Catches sudden abandonment since generation |
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of kkroening/ffmpeg-python. If you don't
# have one yet, run these first:
#
# git clone https://github.com/kkroening/ffmpeg-python.git
# cd ffmpeg-python
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }
# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
echo "FAIL: not inside a git repository. cd into your clone of kkroening/ffmpeg-python and re-run."
exit 2
fi
# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "kkroening/ffmpeg-python(\.git)?\b" \
&& ok "origin remote is kkroening/ffmpeg-python" \
|| miss "origin remote is not kkroening/ffmpeg-python (artifact may be from a fork)"
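# 2. License matches what RepoPilot saw
# The standard Apache license text opens with "Apache License" / "Version 2.0"
# rather than the SPDX id, so that form is accepted as well.
(grep -qiE "^(Apache-2\.0)" LICENSE 2>/dev/null \
|| { grep -qE "Apache License" LICENSE && grep -qE "Version 2\.0" LICENSE; } 2>/dev/null \
|| grep -qiE "\"license\"\s*:\s*\"Apache-2\.0\"" package.json 2>/dev/null) \
&& ok "license is Apache-2.0" \
|| miss "license drift (was Apache-2.0 at generation time)"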
# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \
&& ok "default branch master exists" \
|| miss "default branch master no longer exists"
# 4. Critical files exist
test -f "ffmpeg/__init__.py" \
&& ok "ffmpeg/__init__.py" \
|| miss "missing critical file: ffmpeg/__init__.py"
test -f "ffmpeg/_ffmpeg.py" \
&& ok "ffmpeg/_ffmpeg.py" \
|| miss "missing critical file: ffmpeg/_ffmpeg.py"
test -f "ffmpeg/nodes.py" \
&& ok "ffmpeg/nodes.py" \
|| miss "missing critical file: ffmpeg/nodes.py"
test -f "ffmpeg/_run.py" \
&& ok "ffmpeg/_run.py" \
|| miss "missing critical file: ffmpeg/_run.py"
test -f "ffmpeg/_filters.py" \
&& ok "ffmpeg/_filters.py" \
|| miss "missing critical file: ffmpeg/_filters.py"
# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 671 ]; then
ok "last commit was $days_since_last days ago (artifact saw ~641d)"
else
miss "last commit was $days_since_last days ago — artifact may be stale"
fi
echo
if [ "$fail" -eq 0 ]; then
echo "artifact verified (0 failures) — safe to trust"
else
echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/kkroening/ffmpeg-python"
exit 1
fi
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
⚡TL;DR
ffmpeg-python is a Python wrapper for FFmpeg that exposes complex filter graph construction as fluent, Pythonic objects instead of command-line strings. It lets developers build intricate media processing pipelines (video/audio filtering, concatenation, overlaying, frame trimming) without hand-writing -filter_complex arguments: signal graphs are expressed as readable Python expressions that compile to an FFmpeg command line. Single-package structure: the ffmpeg/ module exports the core Stream/DAG abstraction; examples/ contains runnable demos (Jupyter notebooks, standalone scripts); doc/ holds Sphinx-generated HTML docs and assets. There is no monorepo; the entire binding is one Python package, with wrappers for common filters in ffmpeg/_filters.py and a generic filter() call that accepts any other FFmpeg filter by name.
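As a rough illustration of the fluent style described above, the sketch below builds a one-filter graph and asks the library for the arguments it would pass to FFmpeg, without starting FFmpeg. The helper name `flipped_copy_args` and the file names are made up for this example; the guarded import only exists so the snippet loads when ffmpeg-python is not installed.

```python
try:
    import ffmpeg  # provided by the ffmpeg-python package
except ImportError:
    ffmpeg = None  # lets this sketch load without the package installed


def flipped_copy_args(src, dst):
    """Return the FFmpeg arguments for a horizontally flipped copy of src.

    Hypothetical helper for illustration; requires ffmpeg-python.
    """
    stream = ffmpeg.input(src).hflip().output(dst)
    # get_args() only builds the argument list; FFmpeg is not started.
    return stream.get_args()
```

Calling `flipped_copy_args("in.mp4", "out.mp4")` should yield a list that starts with the input arguments, contains a `-filter_complex` entry mentioning `hflip`, and ends with the output path.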
👥Who it's for
Video engineers, media processing developers, and data scientists who need to programmatically compose FFmpeg filters (e.g., building video thumbnails, applying effects chains, or constructing multi-input signal graphs) without deep FFmpeg CLI expertise.
🌱Maturity & risk
Mature but stale: the repo has CI via GitHub Actions (.github/workflows/ci.yml), documentation in doc/html/, and real-world examples (examples/get_video_thumbnail.py, examples/facetime.py), suggesting production use is viable. However, the last commit was roughly two years before this report was generated, one contributor (kkroening) accounts for 61% of recent commits, and the Python-only, subprocess-based approach (no C extension) may limit some advanced FFmpeg features.
Moderate risk: the library depends on a system-installed FFmpeg (it drives the binary through a subprocess rather than linking the FFmpeg libraries), so behavior can vary across environments and FFmpeg versions. No lock file or pinned dependency versions are visible in the repo root; the listed dependencies (gevent, Pillow, etc.) carry no explicit version constraints. Concentrated ownership is a bus-factor risk; check GitHub Issues for stale tickets before production adoption.
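Because the binaries come from the host system, a pre-flight check can fail fast with a clear message. The sketch below uses only the standard library; the function names are illustrative and not part of ffmpeg-python.

```python
import shutil


def find_ffmpeg_tools(names=("ffmpeg", "ffprobe")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(found):
    """Return the sorted names of tools that could not be located."""
    return sorted(name for name, path in found.items() if path is None)


if __name__ == "__main__":
    absent = missing_tools(find_ffmpeg_tools())
    if absent:
        print("not on PATH:", ", ".join(absent))
```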
Active areas of work
The most recent commits (listed under Recent commits below) are CI upgrades and typo fixes, and the verification script above records the last commit as about 641 days before generation. Review GitHub Actions workflow runs and recent commits on the master branch to assess current activity; check open PRs for in-flight work.
🚀Get running
git clone https://github.com/kkroening/ffmpeg-python.git
cd ffmpeg-python
pip install -e .
# Requires FFmpeg to be installed system-wide (apt install ffmpeg, brew install ffmpeg, etc.)
python examples/get_video_thumbnail.py
Daily commands:
No Makefile is visible in the file list; install via pip install -e . and then execute scripts directly: python examples/get_video_thumbnail.py, or work in a Jupyter notebook (see examples/ffmpeg-numpy.ipynb for an interactive demo).
🗺️Map of the codebase
- ffmpeg/__init__.py: Entry point exporting the public API; every user interaction starts here
- ffmpeg/_ffmpeg.py: input(), output(), merge_outputs(), and global-option helpers that open and terminate a stream chain
- ffmpeg/nodes.py: Stream, FilterableStream, and the node classes (input, filter, output); critical for understanding filter graph architecture
- ffmpeg/_run.py: Command compilation, FFmpeg process execution, and output handling; bridges Python abstractions to the actual ffmpeg binary
- ffmpeg/_filters.py: Filter wrappers (hflip, overlay, concat, and others) plus the generic filter() call; enables the fluent filter API
- ffmpeg/dag.py: Directed acyclic graph construction and traversal; ensures correct filter compilation order
- ffmpeg/_probe.py: ffprobe integration for media introspection; critical dependency for the get_video_thumbnail and info examples
🧩Components & responsibilities
- Stream & FilterableStream (Python classes in nodes.py; filter functions are attached as methods by the filter_operator decorator): Fluent interface for chaining filters and outputs; tracks parent node references
- Failure mode: Invalid filter names or arguments are not caught until .run() time
- DAG (nodes.py, dag.py) (Graph data structure with edge tracking): Represents filter graph topology; performs topological sort to ensure correct execution order
- Failure mode: Cycles in the filter graph will cause topological sort to fail at run time
- Command Compilation (_run.py, _utils.py) (Argument-list building with FFmpeg filter-syntax escaping): Converts the DAG into FFmpeg command-line arguments; handles escaping and pad routing syntax
- Failure mode: Malformed filter syntax or pad mismatches make FFmpeg exit non-zero; run() raises ffmpeg.Error carrying the captured stdout/stderr
- Process Execution (_run.py) (subprocess.Popen with pipe redirection): Spawns the FFmpeg subprocess and manages stdin/stdout/stderr pipes; run() returns a (stdout, stderr) tuple, run_async() returns the Popen object
- Failure mode: A missing FFmpeg binary or denied permission raises OSError from subprocess; a non-zero exit raises ffmpeg.Error
- Probe (_probe.py) (subprocess + JSON parsing): Invokes ffprobe with -show_format and -show_streams to extract media metadata
- Failure mode: An invalid media file makes ffprobe exit non-zero and raises ffmpeg.Error; a missing ffprobe binary raises OSError
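The ordering role of the DAG component can be sketched with the standard library. This is a toy model using `graphlib` (Python 3.9+), not the code in ffmpeg/dag.py; the node names are invented for the example.

```python
from graphlib import CycleError, TopologicalSorter


def compile_order(dependencies):
    """Return node names so that every node follows the nodes it depends on.

    `dependencies` maps a node name to the set of its upstream node names.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Toy graph: two inputs overlaid, then written to one output.
graph = {
    "overlay": {"input_main", "input_logo"},
    "output": {"overlay"},
}
order = compile_order(graph)

# A cycle makes the sort fail, mirroring the failure mode listed above.
try:
    compile_order({"a": {"b"}, "b": {"a"}})
    cycle_detected = False
except CycleError:
    cycle_detected = True
```

Any valid order places both inputs before `overlay` and `overlay` before `output`; the relative order of the two inputs is not fixed.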
🔀Data flow
Python user code → Stream objects (ffmpeg.input, filters, ffmpeg.output) → DAG → compiled FFmpeg arguments → FFmpeg subprocess
🛠️How to make changes
Add a new built-in filter function
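- Check whether a wrapper for the filter already exists, and verify that the filter exists in your ffmpeg binary (ffmpeg/_filters.py)
- For a filter without a wrapper, call ffmpeg.filter(stream, 'filter_name', *args, **kwargs) or stream.filter('filter_name', ...); the name is passed through to FFmpeg (ffmpeg/_filters.py)
- To add a dedicated wrapper, define a function modeled on the existing ones, decorated with @filter_operator() and listed in __all__ (ffmpeg/_filters.py)
- Each filter call creates a FilterNode and chains it into the DAG (ffmpeg/nodes.py)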
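A minimal sketch of the generic filter call, assuming ffmpeg-python is installed: the `fps` filter is applied by name through `.filter()`. The helper `fps_args` and the file names are hypothetical; the import is guarded only so the snippet loads without the package.

```python
try:
    import ffmpeg  # provided by the ffmpeg-python package
except ImportError:
    ffmpeg = None  # lets this sketch load without the package installed


def fps_args(src, dst, fps=25):
    """Build (but do not run) a command that resamples src to a fixed frame rate.

    Hypothetical helper; requires ffmpeg-python.
    """
    stream = (
        ffmpeg.input(src)
        # Filter name and options are handed to FFmpeg as given, so a typo
        # here is only reported when FFmpeg itself runs.
        .filter("fps", fps=fps, round="up")
        .output(dst)
    )
    return stream.get_args()
```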
Build a complex multi-input/multi-output filter graph
- Create input streams with ffmpeg.input('file1') and ffmpeg.input('file2') (ffmpeg/__init__.py)
- Apply filters and select pads where needed, e.g. stream['v'] or stream.video for video and stream['a'] or stream.audio for audio (ffmpeg/nodes.py)
- Create each output with ffmpeg.output(stream, 'out1.mp4') and combine several outputs into one command with ffmpeg.merge_outputs(out1, out2) (ffmpeg/_ffmpeg.py)
- Call .run(), which traverses the DAG, compiles the command, and executes it (ffmpeg/_run.py)
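The steps above can be sketched as follows, assuming ffmpeg-python is installed. Two inputs feed an overlay that goes to one file, while the first input is also flipped into a second file; `merge_outputs` joins both into a single FFmpeg invocation. The helper name and paths are placeholders.

```python
try:
    import ffmpeg  # provided by the ffmpeg-python package
except ImportError:
    ffmpeg = None  # lets this sketch load without the package installed


def two_output_args(main_path, logo_path, composed_out, flipped_out):
    """Build (but do not run) one command that writes two output files.

    Hypothetical helper; requires ffmpeg-python.
    """
    main = ffmpeg.input(main_path)
    logo = ffmpeg.input(logo_path)
    # Output 1: the logo overlaid on the main video.
    composed = ffmpeg.overlay(main, logo).output(composed_out)
    # Output 2: the main video flipped horizontally.
    flipped = main.hflip().output(flipped_out)
    return ffmpeg.merge_outputs(composed, flipped).get_args()
```

Reusing the output of a filter (rather than an input) in two places generally needs an explicit split filter in FFmpeg graphs; this sketch avoids that case by branching from the input.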
Execute FFmpeg with custom arguments or capture output
- Build your filter graph normally with input(), filters, output() (ffmpeg/__init__.py)
- Call .compile() instead of .run() to get the command as an argument list without executing it (ffmpeg/_run.py)
- Pass cmd (the ffmpeg executable to use), capture_stdout=True, and capture_stderr=True to run() for pipe-based I/O; run() returns a (stdout, stderr) tuple once FFmpeg exits (ffmpeg/_run.py)
- For data while FFmpeg is still running (e.g., frame data, logs), call run_async() with pipe_stdout=True or pipe_stderr=True and read from the returned subprocess.Popen object (ffmpeg/_run.py)
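A short sketch of the difference between compiling and running, assuming ffmpeg-python is installed. `preview_command` and `run_and_capture` are illustrative names; only the first is safe to call without an FFmpeg binary, since it never starts a process.

```python
try:
    import ffmpeg  # provided by the ffmpeg-python package
except ImportError:
    ffmpeg = None  # lets this sketch load without the package installed


def preview_command(src, dst):
    """Return the full command (executable first) as a list, without running it."""
    stream = ffmpeg.input(src).output(dst)
    return ffmpeg.compile(stream, overwrite_output=True)


def run_and_capture(src, dst):
    """Run FFmpeg and return its captured (stdout, stderr) as bytes."""
    stream = ffmpeg.input(src).output(dst)
    return ffmpeg.run(
        stream,
        capture_stdout=True,
        capture_stderr=True,
        overwrite_output=True,
    )
```

Printing the result of `preview_command` before calling `run_and_capture` is a cheap way to review exactly what will be executed.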
Query media properties before processing
- Call ffmpeg.probe('input.mp4') to inspect streams and codec details (ffmpeg/_probe.py)
- The returned dict contains a 'streams' array with width, height, duration, codec_name, etc. (ffmpeg/_probe.py)
- Use this metadata to conditionally apply filters or set output parameters (examples/video_info.py)
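To show how the probe result might be consumed, the sketch below picks the first video stream out of a dict shaped like ffprobe's JSON output. The `sample` data is hand-written for illustration, not real probe output, and the helper name is an assumption; in practice the dict would come from `ffmpeg.probe(path)`.

```python
def first_video_stream(probe_result):
    """Return the first stream whose codec_type is 'video', or None."""
    for stream in probe_result.get("streams", []):
        if stream.get("codec_type") == "video":
            return stream
    return None


# Hand-written stand-in for a probe result.
sample = {
    "streams": [
        {"codec_type": "audio", "codec_name": "aac"},
        {"codec_type": "video", "codec_name": "h264", "width": 1280, "height": 720},
    ],
    "format": {"duration": "12.5"},  # ffprobe reports many numbers as strings
}

video = first_video_stream(sample)
size = (video["width"], video["height"]) if video else None
```

Returning None instead of raising keeps audio-only files from crashing callers that only want to branch on the presence of video.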
🔧Why these technologies
- Python subprocess module — Spawns the FFmpeg binary without re-implementing its codec/filter logic; leverages native FFmpeg performance
- Directed Acyclic Graph (DAG) — Encodes complex filter topologies (e.g., overlay, split, concat) and ensures correct compilation order
- Generic filter() pass-through alongside hand-written wrappers: Any FFmpeg filter can be used by name without a dedicated wrapper, so new FFmpeg filters need no library change
- ffprobe JSON output: Enables media introspection (resolution, codecs, duration) without parsing stderr or writing temp files
⚖️Trade-offs already made
- Generic filter() pass-through instead of a dedicated wrapper for every FFmpeg filter
  - Why: Reduces maintenance burden; any filter the installed FFmpeg supports can be used by name
  - Consequence: IDE auto-completion covers only the wrapped filters; for the rest, developers must refer to FFmpeg docs
- Graph compilation happens at .run() time, not at filter-chain definition time
  - Why: Allows late binding and composition of complex graphs with multiple sources/sinks
  - Consequence: Errors in filter logic only surface at execution; no early validation
- Wraps the FFmpeg subprocess rather than embedding or reimplementing codecs
  - Why: Simplifies maintenance; users must have FFmpeg installed independently
  - Consequence: Library size is minimal, but deployment requires an external binary dependency
🚫Non-goals (don't propose these)
- Does not provide GUI or interactive editing of filter graphs
- Does not handle authentication or DRM-protected media
- Does not decode, encode, or filter media itself (it constructs FFmpeg commands and hands execution to the external binary)
- Does not manage FFmpeg installation or version compatibility (users install FFmpeg separately)
- Not intended for real-time streaming at sub-second latency
🪤Traps & gotchas
- System dependency: FFmpeg must be installed on the system PATH at runtime; the library does not bundle or manage FFmpeg installation.
- Version mismatch risk: Different FFmpeg versions have different filter names/parameters; because invocation is subprocess-based, a mismatch shows up only as a failed run with cryptic FFmpeg errors.
- DAG validation: No obvious static validation of filter chains; invalid graphs may only error at run time when the ffmpeg binary executes.
- Streaming I/O: Piping large videos through Python may incur buffering overhead (check subprocess pipe handling in the run() implementation).
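Since graph errors surface only when FFmpeg runs, it helps to capture stderr and report its last line. A hedged sketch, assuming ffmpeg-python is installed: `ffmpeg.Error` is the exception run() raises on a non-zero exit, while `last_error_line` and `run_or_explain` are made-up helper names.

```python
try:
    import ffmpeg  # provided by the ffmpeg-python package
except ImportError:
    ffmpeg = None  # lets this sketch load without the package installed


def last_error_line(stderr_bytes):
    """Return the last non-empty line of FFmpeg's stderr, or '' if there is none."""
    text = (stderr_bytes or b"").decode("utf-8", errors="replace")
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return lines[-1] if lines else ""


def run_or_explain(stream):
    """Run the stream; return None on success or a one-line reason on failure."""
    try:
        stream.run(capture_stdout=True, capture_stderr=True)
    except ffmpeg.Error as exc:
        # stderr is only populated because capture_stderr=True was passed.
        return last_error_line(exc.stderr)
    return None
```

The last stderr line is only a heuristic summary; the full text is usually needed to diagnose version-specific filter problems.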
🏗️Architecture
💡Concepts to learn
- Directed Acyclic Graph (DAG) — ffmpeg-python represents filter chains as DAGs to support multi-input/multi-output scenarios (concat, overlay, side-by-side) and compile them to FFmpeg's filter_complex syntax; understanding DAG topology is essential for debugging complex graphs.
- FFmpeg filtergraph syntax and filter_complex — The library generates FFmpeg's -filter_complex argument strings under the hood; knowing the syntax (pad notation [0], filter chains with ;, concat with [v0][v1]concat=n=2) helps debug generated commands and understand limitations.
- Fluent/Builder Interface — ffmpeg-python uses method chaining (input().hflip().output().run()) for readability; this is a core design pattern that makes the API Pythonic but requires understanding how intermediate Stream objects compose.
- Lazy Evaluation and Code Generation — Stream objects do not execute FFmpeg immediately; the entire graph is built in Python, then compiled to a list of command-line arguments that is only executed on .run(); this allows inspection before invocation.
- Filter Operators and the Generic filter() Call — Common filters (hflip, concat, overlay, etc.) are defined as functions in ffmpeg/_filters.py and attached to Stream as methods by a decorator; any other FFmpeg filter is reachable through filter('name', ...), so the library supports filters it has no explicit wrapper for.
- Subprocess and Pipe-Based I/O — ffmpeg-python invokes FFmpeg as a subprocess and pipes large video streams through stdin/stdout; understanding subprocess buffering, pipe limits, and stream handling is crucial for avoiding deadlocks or memory bloat on large files.
- Signal Graph / Multi-Input Filtering — Real-world video compositing (overlaying, side-by-side, picture-in-picture) requires merging multiple input streams with precise alignment; the README's graph1.png example shows how ffmpeg-python elegantly models these complex topologies.
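The fluent-interface, lazy-evaluation, and filtergraph-syntax concepts above can be tied together with a toy builder. This is a deliberately simplified, hypothetical class (`Chain`) that handles only linear chains; it is not how ffmpeg-python is implemented, but the string it emits follows FFmpeg's pad-label notation.

```python
class Chain:
    """Toy linear filter chain: records filters, compiles to arguments on demand."""

    def __init__(self, source, filters=()):
        self.source = source
        self.filters = tuple(filters)

    def filter(self, name, *args):
        # Return a new Chain so intermediate objects can be reused safely.
        return Chain(self.source, self.filters + ((name, args),))

    def compile(self, dst):
        """Build the argument list; nothing is executed."""
        cmd = ["ffmpeg", "-i", self.source]
        if self.filters:
            parts, label = [], "0"  # "[0]" refers to the first input
            for index, (name, fargs) in enumerate(self.filters):
                spec = name
                if fargs:
                    spec += "=" + ":".join(str(a) for a in fargs)
                out = f"s{index}"
                parts.append(f"[{label}]{spec}[{out}]")
                label = out
            cmd += ["-filter_complex", ";".join(parts), "-map", f"[{label}]"]
        cmd.append(dst)
        return cmd


command = Chain("in.mp4").filter("hflip").filter("scale", 640, -1).compile("out.mp4")
```

Here `command` holds the `-filter_complex` value `[0]hflip[s0];[s0]scale=640:-1[s1]`, mapped to the final label `[s1]`.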
🔗Related repos
- imageio/imageio-ffmpeg: Alternative FFmpeg wrapper focused on image I/O; does not expose filter graphs but is lighter-weight for simple video read/write
- skvideo/scikit-video: Scientific video I/O with some filter support; overlaps on video frame access and processing but lacks complex filter graph composition
- kkroening/ffmpeg-python-examples: Likely companion repo with extended real-world examples (if it exists; check GitHub for kkroening's other repos)
- fluent-ffmpeg/node-fluent-ffmpeg: JavaScript equivalent with fluent API; reference for cross-language design patterns in FFmpeg wrapping
- openai/whisper: Common use case: ffmpeg-python is often used to preprocess audio inputs for Whisper transcription pipelines
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add comprehensive unit tests for ffmpeg/_filters.py
The verdict above reports tests as present, and ffmpeg/_filters.py holds the filter wrappers whose behavior most needs validation. A new contributor could extend the existing suite under ffmpeg/tests/ with unit tests covering filter chain composition, parameter passing, and common filter combinations (scale, overlay, concat, etc.) to catch regressions in filter graph construction.
- [ ] Add test_filters.py next to the existing tests in ffmpeg/tests/
- [ ] Add tests for basic filters (hflip, scale, fps) from ffmpeg/_filters.py
- [ ] Add tests for complex filter chains (e.g., split + overlay combinations)
- [ ] Test filter parameter validation and type coercion
- [ ] Integrate tests into .github/workflows/ci.yml if not already present
Add type hints and mypy validation to core modules
The codebase (ffmpeg/__init__.py, ffmpeg/_ffmpeg.py, ffmpeg/_run.py, ffmpeg/_utils.py) lacks type annotations, making it harder for users to understand APIs and catch errors early. Adding gradual type hints with mypy CI validation would improve developer experience and catch bugs.
- [ ] Add type hints to ffmpeg/_run.py (run() and compile() functions)
- [ ] Add type hints to ffmpeg/_utils.py utility functions
- [ ] Add type hints to ffmpeg/_ffmpeg.py Stream class methods
- [ ] Create pyproject.toml or setup.cfg with mypy configuration
- [ ] Add mypy check to .github/workflows/ci.yml
Create integration tests demonstrating real-world examples from examples/ directory
The examples/ directory has several real-world scripts (video_info.py, get_video_thumbnail.py, split_silence.py, tensorflow_stream.py) but no corresponding integration tests. New contributor could create automated tests validating these examples work correctly, catching API regressions.
- [ ] Create tests/integration_test_examples.py
- [ ] Add test for examples/video_info.py with sample media input
- [ ] Add test for examples/get_video_thumbnail.py output validation
- [ ] Add test for examples/show_progress.py with timeout handling
- [ ] Mock or fixture external dependencies (tensorflow, google-cloud-speech) where needed
- [ ] Document how to run integration tests in CONTRIBUTING.md
🌿Good first issues
- Add type hints to the main Stream class in ffmpeg/nodes.py and the filter functions in ffmpeg/_filters.py; no .pyi stubs are visible, which makes IDE autocomplete poor.
- Extend the test suite under ffmpeg/tests/ to cover basic Stream construction (input → hflip → output) and complex graphs (concat, overlay) to catch regressions and document expected behavior.
- Expand examples/ with a documented 'recipes' file (e.g., extract frames at intervals, create animated GIF, extract audio track, apply watermark) showing common real-world patterns beyond the current examples.
⭐Top contributors
- @kkroening — 61 commits
- @depau — 9 commits
- @cclauss — 5 commits
- @Jacotsu — 3 commits
- @raulpy271 — 2 commits
📝Recent commits
- df129c7: Let's implicitly fix a typo (#681) (cclauss)
- 35886c9: Upgrade GitHub Actions again (#679) (cclauss)
- ef00863: Fix Black in GHA for Python 2.7 (#680) (kkroening)
- ed70f2e: Upgrade GitHub Actions (#643) (cclauss)
- fc41f4a: Fix heigth -> height typo (#596) (ljhcage)
- 6189cd6: Import ABC from collections.abc for Python 3.9+ compatibility (#330) (tirkarthi)
- cb9d400: Add FFmpeg installation instructions (#642) (kkroening)
- 29b6f09: Use GitHub Actions for CI. (#641) (kkroening)
- fd1da13: Re-apply Black formatting, and wrap docstrings at ~88 columns. (#639) (kkroening)
- f307972: Merge pull request #494 from kkroening/revert-493-revert-430-master (depau)
🔒Security observations
The ffmpeg-python project has moderate security concerns. The primary risk is command injection through unvalidated subprocess execution of FFmpeg commands, which could be exploited if user inputs are not properly sanitized. Additional risks include unrestricted file path access, dynamic filter construction without validation, and unpinned dependency versions. While the project structure is clean and the library serves a specific purpose, security hardening is needed around input validation, subprocess execution, and dependency management. No hardcoded credentials or obvious infrastructure misconfigurations were detected.
- High · Command Injection Risk in FFmpeg Subprocess Execution (ffmpeg/_run.py). The ffmpeg-python library wraps FFmpeg command execution. Without proper input validation and sanitization in the _run.py module, user-supplied inputs (file paths, filter parameters) could be passed unsanitized to subprocess calls, enabling command injection attacks. Fix: Ensure all user inputs are properly validated and escaped before being passed to subprocess calls. Use subprocess with shell=False and pass arguments as a list rather than concatenated strings. Implement strict input validation for file paths and filter parameters.
- High · Arbitrary Code Execution via Unsafe Filter Construction (ffmpeg/_filters.py). The dynamic filter construction in ffmpeg/_filters.py may allow attackers to inject malicious FFmpeg filter syntax if user input is not properly validated. FFmpeg filters support complex syntax that could be exploited. Fix: Implement strict validation and sanitization for all filter parameters. Use a whitelist approach for allowed filter names and parameters. Avoid directly concatenating user input into filter strings.
- Medium · Unvalidated File Path Input (ffmpeg/_ffmpeg.py). The input() and output() functions in ffmpeg/_ffmpeg.py accept file paths without apparent validation. Attackers could potentially specify arbitrary file paths, including those outside intended directories, or use path traversal sequences. Fix: Implement path validation to ensure file operations are restricted to intended directories. Use os.path.abspath() and verify paths don't escape a base directory. Consider using pathlib with resolve() for cleaner path handling.
- Medium · Potential Dependency Vulnerabilities (requirements.txt, setup.py). Several dependencies listed (gevent, Pillow, google-cloud-speech) have historically contained security vulnerabilities. The project does not specify pinned versions in requirements.txt, risking automatic installation of vulnerable versions. Fix: Pin dependency versions to known-good releases in requirements.txt. Regularly scan dependencies with tools like pip-audit, safety, or Snyk. Implement automated dependency updates with security scanning in the CI/CD pipeline.
- Medium · Missing Input Type Validation (ffmpeg/nodes.py, ffmpeg/_run.py). The codebase may lack comprehensive type checking and input validation for stream objects and parameters passed between functions, potentially allowing unexpected types to reach subprocess execution. Fix: Implement strict type checking for all public API functions. Use type hints with mypy for static analysis. Add runtime validation for critical parameters before they reach subprocess calls.
- Low · Lack of Security Documentation (README.md, doc/src/index.rst). The README and documentation do not include security best practices or warnings about safe usage patterns, particularly regarding untrusted input handling. Fix: Add a security section to documentation warning users about command injection risks when using user-supplied inputs. Provide examples of safe and unsafe usage patterns.
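The path-validation fix suggested above can be sketched with pathlib. This is caller-side code, not part of ffmpeg-python; `resolve_inside` is a hypothetical helper that rejects any user-supplied path resolving outside a chosen base directory.

```python
from pathlib import Path


def resolve_inside(base_dir, user_path):
    """Resolve user_path under base_dir, raising ValueError if it escapes."""
    base = Path(base_dir).resolve()
    # An absolute user_path replaces base in the join, so it is caught below too.
    candidate = (base / user_path).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"{user_path!r} resolves outside {base}") from None
    return candidate
```

A typical use would be `ffmpeg.input(str(resolve_inside(media_root, requested_name)))`, where `media_root` and `requested_name` stand in for application-specific values. Symlinks are followed by `resolve()`, so a link that points outside the base directory is rejected as well.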
LLM-derived; treat as a starting point, not a security audit.
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.