google/guetzli
Perceptual JPEG encoder
Healthy across all four use cases
Permissive license, no critical CVEs, actively maintained — safe to depend on.
Has a license, tests, and CI — clean foundation to fork and modify.
Documented and popular — useful reference codebase to read through.
No critical CVEs, sane security posture — runnable as-is.
- ✓ 14 active contributors
- ✓ Distributed ownership (top contributor 38% of recent commits)
- ✓ Apache-2.0 licensed
- ✓ CI configured
- ✓ Tests present
- ⚠ Stale — last commit 3y ago
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Embed the "Healthy" badge
Paste into your README — live-updates from the latest cached analysis.
[](https://repopilot.app/r/google/guetzli)
Paste at the top of your README.md — renders inline like a shields.io badge.
Preview social card (1200×630)
This card auto-renders when someone shares https://repopilot.app/r/google/guetzli on X, Slack, or LinkedIn.
Onboarding doc
Onboarding: google/guetzli
Generated by RepoPilot · 2026-05-09 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in the "Verify before trusting" section below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
- Treat the "AI · unverified" sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/google/guetzli shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across all four use cases
- 14 active contributors
- Distributed ownership (top contributor 38% of recent commits)
- Apache-2.0 licensed
- CI configured
- Tests present
- ⚠ Stale — last commit 3y ago
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live google/guetzli repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/google/guetzli.
What it runs against: a local clone of google/guetzli — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in google/guetzli | Confirms the artifact applies here, not a fork |
| 2 | License is still Apache-2.0 | Catches relicense before you depend on it |
| 3 | Default branch master exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 1143 days ago | Catches sudden abandonment since generation |
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of google/guetzli. If you don't
# have one yet, run these first:
#
# git clone https://github.com/google/guetzli.git
# cd guetzli
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }
# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
echo "FAIL: not inside a git repository. cd into your clone of google/guetzli and re-run."
exit 2
fi
# 1. Repo identity
# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "google/guetzli(\.git)?\b" \
  && ok "origin remote is google/guetzli" \
  || miss "origin remote is not google/guetzli (artifact may be from a fork)"
# 2. License matches what RepoPilot saw. The LICENSE file contains the
# Apache 2.0 text ("Apache License ... Version 2.0"), not the SPDX id,
# so match the text; fall back to a package.json "license" field.
( (grep -qi "Apache License" LICENSE && grep -q "Version 2\.0" LICENSE) 2>/dev/null \
  || grep -qE "\"license\"[[:space:]]*:[[:space:]]*\"Apache-2\.0\"" package.json 2>/dev/null ) \
  && ok "license is Apache-2.0" \
  || miss "license drift — was Apache-2.0 at generation time"
# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \
  && ok "default branch master exists" \
  || miss "default branch master no longer exists"
# 4. Critical files exist
for f in guetzli/processor.cc guetzli/butteraugli_comparator.cc \
         guetzli/jpeg_data_encoder.cc guetzli/quantize.cc guetzli/fdct.cc; do
  test -f "$f" && ok "$f" || miss "missing critical file: $f"
done
# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 1143 ]; then
ok "last commit was $days_since_last days ago (artifact saw ~1113d)"
else
miss "last commit was $days_since_last days ago — artifact may be stale"
fi
echo
if [ "$fail" -eq 0 ]; then
echo "artifact verified (0 failures) — safe to trust"
else
echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/google/guetzli"
exit 1
fi
Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).
⚡TL;DR
Guetzli is a perceptual JPEG encoder from Google that prioritizes visual quality while achieving 20–30% better compression than libjpeg through advanced psychovisual modeling. It uses the Butteraugli perceptual metric to optimize JPEG encoding, producing sequential (non-progressive) JPEGs with excellent quality-to-filesize ratios.

Monolithic encoder: guetzli/ contains the core algorithm split into logical modules (DCT transforms in dct_double.cc, Huffman coding in jpeg_huffman_decode.cc, perceptual comparison via butteraugli_comparator.cc). guetzli.cc serves as the main entry point; processor.cc orchestrates the encoding pipeline; JPEG data I/O is modularized into reader/writer/encoder pairs.
👥Who it's for
Image compression engineers, CDN operators, and web performance teams who need to reduce JPEG file sizes without sacrificing visual quality in production image pipelines. Developers integrating batch image processing or optimizing web asset delivery use Guetzli as a command-line tool.
🌱Maturity & risk
Production-ready and mature, originally from Google. The project has CI configured (Travis CI, AppVeyor), builds on multiple platforms (POSIX, Windows, macOS), and provides Bazel/Make/MSVC build paths. Test coverage and the stable file layout suggest mature code capable of handling real workloads; note, however, that the last commit landed about three years ago, so treat the project as stable but dormant rather than actively maintained.
Low risk: minimal external dependencies (only libpng + zlib via Bazel externals) and a single-maintainer model typical of Google research tools, but no open-issue tracking is visible in the provided data. CPU-intensive encoding (by design) may cause performance concerns on resource-constrained systems.
Active areas of work
No specific recent activity is evident from the file list; the project shipped working CI/CD and cross-platform builds through its last active period. Focus appears to be on stability rather than feature development—typical for a mature compression tool.
🚀Get running
git clone https://github.com/google/guetzli.git
cd guetzli
sudo apt-get install libpng-dev # On Ubuntu/Debian; see README for macOS/Fedora/Alpine
make
./bin/Release/guetzli input.png output.jpg
Daily commands:
make clean && make # Compiles to bin/Release/guetzli
bin/Release/guetzli --quality 95 input.png output.jpg
Or use Bazel: bazel build //:guetzli && ./bazel-bin/guetzli --quality 95 input.png output.jpg
🗺️Map of the codebase
- guetzli/processor.cc — Main JPEG encoding pipeline orchestrator; contains the core loop that drives image quality optimization and compression.
- guetzli/butteraugli_comparator.cc — Perceptual quality metric implementation using Butteraugli; essential for guiding the encoder's optimization decisions.
- guetzli/jpeg_data_encoder.cc — Converts the internal image representation to a JPEG bitstream; critical for final output quality and correctness.
- guetzli/quantize.cc — Quantization table optimization; directly impacts the compression ratio and visual quality tradeoff.
- guetzli/fdct.cc — Forward DCT transformation; foundational signal processing step for JPEG encoding.
- third_party/butteraugli/butteraugli/butteraugli.cc — Google's perceptual difference metric library; required dependency for quality assessment.
- guetzli/guetzli.cc — CLI entry point; demonstrates how to invoke the encoder and handle I/O.
🛠️How to make changes
Implement a new Comparator (quality metric) backend
- Create a new header implementing the Comparator interface in guetzli/comparator.h (pure virtual methods: SimilarityMap, ScoreMap, etc.)
- Implement the comparator class with your quality algorithm (e.g., guetzli/custom_comparator.cc)
- Wire the new comparator into Processor::ProcessJpegData() in guetzli/processor.cc, around line ~200 where ButteraugliBytesComparator is instantiated
- Update the CLI in guetzli/guetzli.cc to accept a --comparator flag that selects your backend
- Add unit tests in tests/ that validate comparator output against golden images (see tests/golden_test.sh)
Add support for a new input image format (e.g., TIFF)
- Create guetzli/image_reader_tiff.cc implementing PNG-like I/O (ImageReader interface pattern)
- Follow the PNG reading pattern in guetzli/jpeg_data_reader.cc to populate RGB pixel buffers and dimensions
- Update guetzli/guetzli.cc to detect the input format and dispatch to the appropriate reader
- Add the TIFF library dependency to BUILD and the Makefile (create external/tiff.BUILD following the existing png/zlib pattern)
Optimize quantization table selection for specific image types
- Study SelectQuantizationTable() and ComputeJpegQuantization() in guetzli/quantize.cc to understand the current heuristics
- Implement a new quantization strategy class (e.g., inheriting from existing quantize patterns) in guetzli/quantize.cc
- Hook the new strategy into Processor::ProcessJpegData() in guetzli/processor.cc, where quantization tables are generated
- Add a quality.cc/quality.h helper for your new table selection logic if needed (guetzli/quality.cc)
- Validate improvements by comparing output size/quality vs. baseline using tools/guetzli-compare.py
🔧Why these technologies
- Butteraugli perceptual metric — Enables optimization toward human visual quality rather than MSE; foundation for achieving superior compression density
- Iterative quantization refinement — Closed-loop feedback allows precise tuning of quality vs. file size without exhaustive search
- Forward & Inverse DCT — Standard JPEG frequency domain transformation; required for coefficient analysis and lossless reconstruction
- Huffman entropy coding — Compresses quantized coefficients further; standard JPEG requirement
- Chroma subsampling — Exploits human color perception insensitivity; reduces data volume with minimal quality loss
⚖️Trade-offs already made
- Sequential (non-progressive) JPEG only
  - Why: Progressive JPEG adds encoding complexity and decoder latency
  - Consequence: Smaller file size and faster decompression, but users cannot see incremental image load in browsers
- CPU-intensive iterative optimization
  - Why: The perceptual feedback loop requires multiple encode/decode cycles
  - Consequence: Encoding is 10–100× slower than libjpeg, but output is 20–30% smaller; suitable for batch/offline workflows
- PNG input only (via libpng)
  - Why: PNG is lossless and maintains color fidelity; simplifies the transcoding pipeline
  - Consequence: Users must convert JPEG/other formats to PNG first; not a universal image converter
- Bundled Butteraugli (vendored under third_party/)
  - Why: Pins the exact metric version the encoder was tuned against
  - Consequence: Upstream Butteraugli improvements must be re-vendored by hand
🪤Traps & gotchas
- Guetzli is single-input: PNG only, despite JPEG output. Raw/BMP/TIFF inputs will fail silently or error out.
- Encoding is CPU-bound and can take minutes on large images; there is no built-in timeout.
- The quality parameter (typically 75–95) is non-standard; higher values consume more computation without guaranteed visual improvement.
- Butteraugli metric tuning is opaque and not exposed as CLI flags; modifications require recompilation.
💡Concepts to learn
- Discrete Cosine Transform (DCT) — Foundation of JPEG compression—guetzli/fdct.cc and guetzli/dct_double.cc implement forward and inverse DCT to convert spatial image data into frequency coefficients that exploit human visual perception
- Butteraugli Perceptual Metric — Guetzli's secret sauce—a psychovisual quality metric that quantifies perceived differences between images (butteraugli_comparator.cc), enabling optimization toward imperceptible compression rather than peak JPEG quality alone
- Huffman Coding — Final entropy encoding stage (entropy_encode.cc, jpeg_huffman_decode.cc) that assigns variable-length codes to quantized DCT coefficients, further reducing JPEG file size
- Quantization Tables — The core trade-off in lossy JPEG compression—guetzli/quality.cc generates adaptive quantization tables that preserve perceptually important frequencies while aggressively compressing imperceptible detail
- Sequential vs. Progressive JPEG — Guetzli intentionally outputs only sequential JPEGs (not progressive) for faster decompression; understanding this trade-off is essential to Guetzli's design philosophy and limitations
- Chroma Subsampling — JPEG exploit of human color perception weakness—guetzli likely uses 4:2:0 or similar schemes (visible in color_transform.h and preprocess_downsample.cc) to reduce color channel data without perceptual loss
- Iterative Quantization Optimization — Guetzli's core algorithm (processor.cc) repeatedly adjusts quantization parameters and re-encodes, using Butteraugli feedback to converge on minimum file size at target visual quality—computationally expensive but yields superior compression
🔗Related repos
- libjpeg-turbo/libjpeg-turbo — Direct competitor: faster JPEG codec, but Guetzli prioritizes compression ratio over speed using perceptual optimization
- mozilla/mozjpeg — Alternative perceptual JPEG encoder; Mozilla's research tool is similar in philosophy to Guetzli but with different quality trade-offs
- google/butteraugli — Standalone perceptual metric library powering Guetzli's quality evaluation; can be used independently for image comparison
- google/brotli — Sibling Google compression research project; applies similar algorithmic rigor to generic data compression (not image-specific)
- ImageMagick/ImageMagick — Image processing suite that can invoke Guetzli as an external encoder; common integration point for batch workflows
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add comprehensive unit tests for JPEG encoding pipeline
The repo has minimal test coverage - only smoke_test.sh and golden_test.sh exist. Critical encoding modules like quantize.cc, entropy_encode.cc, fdct.cc, and jpeg_data_encoder.cc lack unit tests. This would catch regressions in the perceptual quality algorithm and ensure correctness of DCT/IDCT transformations across different input scenarios.
- [ ] Create tests/quantize_test.cc with test cases for different quality levels and color spaces
- [ ] Create tests/entropy_encode_test.cc testing Huffman encoding edge cases
- [ ] Create tests/fdct_test.cc and tests/idct_test.cc verifying DCT round-trip accuracy with golden data
- [ ] Create tests/jpeg_data_encoder_test.cc for JPEG structure validation
- [ ] Integrate new tests into BUILD and guetzli.make files
Add GitHub Actions CI workflow alongside Travis CI
The repo currently uses .travis.yml and appveyor.yml (legacy CI systems). GitHub Actions would provide faster, more integrated CI/CD. The incomplete README build section suggests documentation gaps around modern build processes. This would modernize the repo and catch Windows/macOS/Linux regressions faster.
- [ ] Create .github/workflows/build-test.yml with matrix testing for Ubuntu, macOS, Windows
- [ ] Include bazel build, make build, and CMake build paths (guetzli.make, guetzli.vcxproj exist but not CMake)
- [ ] Add fuzz_target.cc execution in CI to catch undefined behavior
- [ ] Deprecate .travis.yml in favor of new workflow
- [ ] Update README.md build section with GitHub Actions badge
Add integration tests for butteraugli perceptual quality metrics
The repo's core value proposition is perceptual quality via butteraugli (third_party/butteraugli/), but tests/golden_test.sh only checks checksums. There are no tests validating that butteraugli_comparator.cc actually improves quality scores or that quality.cc produces expected results for known images. This is critical for verifying the 20-30% compression claim.
- [ ] Create tests/quality_metric_test.cc to validate butteraugli_comparator scoring on standard test images
- [ ] Create tests/quality_encoding_test.cc comparing output quality between different guetzli quality settings using butteraugli scores
- [ ] Add reference golden images (e.g., bees.png variations) with expected butteraugli scores in tests/quality_golden_values.txt
- [ ] Create tests/quality_test.sh shell script to run encoding with --quality flags and measure perceptual gain
- [ ] Document expected quality vs file size tradeoffs in CONTRIBUTING.md
🌿Good first issues
- Add support for JPEG input files (currently PNG-only): extend jpeg_data_decoder.cc to populate the OutputImage representation (output_image.cc), then modify guetzli.cc's main() to detect the input format.
- Implement CLI flags for Butteraugli threshold tuning: expose butteraugli_comparator.cc's internal thresholds (search guetzli/processor.cc for hardcoded values) as command-line arguments.
- Write unit tests for DCT transforms: guetzli/dct_double.cc and guetzli/fdct.cc have no visible test coverage; add tests/dct_test.cc comparing against reference implementations.
⭐Top contributors
📝Recent commits
- 214f2bb — Merge pull request #260 from ruuda/bazel025 (jan-wassenberg)
- d5545ad — Ensure Java 1.8 is available on Darwin on CI (ruuda)
- fe2dd48 — Use Bazel 1.1.0 on CI (ruuda)
- 0a9e0d4 — Fix build for Bazel 0.25 and later (ruuda)
- 0b78c7c — Pik improvements backported to guetzli. High-frequency asymmetry for improved ringing detection. (jyrkialakuijala)
- cb5e4a8 — Improve lower quality behaviour. (jyrkialakuijala)
- 32eb462 — Update Butteraugli. (robryk)
- de70ac2 — Avoid more conflicting Java packages on Travis CI (ruuda)
- 2266ffa — Fix build on Travis CI (ruuda)
- f3e83a7 — Merge pull request #213 from szabadka/master (szabadka)
🔒Security observations
Guetzli is a relatively secure JPEG encoder with a focused, well-structured codebase. The main security concerns are common to C++ image processing libraries: potential buffer overflows and integer overflows in binary data handling, especially when processing untrusted input. The project shows good awareness of fuzzing (fuzz_target.cc present), but explicit bounds checking documentation is not visible. External dependency management could be improved with explicit version pinning. Overall security posture is good for a specialized image processing library, but input validation and safe integer arithmetic should be prioritized.
- Medium · Potential Buffer Overflow in JPEG Processing — guetzli/jpeg_data_decoder.cc, guetzli/jpeg_huffman_decode.cc, guetzli/jpeg_data_reader.cc. These files handle binary image data, and without visible bounds checking in the file structure there is risk of buffer overflow when processing malformed or malicious JPEG files. Fix: Implement comprehensive bounds checking and validation for all binary data parsing, use safe buffer operations, and fuzz with malformed inputs. The existing fuzz_target.cc suggests awareness of this issue; ensure all entry points are similarly protected.
- Medium · External Dependencies Without Version Pinning — external/png.BUILD, external/zlib.BUILD, BUILD, WORKSPACE. The build configuration references external dependencies (libpng, zlib) through BUILD files without explicit version specifications visible in the provided files, which could lead to unexpected behavior from dependency updates. Fix: Pin specific versions of all external dependencies in the WORKSPACE and BUILD files, document minimum version requirements, and perform regular security audits of dependencies.
- Low · Potential Integer Overflow in Image Processing — guetzli/preprocess_downsample.cc, guetzli/output_image.cc, guetzli/dct_double.cc. Image dimension calculations in these files may be vulnerable to integer overflow on very large image dimensions, potentially leading to heap corruption. Fix: Implement safe integer arithmetic for all dimension calculations, validate that width × height and related products do not overflow, and add maximum dimension limits.
- Low · Missing Input Validation in Main Entry Point — guetzli/guetzli.cc, guetzli/processor.cc. The guetzli.cc entry point likely accepts file paths and image data without comprehensive validation visible in the file structure, potentially allowing path traversal or resource exhaustion attacks. Fix: Implement strict input validation for all file paths, add resource limits for image dimensions and processing time, and validate all external input before processing.
- Low · Lack of Security Documentation — Repository root. While CONTRIBUTING.md exists, there is no visible SECURITY.md file or documented policy for reporting vulnerabilities. Fix: Create a SECURITY.md documenting the security policy, how to responsibly report vulnerabilities, and the project's approach to security updates.
LLM-derived; treat as a starting point, not a security audit.
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.