google/guetzli
Perceptual JPEG encoder
Healthy across all four use cases
Permissive license, no critical CVEs, actively maintained — safe to depend on.
Has a license, tests, and CI — clean foundation to fork and modify.
Documented and popular — useful reference codebase to read through.
No critical CVEs, sane security posture — runnable as-is.
- ✓ 14 active contributors
- ✓ Distributed ownership (top contributor 38% of recent commits)
- ✓ Apache-2.0 licensed
- ✓ CI configured
- ✓ Tests present
- ⚠ Stale — last commit 3y ago
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Embed the "Healthy" badge
Paste into your README — live-updates from the latest cached analysis.
[](https://repopilot.app/r/google/guetzli)
Paste at the top of your README.md — renders inline like a shields.io badge.
Preview social card (1200×630)
This card auto-renders when someone shares https://repopilot.app/r/google/guetzli on X, Slack, or LinkedIn.
Onboarding doc
Onboarding: google/guetzli
Generated by RepoPilot · 2026-05-09 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in the "Verify before trusting" section below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
- Treat the "AI · unverified" sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/google/guetzli shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across all four use cases
- 14 active contributors
- Distributed ownership (top contributor 38% of recent commits)
- Apache-2.0 licensed
- CI configured
- Tests present
- ⚠ Stale — last commit 3y ago
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live google/guetzli repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/google/guetzli.
What it runs against: a local clone of google/guetzli — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in google/guetzli | Confirms the artifact applies here, not a fork |
| 2 | License is still Apache-2.0 | Catches relicense before you depend on it |
| 3 | Default branch master exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 1143 days ago | Catches sudden abandonment since generation |
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of google/guetzli. If you don't
# have one yet, run these first:
#
# git clone https://github.com/google/guetzli.git
# cd guetzli
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }
# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
echo "FAIL: not inside a git repository. cd into your clone of google/guetzli and re-run."
exit 2
fi
# 1. Repo identity
# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "google/guetzli(\.git)?\b" \
  && ok "origin remote is google/guetzli" \
  || miss "origin remote is not google/guetzli (artifact may be from a fork)"
# 2. License matches what RepoPilot saw. The LICENSE file contains the
# Apache 2.0 text ("Apache License ... Version 2.0"), not the SPDX id,
# so match the text; fall back to a package.json "license" field.
( (grep -qi "Apache License" LICENSE && grep -q "Version 2\.0" LICENSE) 2>/dev/null \
  || grep -qE "\"license\"[[:space:]]*:[[:space:]]*\"Apache-2\.0\"" package.json 2>/dev/null ) \
  && ok "license is Apache-2.0" \
  || miss "license drift — was Apache-2.0 at generation time"
# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \
  && ok "default branch master exists" \
  || miss "default branch master no longer exists"
# 4. Critical files exist
for f in guetzli/processor.cc guetzli/butteraugli_comparator.cc \
         guetzli/jpeg_data_encoder.cc guetzli/quantize.cc guetzli/fdct.cc; do
  test -f "$f" && ok "$f" || miss "missing critical file: $f"
done
# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 1143 ]; then
ok "last commit was $days_since_last days ago (artifact saw ~1113d)"
else
miss "last commit was $days_since_last days ago — artifact may be stale"
fi
echo
if [ "$fail" -eq 0 ]; then
echo "artifact verified (0 failures) — safe to trust"
else
echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/google/guetzli"
exit 1
fi
Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).
⚡TL;DR
Guetzli is a perceptual JPEG encoder from Google that prioritizes visual quality while achieving 20–30% better compression than libjpeg through advanced psychovisual modeling. It uses the Butteraugli perceptual metric to optimize JPEG encoding, producing sequential (non-progressive) JPEGs with excellent quality-to-filesize ratios.

Monolithic encoder: guetzli/ contains the core algorithm split into logical modules (DCT transforms in dct_double.cc, Huffman coding in jpeg_huffman_decode.cc, perceptual comparison via butteraugli_comparator.cc). guetzli.cc serves as the main entry point; processor.cc orchestrates the encoding pipeline; JPEG data I/O is modularized into reader/writer/encoder pairs.
👥Who it's for
Image compression engineers, CDN operators, and web performance teams who need to reduce JPEG file sizes without sacrificing visual quality in production image pipelines. Developers integrating batch image processing or optimizing web asset delivery use Guetzli as a command-line tool.
🌱Maturity & risk
Production-ready and mature, originally from Google. The project has CI configured (Travis CI, AppVeyor), builds on multiple platforms (POSIX, Windows, macOS), and provides Bazel/Make/MSVC build paths. Test coverage and the stable file layout suggest mature code capable of handling real workloads; note, however, that the last commit landed about three years ago, so treat the project as stable but dormant rather than actively maintained.
Low risk: minimal external dependencies (only libpng + zlib via Bazel externals) and a single-maintainer model typical of Google research tools, but no open-issue tracking is visible in the provided data. CPU-intensive encoding (by design) may cause performance concerns on resource-constrained systems.
Active areas of work
No specific recent activity is evident from the file list; the project shipped working CI/CD and cross-platform builds through its last active period. Focus appears to be on stability rather than feature development—typical for a mature compression tool.
🚀Get running
git clone https://github.com/google/guetzli.git
cd guetzli
sudo apt-get install libpng-dev # On Ubuntu/Debian; see README for macOS/Fedora/Alpine
make
./bin/Release/guetzli input.png output.jpg
Daily commands:
make clean && make # Compiles to bin/Release/guetzli
bin/Release/guetzli --quality 95 input.png output.jpg
Or use Bazel: bazel build //:guetzli && ./bazel-bin/guetzli --quality 95 input.png output.jpg
🗺️Map of the codebase
- guetzli/processor.cc — Main JPEG encoding pipeline orchestrator; contains the core loop that drives image quality optimization and compression.
- guetzli/butteraugli_comparator.cc — Perceptual quality metric implementation using Butteraugli; essential for guiding the encoder's optimization decisions.
- guetzli/jpeg_data_encoder.cc — Converts the internal image representation to a JPEG bitstream; critical for final output quality and correctness.
- guetzli/quantize.cc — Quantization table optimization; directly impacts the compression ratio and visual quality tradeoff.
- guetzli/fdct.cc — Forward DCT transformation; foundational signal processing step for JPEG encoding.
- third_party/butteraugli/butteraugli/butteraugli.cc — Google's perceptual difference metric library; required dependency for quality assessment.
- guetzli/guetzli.cc — CLI entry point; demonstrates how to invoke the encoder and handle I/O.
🛠️How to make changes
Implement a new Comparator (quality metric) backend
- Create a new header implementing the Comparator interface in guetzli/comparator.h (pure virtual methods: SimilarityMap, ScoreMap, etc.)
- Implement the comparator class with your quality algorithm (e.g., guetzli/custom_comparator.cc)
- Wire the new comparator into Processor::ProcessJpegData() in guetzli/processor.cc, around line ~200 where ButteraugliBytesComparator is instantiated
- Update the CLI in guetzli/guetzli.cc to accept a --comparator flag that selects your backend
- Add unit tests in tests/ that validate comparator output against golden images (see tests/golden_test.sh)
Add support for a new input image format (e.g., TIFF)
- Create guetzli/image_reader_tiff.cc implementing PNG-like I/O (ImageReader interface pattern)
- Follow the PNG reading pattern in guetzli/jpeg_data_reader.cc to populate RGB pixel buffers and dimensions
- Update guetzli/guetzli.cc to detect the input format and dispatch to the appropriate reader
- Add the TIFF library dependency to BUILD and the Makefile (create external/tiff.BUILD following the existing png/zlib pattern)
Optimize quantization table selection for specific image types
- Study SelectQuantizationTable() and ComputeJpegQuantization() in guetzli/quantize.cc to understand the current heuristics
- Implement a new quantization strategy class (e.g., inheriting from existing quantize patterns) in guetzli/quantize.cc
- Hook the new strategy into Processor::ProcessJpegData() in guetzli/processor.cc, where quantization tables are generated
- Add a quality.cc/quality.h helper for your new table selection logic if needed (guetzli/quality.cc)
- Validate improvements by comparing output size/quality vs. baseline using tools/guetzli-compare.py
🔧Why these technologies
- Butteraugli perceptual metric — Enables optimization toward human visual quality rather than MSE; foundation for achieving superior compression density
- Iterative quantization refinement — Closed-loop feedback allows precise tuning of quality vs. file size without exhaustive search
- Forward & Inverse DCT — Standard JPEG frequency domain transformation; required for coefficient analysis and lossless reconstruction
- Huffman entropy coding — Compresses quantized coefficients further; standard JPEG requirement
- Chroma subsampling — Exploits human color perception insensitivity; reduces data volume with minimal quality loss
⚖️Trade-offs already made
- Sequential (non-progressive) JPEG only
  - Why: Progressive JPEG adds encoding complexity and decoder latency
  - Consequence: Smaller file size and faster decompression, but users cannot see incremental image load in browsers
- CPU-intensive iterative optimization
  - Why: The perceptual feedback loop requires multiple encode/decode cycles
  - Consequence: Encoding is 10–100× slower than libjpeg, but output is 20–30% smaller; suitable for batch/offline workflows
- PNG input only (via libpng)
  - Why: PNG is lossless and maintains color fidelity; simplifies the transcoding pipeline
  - Consequence: Users must convert JPEG/other formats to PNG first; not a universal image converter
- Bundled Butteraugli (vendored under third_party/)
  - Why: Pins the exact metric version the encoder was tuned against
  - Consequence: Upstream Butteraugli improvements must be re-vendored by hand
🪤Traps & gotchas
- Guetzli is single-input: PNG only, despite JPEG output. Raw/BMP/TIFF inputs will fail silently or error out.
- Encoding is CPU-bound and can take minutes on large images; there is no built-in timeout.
- The quality parameter (typically 75–95) is non-standard; higher values consume more computation without guaranteed visual improvement.
- Butteraugli metric tuning is opaque and not exposed as CLI flags; modifications require recompilation.
💡Concepts to learn
- Discrete Cosine Transform (DCT) — Foundation of JPEG compression—guetzli/fdct.cc and guetzli/dct_double.cc implement forward and inverse DCT to convert spatial image data into frequency coefficients that exploit human visual perception
- Butteraugli Perceptual Metric — Guetzli's secret sauce—a psychovisual quality metric that quantifies perceived differences between images (butteraugli_comparator.cc), enabling optimization toward imperceptible compression rather than peak JPEG quality alone
- Huffman Coding — Final entropy encoding stage (entropy_encode.cc, jpeg_huffman_decode.cc) that assigns variable-length codes to quantized DCT coefficients, further reducing JPEG file size
- Quantization Tables — The core trade-off in lossy JPEG compression—guetzli/quality.cc generates adaptive quantization tables that preserve perceptually important frequencies while aggressively compressing imperceptible detail
- Sequential vs. Progressive JPEG — Guetzli intentionally outputs only sequential JPEGs (not progressive) for faster decompression; understanding this trade-off is essential to Guetzli's design philosophy and limitations
- Chroma Subsampling — JPEG exploit of human color perception weakness—guetzli likely uses 4:2:0 or similar schemes (visible in color_transform.h and preprocess_downsample.cc) to reduce color channel data without perceptual loss
- Iterative Quantization Optimization — Guetzli's core algorithm (processor.cc) repeatedly adjusts quantization parameters and re-encodes, using Butteraugli feedback to converge on minimum file size at target visual quality—computationally expensive but yields superior compression
🔗Related repos
- libjpeg-turbo/libjpeg-turbo — Direct competitor: faster JPEG codec, but Guetzli prioritizes compression ratio over speed using perceptual optimization
- mozilla/mozjpeg — Alternative perceptual JPEG encoder; Mozilla's research tool is similar in philosophy to Guetzli but with different quality trade-offs
- google/butteraugli — Standalone perceptual metric library powering Guetzli's quality evaluation; can be used independently for image comparison
- google/brotli — Sibling Google compression research project; applies similar algorithmic rigor to generic data compression (not image-specific)
- ImageMagick/ImageMagick — Image processing suite that can invoke Guetzli as an external encoder; common integration point for batch workflows
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add comprehensive unit tests for JPEG encoding pipeline
The repo has minimal test coverage - only smoke_test.sh and golden_test.sh exist. Critical encoding modules like quantize.cc, entropy_encode.cc, fdct.cc, and jpeg_data_encoder.cc lack unit tests. This would catch regressions in the perceptual quality algorithm and ensure correctness of DCT/IDCT transformations across different input scenarios.
- [ ] Create tests/quantize_test.cc with test cases for different quality levels and color spaces
- [ ] Create tests/entropy_encode_test.cc testing Huffman encoding edge cases
- [ ] Create tests/fdct_test.cc and tests/idct_test.cc verifying DCT round-trip accuracy with golden data
- [ ] Create tests/jpeg_data_encoder_test.cc for JPEG structure validation
- [ ] Integrate new tests into BUILD and guetzli.make files
Add GitHub Actions CI workflow alongside Travis CI
The repo currently uses .travis.yml and appveyor.yml (legacy CI systems). GitHub Actions would provide faster, more integrated CI/CD. The incomplete README build section suggests documentation gaps around modern build processes. This would modernize the repo and catch Windows/macOS/Linux regressions faster.
- [ ] Create .github/workflows/build-test.yml with matrix testing for Ubuntu, macOS, Windows
- [ ] Include bazel build, make build, and CMake build paths (guetzli.make, guetzli.vcxproj exist but not CMake)
- [ ] Add fuzz_target.cc execution in CI to catch undefined behavior
- [ ] Deprecate .travis.yml in favor of new workflow
- [ ] Update README.md build section with GitHub Actions badge
Add integration tests for butteraugli perceptual quality metrics
The repo's core value proposition is perceptual quality via butteraugli (third_party/butteraugli/), but tests/golden_test.sh only checks checksums. There are no tests validating that butteraugli_comparator.cc actually improves quality scores or that quality.cc produces expected results for known images. This is critical for verifying the 20-30% compression claim.
- [ ] Create tests/quality_metric_test.cc to validate butteraugli_comparator scoring on standard test images
- [ ] Create tests/quality_encoding_test.cc comparing output quality between different guetzli quality settings using butteraugli scores
- [ ] Add reference golden images (e.g., bees.png variations) with expected butteraugli scores in tests/quality_golden_values.txt
- [ ] Create tests/quality_test.sh shell script to run encoding with --quality flags and measure perceptual gain
- [ ] Document expected quality vs file size tradeoffs in CONTRIBUTING.md
🌿Good first issues
- Add support for JPEG input files (currently PNG-only): extend jpeg_data_decoder.cc to populate the OutputImage representation (output_image.cc), then modify guetzli.cc's main() to detect the input format.
- Implement CLI flags for Butteraugli threshold tuning: expose butteraugli_comparator.cc's internal thresholds (search guetzli/processor.cc for hardcoded values) as command-line arguments.
- Write unit tests for DCT transforms: guetzli/dct_double.cc and guetzli/fdct.cc have no visible test coverage; add tests/dct_test.cc comparing against reference implementations.
⭐Top contributors
📝Recent commits
- 214f2bb — Merge pull request #260 from ruuda/bazel025 (jan-wassenberg)
- d5545ad — Ensure Java 1.8 is available on Darwin on CI (ruuda)
- fe2dd48 — Use Bazel 1.1.0 on CI (ruuda)
- 0a9e0d4 — Fix build for Bazel 0.25 and later (ruuda)
- 0b78c7c — Pik improvements backported to guetzli. High-frequency asymmetry for improved ringing detection. (jyrkialakuijala)
- cb5e4a8 — Improve lower quality behaviour. (jyrkialakuijala)
- 32eb462 — Update Butteraugli. (robryk)
- de70ac2 — Avoid more conflicting Java packages on Travis CI (ruuda)
- 2266ffa — Fix build on Travis CI (ruuda)
- f3e83a7 — Merge pull request #213 from szabadka/master (szabadka)
🔒Security observations
Guetzli is a relatively secure JPEG encoder with a focused, well-structured codebase. The main security concerns are common to C++ image processing libraries: potential buffer overflows and integer overflows in binary data handling, especially when processing untrusted input. The project shows good awareness of fuzzing (fuzz_target.cc present), but explicit bounds checking documentation is not visible. External dependency management could be improved with explicit version pinning. Overall security posture is good for a specialized image processing library, but input validation and safe integer arithmetic should be prioritized.
- Medium · Potential Buffer Overflow in JPEG Processing — guetzli/jpeg_data_decoder.cc, guetzli/jpeg_huffman_decode.cc, guetzli/jpeg_data_reader.cc. These files handle binary image data, and without visible bounds checking in the file structure there is risk of buffer overflow when processing malformed or malicious JPEG files. Fix: Implement comprehensive bounds checking and validation for all binary data parsing, use safe buffer operations, and fuzz with malformed inputs. The existing fuzz_target.cc suggests awareness of this issue; ensure all entry points are similarly protected.
- Medium · External Dependencies Without Version Pinning — external/png.BUILD, external/zlib.BUILD, BUILD, WORKSPACE. The build configuration references external dependencies (libpng, zlib) through BUILD files without explicit version specifications visible in the provided files, which could lead to unexpected behavior from dependency updates. Fix: Pin specific versions of all external dependencies in the WORKSPACE and BUILD files, document minimum version requirements, and perform regular security audits of dependencies.
- Low · Potential Integer Overflow in Image Processing — guetzli/preprocess_downsample.cc, guetzli/output_image.cc, guetzli/dct_double.cc. Image dimension calculations in these files may be vulnerable to integer overflow on very large image dimensions, potentially leading to heap corruption. Fix: Implement safe integer arithmetic for all dimension calculations, validate that width × height and related products do not overflow, and add maximum dimension limits.
- Low · Missing Input Validation in Main Entry Point — guetzli/guetzli.cc, guetzli/processor.cc. The guetzli.cc entry point likely accepts file paths and image data without comprehensive validation visible in the file structure, potentially allowing path traversal or resource exhaustion attacks. Fix: Implement strict input validation for all file paths, add resource limits for image dimensions and processing time, and validate all external input before processing.
- Low · Lack of Security Documentation — Repository root. While CONTRIBUTING.md exists, there is no visible SECURITY.md file or documented policy for reporting vulnerabilities. Fix: Create a SECURITY.md documenting the security policy, how to responsibly report vulnerabilities, and the project's approach to security updates.
LLM-derived; treat as a starting point, not a security audit.
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.