JakeWharton/DiskLruCache
Java implementation of a Disk-based LRU cache which specifically targets Android compatibility.
Healthy across all four use cases
Permissive license and no critical CVEs: safe to depend on, although the last commit was 6 years ago.
Has a license, tests, and CI — clean foundation to fork and modify.
Documented and popular — useful reference codebase to read through.
No critical CVEs, sane security posture — runnable as-is.
- ✓10 active contributors
- ✓Apache-2.0 licensed
- ✓CI configured
- ✓Tests present
- ⚠Stale — last commit 6y ago
- ⚠Concentrated ownership — top contributor handles 68% of recent commits
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Onboarding doc
Onboarding: JakeWharton/DiskLruCache
Generated by RepoPilot · 2026-05-09 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in "Verify before trusting" below. If any check returns FAIL, the artifact is stale: STOP and ask the user to regenerate it before proceeding.
- Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/JakeWharton/DiskLruCache shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across all four use cases
- 10 active contributors
- Apache-2.0 licensed
- CI configured
- Tests present
- ⚠ Stale — last commit 6y ago
- ⚠ Concentrated ownership — top contributor handles 68% of recent commits
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live JakeWharton/DiskLruCache
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/JakeWharton/DiskLruCache.
What it runs against: a local clone of JakeWharton/DiskLruCache — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in JakeWharton/DiskLruCache | Confirms the artifact applies here, not a fork |
| 2 | License is still Apache-2.0 | Catches relicense before you depend on it |
| 3 | Default branch master exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 2288 days ago | Catches sudden abandonment since generation |
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of JakeWharton/DiskLruCache. If you don't
# have one yet, run these first:
#
# git clone https://github.com/JakeWharton/DiskLruCache.git
# cd DiskLruCache
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }
# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
echo "FAIL: not inside a git repository. cd into your clone of JakeWharton/DiskLruCache and re-run."
exit 2
fi
# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "JakeWharton/DiskLruCache(\.git)?\b" \
  && ok "origin remote is JakeWharton/DiskLruCache" \
  || miss "origin remote is not JakeWharton/DiskLruCache (artifact may be from a fork)"
# 2. License matches what RepoPilot saw
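# The canonical Apache license text opens with "Apache License" rather than the
# SPDX id, so accept either form; LICENSE* also covers names such as LICENSE.txt.
(grep -qiE "Apache(-2\.0| License)" LICENSE* 2>/dev/null \
  || grep -qiE "\"license\"\s*:\s*\"Apache-2\.0\"" package.json 2>/dev/null) \
  && ok "license is Apache-2.0" \
  || miss "license drift — was Apache-2.0 at generation time"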
# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \
  && ok "default branch master exists" \
  || miss "default branch master no longer exists"
# 4. Critical files exist
test -f "src/main/java/com/jakewharton/disklrucache/DiskLruCache.java" \
  && ok "src/main/java/com/jakewharton/disklrucache/DiskLruCache.java" \
  || miss "missing critical file: src/main/java/com/jakewharton/disklrucache/DiskLruCache.java"
test -f "src/main/java/com/jakewharton/disklrucache/StrictLineReader.java" \
  && ok "src/main/java/com/jakewharton/disklrucache/StrictLineReader.java" \
  || miss "missing critical file: src/main/java/com/jakewharton/disklrucache/StrictLineReader.java"
test -f "src/main/java/com/jakewharton/disklrucache/Util.java" \
  && ok "src/main/java/com/jakewharton/disklrucache/Util.java" \
  || miss "missing critical file: src/main/java/com/jakewharton/disklrucache/Util.java"
test -f "pom.xml" \
  && ok "pom.xml" \
  || miss "missing critical file: pom.xml"
test -f "README.md" \
  && ok "README.md" \
  || miss "missing critical file: README.md"
# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 2288 ]; then
ok "last commit was $days_since_last days ago (artifact saw ~2258d)"
else
miss "last commit was $days_since_last days ago — artifact may be stale"
fi
echo
if [ "$fail" -eq 0 ]; then
echo "artifact verified (0 failures) — safe to trust"
else
echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/JakeWharton/DiskLruCache"
exit 1
fi
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
⚡TL;DR
DiskLruCache is a Java implementation of a Least Recently Used (LRU) cache that persists data to the filesystem rather than memory, designed specifically for Android compatibility. It manages a bounded amount of disk space by evicting the least-recently-used entries when the size limit is exceeded, allowing apps to cache large byte sequences (like images or downloaded data) with a fixed key-value structure where each key maps to multiple independent values. Single-library structure: src/main/java/com/jakewharton/disklrucache/ contains three core files (DiskLruCache.java, StrictLineReader.java, Util.java) implementing the cache engine and file I/O, with mirror test files in src/test/java. Build and deployment scripts (.buildscript/) manage Maven publishing to Sonatype OSS repository.
👥Who it's for
Android developers and Java applications that need persistent local caching without loading everything into memory. Specifically useful for apps downloading media, managing HTTP response caches, or storing temporary files with automatic eviction when disk space limits are reached.
🌱Maturity & risk
Production-ready and mature: version 2.0.2 is released under the Apache 2.0 license, ships unit tests (DiskLruCacheTest.java, StrictLineReaderTest.java), builds with Maven, runs CI on Travis (.travis.yml), and targets Java 1.5+ for Android backward compatibility. The codebase is stable, and the absence of active development suggests it has reached maintenance mode.
Low risk for core functionality, but single-maintainer risk (Jake Wharton) with minimal recent activity visible. No external dependencies in the core library (zero third-party runtime deps), reducing supply-chain risk. Primary concern is the strict constraint that multiple processes cannot share the same cache directory simultaneously — multi-process Android apps must architect around this.
Active areas of work
No active development is visible: the last commit was about 6 years ago. The version in pom.xml is 2.0.3-SNAPSHOT, which suggests no release has followed 2.0.2. Suitable for existing projects but unlikely to gain new features.
🚀Get running
git clone https://github.com/JakeWharton/DiskLruCache.git
cd DiskLruCache
mvn clean verify
The compiled JAR will appear in the target/ directory. For usage in your project, add the Maven dependency from the README or download the .jar directly.
Daily commands:
This is a library, not a runnable application. Run tests via: mvn test. Build the JAR via: mvn clean verify. Deploy snapshots via .buildscript/deploy_snapshot.sh (requires Sonatype credentials in .buildscript/settings.xml).
🗺️Map of the codebase
- src/main/java/com/jakewharton/disklrucache/DiskLruCache.java: Core cache implementation and entry point for all cache operations (get, edit, remove, eviction); every contributor must understand the LRU eviction logic and journal format
- src/main/java/com/jakewharton/disklrucache/StrictLineReader.java: Parsing utility for reading the cache journal file; critical for correctly deserializing cache state from disk
- src/main/java/com/jakewharton/disklrucache/Util.java: Shared utilities (MD5 hashing, stream copying) used throughout the cache; foundational for key validation and I/O operations
- pom.xml: Maven build configuration specifying JDK target, dependencies, and deployment; necessary to understand build environment and compatibility requirements
- README.md: Documents cache contract (key format, value constraints, concurrency model, soft limit behavior) essential before modifying core logic
- src/test/java/com/jakewharton/disklrucache/DiskLruCacheTest.java: Comprehensive test suite covering eviction, journaling, concurrency, and error conditions; reference for expected behavior and regression prevention
🛠️How to make changes
Add Custom Eviction Policy
- Review the current LRU eviction mechanism in DiskLruCache.java around the trimToSize() and executorService.execute() methods (src/main/java/com/jakewharton/disklrucache/DiskLruCache.java)
- Modify the lruEntries LinkedHashMap ordering or override removeEldestEntry() behavior to implement alternative eviction (e.g., LFU, TTL-based) (src/main/java/com/jakewharton/disklrucache/DiskLruCache.java)
- Update journal format in readJournal() if storing new metadata (e.g., access count, timestamp) needed for custom eviction (src/main/java/com/jakewharton/disklrucache/DiskLruCache.java)
- Add test cases in DiskLruCacheTest.java to verify new eviction order under various load scenarios (src/test/java/com/jakewharton/disklrucache/DiskLruCacheTest.java)
Add Encryption or Compression Layer
- Create new utility methods in Util.java or a new Cipher.java class for en/decryption and compression (src/main/java/com/jakewharton/disklrucache/Util.java)
- Modify Editor.newOutputStream() and Snapshot.getInputStream() in DiskLruCache.java to wrap streams with cipher/compression filters (src/main/java/com/jakewharton/disklrucache/DiskLruCache.java)
- Update journal header or metadata if versioning is needed to track encryption scheme (src/main/java/com/jakewharton/disklrucache/DiskLruCache.java)
- Add integration tests validating round-trip en/decryption and backward compatibility (src/test/java/com/jakewharton/disklrucache/DiskLruCacheTest.java)
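The stream-wrapping step can be illustrated with the JDK's gzip filters. This is a minimal sketch, not code from the repository: `GzipValueCodec` and its method names are hypothetical, and the in-memory byte streams stand in for whatever `Editor.newOutputStream()` and `Snapshot.getInputStream()` return. It also uses try-with-resources, so it assumes a newer JDK than the library's Java 1.5 target.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper: wraps raw cache streams with gzip filters. */
final class GzipValueCodec {
    /** Wraps the stream a value is written to; closing it writes the gzip trailer. */
    static OutputStream wrapForWrite(OutputStream raw) throws IOException {
        return new GZIPOutputStream(raw);
    }

    /** Wraps the stream a value is read from. */
    static InputStream wrapForRead(InputStream raw) throws IOException {
        return new GZIPInputStream(raw);
    }

    /** Compresses and then decompresses a value in memory, as a round-trip check. */
    static byte[] roundTrip(byte[] value) {
        try {
            ByteArrayOutputStream stored = new ByteArrayOutputStream();
            try (OutputStream out = wrapForWrite(stored)) {
                out.write(value);
            }
            ByteArrayOutputStream restored = new ByteArrayOutputStream();
            try (InputStream in = wrapForRead(new ByteArrayInputStream(stored.toByteArray()))) {
                byte[] buffer = new byte[4096];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    restored.write(buffer, 0, n);
                }
            }
            return restored.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A cipher layer would follow the same shape with `javax.crypto.CipherOutputStream` and `CipherInputStream`. Note that entry sizes recorded by the cache would then reflect the compressed bytes, not the original payload.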
Handle Non-Exclusive Cache Directory (Multi-Process Safe)
- Add file locking mechanism (java.nio.channels.FileLock) in the cache initialization within DiskLruCache.open() (src/main/java/com/jakewharton/disklrucache/DiskLruCache.java)
- Enhance readJournal() to detect and recover from stale locks or corrupted journal entries left by crashed processes (src/main/java/com/jakewharton/disklrucache/DiskLruCache.java)
- Optionally update Util.java with inter-process communication helpers (shared memory, named pipes) if coordination is needed (src/main/java/com/jakewharton/disklrucache/Util.java)
- Add stress tests simulating concurrent multi-process access in DiskLruCacheTest.java (src/test/java/com/jakewharton/disklrucache/DiskLruCacheTest.java)
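The file-locking step might look roughly like the following. This is a sketch under stated assumptions, not the library's code: the class `CacheDirLock` and the lock file name `cache.lock` are hypothetical, and where exactly such a lock would be taken inside `DiskLruCache.open()` is left open.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;

/** Hypothetical guard that claims a cache directory for one owner at a time. */
final class CacheDirLock implements AutoCloseable {
    private static final String LOCK_FILE_NAME = "cache.lock"; // assumed name

    private final RandomAccessFile file;
    private final FileLock lock;

    private CacheDirLock(RandomAccessFile file, FileLock lock) {
        this.file = file;
        this.lock = lock;
    }

    /** Returns a held lock, or null if another process or this JVM already holds it. */
    static CacheDirLock tryAcquire(File directory) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(new File(directory, LOCK_FILE_NAME), "rw");
        FileLock lock = null;
        try {
            lock = raf.getChannel().tryLock(); // null when another process holds it
        } catch (OverlappingFileLockException heldInThisJvm) {
            // Leave lock as null; the caller sees the directory as busy.
        } finally {
            if (lock == null) {
                raf.close();
            }
        }
        return lock == null ? null : new CacheDirLock(raf, lock);
    }

    @Override
    public void close() throws IOException {
        lock.release();
        file.close();
    }
}
```

Two caveats from the `FileLock` documentation apply: locks may be advisory on some platforms, and on some systems closing any channel on the file releases every lock the JVM holds on it. A real implementation would therefore open the lock file once per process and keep it open for the cache's lifetime.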
🔧Why these technologies
- Stream-based java.io file I/O (FileInputStream, FileOutputStream, buffered readers and writers): portable disk I/O on all platforms (Android, server JVM); consistent with the stream-based, non-memory-mapped I/O noted under Concepts to learn
- LinkedHashMap with eviction order — O(1) LRU tracking and eviction without separate data structures; built-in insertion/access order guarantees
- Single-threaded journal (non-WAL) — Simplifies consistency; journal entries are idempotent, enabling safe recovery after crashes
- ExecutorService for background eviction — Non-blocking size enforcement; allows trimming to proceed asynchronously without blocking client threads
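- Regex key validation ([a-z0-9_-]{1,120}): keys double as filesystem-safe file names, so the library validates them rather than hashing them; callers with arbitrary identifiers such as URLs must convert them first, for example by hashing (no security requirement, so a non-cryptographic use of MD5 is acceptable)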
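The LinkedHashMap bullet above can be made concrete with a small sketch. It is illustrative only: `LruSizeTracker` is a hypothetical class, it tracks entry lengths instead of files, and it trims synchronously on every `put`, whereas the source describes eviction running in the background on an ExecutorService.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of size-bounded LRU bookkeeping. */
final class LruSizeTracker {
    // accessOrder=true: get() and put() move the entry to the most-recent end.
    private final LinkedHashMap<String, Long> lengths =
            new LinkedHashMap<String, Long>(16, 0.75f, true);
    private final long maxSize;
    private long size;

    LruSizeTracker(long maxSize) {
        this.maxSize = maxSize;
    }

    /** Records (or replaces) an entry of the given length, then trims. */
    void put(String key, long length) {
        Long previous = lengths.put(key, length);
        if (previous != null) {
            size -= previous;
        }
        size += length;
        trimToSize();
    }

    /** Marks the key as recently used; returns false if it is absent. */
    boolean touch(String key) {
        return lengths.get(key) != null;
    }

    /** containsKey() does not affect access order. */
    boolean contains(String key) {
        return lengths.containsKey(key);
    }

    long size() {
        return size;
    }

    /** Removes least-recently-used entries until the total fits the limit. */
    private void trimToSize() {
        Iterator<Map.Entry<String, Long>> it = lengths.entrySet().iterator();
        while (size > maxSize && it.hasNext()) {
            Map.Entry<String, Long> eldest = it.next();
            size -= eldest.getValue();
            it.remove();
        }
    }
}
```

With a limit of 10, putting `a` and `b` (4 each), touching `a`, and then putting `c` (4) evicts `b`, because `a` was used more recently.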
⚖️Trade-offs already made
- Soft size limit (not strict enforcement)
  - Why: Allows cache to temporarily exceed limit while waiting for background eviction to catch up
  - Consequence: Client should set conservative limits; actual disk usage may spike 10–30% above target until eviction completes
- Single exclusive cache directory (no multi-process sharing)
  - Why: Avoids locking overhead and race conditions on journal writes; simplifies consistency model
  - Consequence: Cannot safely share cache between multiple processes; requires OS-level isolation (separate directories per process)
- Synchronous journal writes on edit/remove
  - Why: Ensures durability guarantees; entry is only visible after journal update
  - Consequence: Write latency is O(journal size); frequent edits may bottleneck on I/O; background rebuild mitigates on reopens
- Fixed number of values per key (not flexible)
  - Why: Simplifies naming scheme and index recovery; matches common caching patterns (thumbnail, metadata, etc.)
  - Consequence: Cannot store variable-length value arrays; must define max values upfront during cache creation
🚫Non-goals (don't propose these)
- Does not handle authentication or encryption (delegated to application layer)
- Does not provide real-time consistency across multiple processes (single-process only)
- Does not implement distributed or network-based caching (local filesystem only)
- Does not offer in-memory acceleration (strictly disk-backed; data always read from storage)
- Does not provide transaction semantics or ACID guarantees beyond crash recovery
🪤Traps & gotchas
The cache directory must be exclusive to a single process; concurrent multi-process access is unsupported. Keys must match the regex [a-z0-9_-]{1,120}; an invalid key throws IllegalArgumentException. The size limit is soft: the cache can temporarily exceed it until background eviction runs. I/O errors while writing a value are swallowed rather than thrown, and the edit then fails without an exception, so callers that need certainty should re-read the entry after committing. The Java 1.5 target rules out later language and library features (for example try-with-resources, the diamond operator, and lambdas); watch for compatibility when modifying.
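One way to stay within the key regex is to validate keys up front and hash arbitrary identifiers into a conforming form. The sketch below is an assumption-laden illustration, not part of the library: `CacheKeys`, `isValid`, and `fromArbitrary` are hypothetical names, MD5 is used only as a non-cryptographic fingerprint, and `StandardCharsets` requires Java 7 or later.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.regex.Pattern;

/** Hypothetical caller-side helper for producing legal cache keys. */
final class CacheKeys {
    // Same pattern the text above quotes for legal keys.
    private static final Pattern LEGAL_KEY = Pattern.compile("[a-z0-9_-]{1,120}");

    /** True if the key could be passed to the cache without being rejected. */
    static boolean isValid(String key) {
        return LEGAL_KEY.matcher(key).matches();
    }

    /** Maps any string (for example a URL) to a 32-character lowercase hex key. */
    static String fromArbitrary(String raw) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(raw.getBytes(StandardCharsets.UTF_8));
            // %032x keeps leading zeros so the key length is always 32.
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }
}
```

For example, `CacheKeys.isValid("http://example.com/a.png")` is false because of the uppercase-free but punctuation-heavy URL characters, while `CacheKeys.fromArbitrary(...)` of the same string yields a key that passes validation.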
🏗️Architecture
💡Concepts to learn
- LRU (Least Recently Used) Eviction — Core algorithm of this library — entries are removed based on access recency to stay within size limits; understanding LRU is essential to grasp why edit/get operations modify internal access tracking
- Write-Ahead Logging (Journal) — DiskLruCache uses a journal file to track all cache operations atomically; understanding WAL is critical to understanding crash recovery and why the journal must be fsync'd
- Atomic Transactions — Editor.commit() is atomic — readers see either the old or new entry values, never a mix; this requires understanding transaction semantics and temporary file swaps
- Memory-Mapped I/O vs. Stream I/O — DiskLruCache uses stream-based I/O (not memory-mapped) for Android compatibility; understanding the tradeoff between random-access performance and portability informs modifications
- Key Validation and Normalization — DiskLruCache accepts only keys matching [a-z0-9_-]{1,120} and uses them in file names; understanding why keys are constrained (avoiding filesystem special characters, length limits) and that arbitrary identifiers must be normalized by the caller, often by hashing, is essential when debugging key lookup failures
- Snapshot Isolation — A get() call creates a snapshot that doesn't see concurrent updates/removals; this requires reference counting and understanding how readers must close() snapshots to release locks
- Exclusive Locking — Only one Editor per key is allowed at a time; edit() returns null if the key is already being edited; understanding why prevents deadlocks and lost updates in single-threaded Android apps
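The temporary-file swap mentioned under Atomic Transactions can be sketched with the JDK alone. This is not the library's commit path: `AtomicFileWrite` and the `.tmp` suffix are hypothetical, and the sketch uses `java.nio.file.Files.move`, which requires Java 7 or later.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

/** Hypothetical sketch: publish a value by writing a temp file and renaming it. */
final class AtomicFileWrite {
    static void commit(File target, byte[] contents) throws IOException {
        File tmp = new File(target.getPath() + ".tmp"); // assumed naming scheme
        try (FileOutputStream out = new FileOutputStream(tmp)) {
            out.write(contents);
            out.getFD().sync(); // push the bytes to the device before publishing
        }
        try {
            // Readers see either the old file or the new one, never a partial write.
            // Whether an existing target is replaced is implementation-specific
            // according to the Files.move documentation.
            Files.move(tmp.toPath(), target.toPath(), StandardCopyOption.ATOMIC_MOVE);
        } finally {
            tmp.delete(); // no-op after a successful move; cleans up after a failure
        }
    }
}
```

The design choice is that the rename is the single publishing step, so a crash before it leaves only a stray temporary file and the previous value intact.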
🔗Related repos
- square/okhttp: OkHttp HTTP client uses DiskLruCache (or similar disk cache) for HTTP response caching; understanding the relationship helps optimize cache eviction with HTTP semantics
- google/guava: Guava's CacheBuilder provides in-memory LRU caching with similar eviction semantics; understanding both in-memory and disk-based approaches informs cache architecture choices
- JakeWharton/butterknife: Another foundational Android library by the same author; useful for understanding Jake's coding style and Android compatibility patterns used across his projects
- android/platform_frameworks_base: The original Android source where parts of DiskLruCache were derived from; reading the original context helps understand design decisions and Android-specific constraints
- realm/realm-java: Alternative disk-based persistence solution for Android; comparing approaches helps understand when DiskLruCache is the right choice vs. a database
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add comprehensive edge-case tests for DiskLruCache.java
DiskLruCacheTest.java exists but lacks coverage for critical edge cases in a production disk cache: concurrent access patterns, corrupted journal recovery, filesystem I/O failures, and boundary conditions (zero-length values, max key size violations, disk full scenarios). These are high-risk areas in a cache implementation that could cause silent data loss or crashes on Android devices.
- [ ] Add test cases in src/test/java/com/jakewharton/disklrucache/DiskLruCacheTest.java for concurrent edit/read operations
- [ ] Add test cases for corrupted journal file recovery and repair logic
- [ ] Add test cases for IOException handling during write operations (simulating disk full)
- [ ] Add boundary validation tests (key regex compliance, Integer.MAX_VALUE value sizes, empty cache)
- [ ] Verify test coverage reaches >90% for DiskLruCache.java using a coverage tool
Add StrictLineReaderTest.java test coverage for line reading edge cases
StrictLineReader.java is a critical utility for parsing the cache journal, but StrictLineReaderTest.java likely has incomplete coverage. This class must handle various line ending formats (\r\n, \n, \r), empty lines, extremely long lines, and EOF conditions without crashing or causing infinite loops. Journal parsing failures can corrupt the entire cache.
- [ ] Add test cases in src/test/java/com/jakewharton/disklrucache/StrictLineReaderTest.java for mixed line endings (\r\n, \n, \r only)
- [ ] Add test cases for lines at/exceeding buffer boundaries
- [ ] Add test cases for EOF without newline and empty reader streams
- [ ] Add test cases for very long lines (multi-KB) to verify no stack overflow
- [ ] Add test cases for charset handling edge cases if applicable
Migrate build system from Maven to Gradle with Android-specific tooling
The repo targets Android compatibility (per description) but uses Maven (pom.xml) with Java 1.5 as baseline. Android tooling has evolved significantly; contributors and maintainers expect Gradle with modern Android plugin support. This enables easier testing on actual Android devices/emulators and aligns with current Android development practices. Current .travis.yml indicates CI is in place but Gradle would modernize the build.
- [ ] Create build.gradle file with Android-compatible dependencies (JUnit 4.10, commons-io 2.1, Fest 2.0M10)
- [ ] Add gradle/wrapper/ directory with gradlew scripts for reproducible builds
- [ ] Create settings.gradle for project structure
- [ ] Update .travis.yml to use ./gradlew instead of maven commands
- [ ] Verify Maven pom.xml can be removed after successful Gradle migration
🌿Good first issues
- Add test coverage for edge case where cache is corrupted mid-journal write and recovery is needed; currently DiskLruCacheTest.java tests some corruption but not all recovery paths.
- Document the exact binary format of the cache journal file in code comments (magic number, version, entry format); currently only specified in README and requires reverse-engineering from DiskLruCache.java
- Create a StrictLineReader performance test comparing to BufferedReader on various line lengths and file sizes to validate the custom implementation choice for Android.
⭐Top contributors
- @JakeWharton — 68 commits
- @swankjesse — 17 commits
- Marcelo Cortes — 3 commits
- @jonasfa — 3 commits
- @Wavesonics — 2 commits
📝Recent commits
- 3e01635: Merge pull request #88 from divankov/patch-2 (JakeWharton)
- 057db12: Update keys length in regex in README.md (divankov)
- 3aa6286: Merge pull request #77 from JakeWharton/jw/auto-deploy (JakeWharton)
- 5f0ef1e: Auto-deploy snapshots from Travis CI to Sonatype. (JakeWharton)
- 7a1ecbd: Merge pull request #74 from JakeWharton/jwilson_1023_rebuild (JakeWharton)
- 5b45d10: Don't append to a truncated line in the journal. (swankjesse)
- c2965c0: Merge pull request #69 from JakeWharton/jw/fix-test-non-determinism (JakeWharton)
- 09d1140: Correct non-deterministic test by querying the queue directly. (JakeWharton)
- dc9e37f: Merge branch 'acherkashyn-master' (JakeWharton)
- 78b3f9f: Increase key length limit to 120. (acherkashyn)
🔒Security observations
The DiskLruCache project has a generally acceptable security posture, but suffers from outdated dependencies dating back to 2011-2012. The primary concerns are: (1) outdated test and utility dependencies (JUnit 4.10, Commons IO 2.1, FEST 2.0M10) that lack modern security patches, (2) Java 1.5 target version which is intentional for Android compatibility but limits modern security features, and (3) potential exposure of build credentials in version control. The core library itself appears to have no obvious injection vulnerabilities, hardcoded secrets, or critical misconfigurations. Recommendation: Update all dependencies to current versions, migrate to Java 7+ for new deployments where possible, and audit the build credential management.
- Medium · Outdated Dependency: JUnit 4.10 (pom.xml, junit.version=4.10). The project uses JUnit 4.10 (released 2012), which is significantly outdated. Modern versions contain security patches and bug fixes. While JUnit is primarily a test dependency, vulnerabilities in test frameworks can still pose risks in development environments. Fix: Update to the latest stable JUnit 4.x version (currently 4.13.2 or later). Consider migrating to JUnit 5 for better security and feature support.
- Medium · Outdated Dependency: Commons IO 2.1 (pom.xml, commons-io.version=2.1). Commons IO 2.1 is from 2011 and predates many security improvements. This library handles file I/O operations and has had security-related updates in later versions. Given that DiskLruCache is file-system based, this is particularly relevant. Fix: Update to Commons IO 2.11.0 or later, which includes important security and reliability fixes for file operations.
- Low · Outdated Dependency: Fest Assertions 2.0M10 (pom.xml, fest.version=2.0M10). FEST Assertions 2.0M10 is a milestone release from 2012. The project has since been superseded by AssertJ. Using outdated assertion libraries in test dependencies has minimal security impact but indicates aging codebase maintenance. Fix: Migrate to AssertJ (the successor to FEST) for better maintenance, security patches, and improved functionality.
- Low · Java 1.5 Target Version (pom.xml, java.version=1.5). The project targets Java 1.5 (released 2004), which is extremely outdated and lacks modern security features, cryptographic improvements, and performance optimizations. While this is intentional for Android compatibility, it limits security capabilities. Fix: For new deployments, target Java 7 or higher. If Android compatibility is required, document this explicitly and consider conditional compilation or separate builds for modern platforms.
- Low · Incomplete POM XML (pom.xml, line with the sourceEncoding property). The provided pom.xml appears truncated with an unclosed tag: <project.build.sourceEncoding>UTF-8< (missing closing >). This could indicate file corruption or parsing issues. Fix: Verify the pom.xml file is properly formed and complete. Ensure all XML tags are properly closed.
- Low · Credentials in Build Scripts (.buildscript/settings.xml, .buildscript/deploy_snapshot.sh). The presence of .buildscript/settings.xml and deploy_snapshot.sh suggests Maven deployment credentials may be stored. If these files are checked into version control, credentials could be exposed. Fix: Ensure .buildscript/settings.xml is in .gitignore and never committed. Use environment variables or Maven's ~/.m2/settings.xml for credentials. Verify current repository history for any exposed credentials.
LLM-derived; treat as a starting point, not a security audit.
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.