openzipkin/zipkin
Zipkin is a distributed tracing system
Healthy across the board
- Permissive license, no critical CVEs, actively maintained — safe to depend on.
- Has a license, tests, and CI — a clean foundation to fork and modify.
- Documented and popular — a useful reference codebase to read through.
- No critical CVEs, sane security posture — runnable as-is.
- ✓ Last commit 4w ago
- ✓ 11 active contributors
- ✓ Distributed ownership (top contributor 42% of recent commits)
- ✓ Apache-2.0 licensed
- ✓ CI configured
- ✓ Tests present
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Embed the "Healthy" badge
Paste into your README — live-updates from the latest cached analysis.
[RepoPilot health page](https://repopilot.app/r/openzipkin/zipkin) — paste the badge snippet at the top of your README.md; it renders inline like a shields.io badge.
Social card preview (1200×630): this card auto-renders when someone shares https://repopilot.app/r/openzipkin/zipkin on X, Slack, or LinkedIn.
Onboarding doc
Onboarding: openzipkin/zipkin
Generated by RepoPilot · 2026-05-09 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in "Verify before trusting" below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
- Treat the "AI · unverified" sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/openzipkin/zipkin shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across the board
- Last commit 4w ago
- 11 active contributors
- Distributed ownership (top contributor 42% of recent commits)
- Apache-2.0 licensed
- CI configured
- Tests present
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live openzipkin/zipkin
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/openzipkin/zipkin.
What it runs against: a local clone of openzipkin/zipkin — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in openzipkin/zipkin | Confirms the artifact applies here, not a fork |
| 2 | License is still Apache-2.0 | Catches relicense before you depend on it |
| 3 | Default branch master exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 60 days ago | Catches sudden abandonment since generation |
```bash
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of openzipkin/zipkin. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/openzipkin/zipkin.git
#   cd zipkin
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok()   { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of openzipkin/zipkin and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "openzipkin/zipkin(\.git)?$" \
  && ok "origin remote is openzipkin/zipkin" \
  || miss "origin remote is not openzipkin/zipkin (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
# (The stock Apache-2.0 LICENSE text begins "Apache License", not the SPDX
#  identifier "Apache-2.0", so grep for the license name.)
(grep -qiE "Apache License" LICENSE 2>/dev/null \
  || grep -qiE "\"license\"\s*:\s*\"Apache-2\.0\"" package.json 2>/dev/null) \
  && ok "license is Apache-2.0" \
  || miss "license drift — was Apache-2.0 at generation time"

# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \
  && ok "default branch master exists" \
  || miss "default branch master no longer exists"

# 4. Critical files exist
for f in pom.xml README.md benchmarks/pom.xml .github/workflows/test.yml RELEASE.md; do
  test -f "$f" && ok "$f" || miss "missing critical file: $f"
done

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 60 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~30d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/openzipkin/zipkin"
  exit 1
fi
```
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
⚡TL;DR
Zipkin is a distributed tracing system that collects, stores, and visualizes timing data across microservice architectures to diagnose latency issues. It provides a web UI for querying traces by service, operation, tags, and duration, plus a dependency graph showing request flows between services. The project includes a standalone server (zipkin-server), multiple storage backends (Cassandra, Elasticsearch, in-memory), and collectors for ingestion via HTTP, Kafka, gRPC, and other protocols. Multi-module Maven monorepo: zipkin-server/ contains the main server executable, benchmarks/ holds JMH performance tests, .github/workflows/ defines CI/CD automation. Core tracing model in zipkin2 package with codecs for JSON/Protobuf (benchmarks/src/test/java/zipkin2/codec/). Storage backends pluggable via collector modules, UI served as static assets from zipkin-server.
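The zipkin2 span model those codecs serialize has a small, well-known JSON shape. The sketch below builds a minimal v2 span the way the HTTP collector accepts it (field names follow the public Zipkin v2 JSON format; the IDs and timings are invented example values):

```python
import json

# Minimal Zipkin v2 span, as accepted by POST /api/v2/spans.
# Field names follow the public v2 JSON format; the IDs and timing
# values below are made-up examples.
span = {
    "traceId": "86154a4ba6e91385",       # 16 or 32 lowercase hex chars
    "id": "4d1e00c0db9010db",            # span ID, unique within the trace
    "name": "get /api",                  # lowercased operation name
    "timestamp": 1_700_000_000_000_000,  # epoch MICROseconds, not millis
    "duration": 120_000,                 # microseconds (here: 120 ms)
    "localEndpoint": {"serviceName": "frontend"},
    "tags": {"http.method": "GET", "http.path": "/api"},
}

# The collector endpoint takes a JSON *list* of spans, not a single object.
payload = json.dumps([span])
print(payload[:60])
```

Note the microsecond units — sending epoch milliseconds is a classic integration bug that makes traces appear in 1970.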
👥Who it's for
DevOps engineers and SREs operating microservice architectures who need visibility into distributed request flows and latency bottlenecks; platform teams instrumenting applications with tracing libraries; organizations requiring persistent trace storage and querying across services.
🌱Maturity & risk
Highly mature production system: v3.6.2 released, extensive CI/CD via GitHub Actions (test.yml, deploy.yml, docker_push.yml), comprehensive test coverage including benchmarks (benchmarks/ with JMH), Docker support, and Maven Central distribution. Actively maintained with structured release process (RELEASE.md), security guidelines (SECURITY.md), and contribution templates.
Low risk: stable API in the zipkin2 package, multi-year track record, but depends on multiple backend options (Cassandra, Elasticsearch) which add operational complexity. Monorepo structure with interdependent modules means breaking changes require coordination across zipkin-server, collectors, and storage implementations. Requires JRE 17+, which may constrain deployment in legacy environments.
Active areas of work
Active development at v3.6.2-SNAPSHOT: GitHub Actions workflows actively testing and deploying (test.yml, docker_push.yml). Security scanning enabled (security.yml). Maintenance includes Docker image publishing, Maven Central releases, and README examples kept current. Benchmarking infrastructure in place (SpanBenchmarks.java, CodecBenchmarks.java) for performance regression detection.
🚀Get running
Clone: `git clone https://github.com/openzipkin/zipkin.git && cd zipkin`. Build: `./mvnw clean install` (Maven wrapper included in `.mvn/`). Run the quick-start server: `curl -sSL https://zipkin.io/quickstart.sh | bash -s && java -jar zipkin.jar`, or Docker: `docker run -d -p 9411:9411 openzipkin/zipkin`. Access the UI at http://localhost:9411/zipkin.
Daily commands:
Dev build: `./mvnw clean package -DskipTests` builds all modules. Start server: `java -jar zipkin-server/target/zipkin-server-*.jar`. Run benchmarks: `./mvnw -pl benchmarks clean package exec:exec@jmh`. Tests: `./mvnw clean verify`.
🗺️Map of the codebase
- pom.xml — Root Maven configuration defining the multi-module build structure, dependencies, and version management for the entire Zipkin distributed tracing system
- README.md — Essential overview of Zipkin's purpose, features, and getting-started instructions that orients all contributors to the distributed tracing problem space
- benchmarks/pom.xml — JMH benchmark module configuration, critical for understanding the performance testing infrastructure and Java 17+ compilation requirements
- .github/workflows/test.yml — Continuous integration test pipeline that enforces code quality standards and validates all pull requests before merge
- RELEASE.md — Release process and versioning strategy that contributors must follow for publishing artifacts to Maven Central
- docker/Dockerfile — Docker image configuration that packages the Zipkin server, essential for deployment and local development workflows
- .github/CONTRIBUTING.md — Contributor guidelines covering code style, testing expectations, and the submission process that every developer must read first
🛠️How to make changes
Add a new codec format (e.g., Avro or MessagePack)
- Create a new codec implementation in benchmarks/src/test/java/zipkin2/codec/ following the pattern of MoshiSpanDecoder or ProtobufSpanDecoder (benchmarks/src/test/java/zipkin2/codec/YourCodecDecoder.java)
- Add a benchmark class extending the existing codec benchmark patterns to measure encoding/decoding throughput (benchmarks/src/test/java/zipkin2/codec/YourCodecBenchmarks.java)
- Update CodecBenchmarks.java to include your new codec in the multi-codec comparison (benchmarks/src/test/java/zipkin2/codec/CodecBenchmarks.java)
- Add an interoperability test if the format has compatibility concerns, similar to Proto3CodecInteropTest (benchmarks/src/test/java/zipkin2/internal/YourCodecInteropTest.java)
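The core property an interoperability test like Proto3CodecInteropTest enforces is lossless round-tripping: decoding what you encoded yields the original span list. Here is a language-agnostic sketch of that property — not Zipkin's Java codec API; `encode`/`decode` are stand-ins for your codec pair:

```python
import json

# Stand-ins for a codec pair (e.g. your new Avro or MessagePack codec).
# JSON is used here only so the sketch is self-contained.

def encode(spans):
    """Serialize a span list to bytes (stand-in for YourCodec's encoder)."""
    return json.dumps(spans, sort_keys=True).encode("utf-8")

def decode(data):
    """Parse bytes back into a span list (stand-in for YourCodecDecoder)."""
    return json.loads(data.decode("utf-8"))

spans = [
    {"traceId": "a" * 16, "id": "b" * 16, "name": "get", "duration": 100},
    {"traceId": "a" * 16, "id": "c" * 16, "name": "query", "duration": 50},
]

# The invariant every new codec must preserve:
assert decode(encode(spans)) == spans
print("round-trip ok")
```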
Add a new storage backend integration (e.g., DynamoDB, MongoDB)
- Create a benchmark module in benchmarks/src/test/java/zipkin2/storage/ following the BulkRequestBenchmarks pattern for your backend (benchmarks/src/test/java/zipkin2/storage/YourStorageBenchmarks.java)
- Implement batch-write performance tests measuring throughput and p99 latency (benchmarks/src/test/java/zipkin2/storage/internal/BatchWriteBenchmarks.java)
- Add an integration test with a docker-compose example, similar to the existing Elasticsearch/Cassandra examples (docker/examples/docker-compose-yourbackend.yml)
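The reason a new backend slots in this way is that every storage implementation sits behind one narrow interface, so the server and the benchmarks can treat Cassandra, Elasticsearch, or your backend interchangeably. A hypothetical sketch of that shape — the names here are invented, not Zipkin's actual Java StorageComponent/SpanStore API:

```python
from abc import ABC, abstractmethod
from collections import defaultdict

class SpanStorage(ABC):
    """Hypothetical minimal storage contract: one write path, one read path."""

    @abstractmethod
    def accept(self, spans):
        """Batch-write spans (the path BulkRequestBenchmarks-style tests stress)."""

    @abstractmethod
    def get_trace(self, trace_id):
        """Return all spans recorded for one trace ID (the query path)."""

class InMemoryStorage(SpanStorage):
    """Reference implementation, analogous to Zipkin's in-memory backend."""

    def __init__(self):
        self._by_trace = defaultdict(list)

    def accept(self, spans):
        for s in spans:
            self._by_trace[s["traceId"]].append(s)

    def get_trace(self, trace_id):
        return self._by_trace[trace_id]

# Callers only see the interface — swap in YourStorage without touching them.
storage: SpanStorage = InMemoryStorage()
storage.accept([{"traceId": "abc", "id": "1"}, {"traceId": "abc", "id": "2"}])
print(len(storage.get_trace("abc")))  # 2
```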
Add a new performance optimization or feature
- Create a corresponding JMH benchmark in benchmarks/src/test/java/zipkin2/ to measure the performance impact (compare against the SpanBenchmarks or ServerIntegratedBenchmark patterns) (benchmarks/src/test/java/zipkin2/YourFeatureBenchmarks.java)
- Update benchmarks/pom.xml if new dependencies are required for the benchmark (benchmarks/pom.xml)
- Add benchmark resource files (test data, configuration) to benchmarks/src/test/resources/ (benchmarks/src/test/resources/your-feature-data.json)
- Document the benchmark rationale in build-bin/README.md or add performance-analysis comments (build-bin/README.md)
Modify build or CI/CD pipeline
- Update the test workflow, which is the primary validation gate (.github/workflows/test.yml)
- Update the root pom.xml for dependency or plugin changes affecting all modules (pom.xml)
- Update build-bin/test or build-bin/maven/maven_build if build steps change (build-bin/test)
- Update the Docker build configuration if deployment artifacts change (docker/Dockerfile)
🔧Why these technologies
- Maven + Java 17 — Standard JVM build tool with Java 17 LTS for modern language features while maintaining broad compatibility; enables multi-module architecture for codec, storage, and server variants
- JMH (Java Microbenchmark Harness) — Industry-standard framework for accurate performance measurement of serialization, throughput, and latency-sensitive operations in distributed tracing pipeline
- Protocol Buffers + JSON + binary codecs — Multiple serialization formats allow operators to trade off bandwidth (Proto3), human readability (JSON), and wire format compatibility across language ecosystems
- Docker + docker-compose — Containerized deployment model enables consistent packaging across environments and multi-service composition for testing storage backends (Elasticsearch, Cassandra, MySQL)
- GitHub Actions CI/CD — Native integration with repository enables automated testing, linting, security scanning, and Docker image publication on every commit
⚖️Trade-offs already made
- Pluggable storage backend architecture (Elasticsearch vs Cassandra vs MySQL)
  - Why: Different operators have different scale/consistency/operational requirements; allowing choice avoids forcing one backend on all users
  - Consequence: Increased testing burden (every backend must be benchmarked), potential codec incompatibilities between backends, and added complexity in the storage abstraction layer
- Multiple codec formats (Proto3, JSON, Moshi, Wire) with performance benchmarks
  - Why: Operators need flexibility to trade off bandwidth-constrained networks (Proto), debugging transparency (JSON), and language interop (Wire)
  - Consequence: Codec maintenance overhead, potential version-skew bugs, and a need for interoperability testing (Proto3CodecInteropTest)
- Separate benchmarks module rather than inline microbenchmarks
  - Why: Isolates performance tests from the main codebase, allows independent compilation, and keeps benchmark overhead out of production artifacts
  - Consequence: Benchmark code lives apart from the implementation, so it takes discipline to keep benchmarks in sync with API changes
- Request throttling at the server level (ThrottledCall pattern)
  - Why: Protects storage backends from overwhelming load spikes and prevents cascade failures in a distributed system
  - Consequence: Under sustained overload, requests are delayed or rejected rather than stored, and throttle limits become one more parameter operators must tune
🪤Traps & gotchas
- JRE 17+ is a hard requirement (not 11 or 8).
- Use the bundled Maven wrapper (./mvnw) — don't use system Maven.
- The benchmarks module is skipped when -DskipTests is set, which keeps it out of deployment.
- Elasticsearch and Cassandra dependencies are managed separately in dependencyManagement blocks to override transitive versions.
- Proto files are unpacked to ${project.build.directory}/main/proto during the build.
- Docker builds depend on .dockerignore for layer optimization.
🏗️Architecture
💡Concepts to learn
- Distributed Tracing — Fundamental concept Zipkin implements; understanding trace IDs, spans, and parent-child relationships is essential to using and extending this system
- Span Codecs (JSON, Protobuf, Wire) — Zipkin supports multiple serialization formats for spans; benchmarks/ heavily tests codec performance, critical for choosing ingestion format and storage efficiency
- JMH (Java Microbenchmarking Harness) — Entire benchmarks/ module uses JMH to measure codec and collector performance; understanding JMH output and methodology is required for performance work
- Pluggable Storage Backend Pattern — Zipkin's architecture separates span storage (Cassandra, Elasticsearch, in-memory) from collection logic; new storage backends follow this abstraction in collector modules
- Protocol Buffers (Proto3) — Zipkin uses protobuf for efficient span serialization; benchmarks/src/test/java/zipkin2/internal/Proto3CodecInteropTest.java validates proto interop
- Dependency Graph Analysis — Zipkin derives service dependency graphs from trace data to visualize architecture; core feature shown in web UI for identifying service interactions
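The dependency-graph concept follows directly from the span model: each span records its parent, and a cross-service parent→child hop is an edge. A simplified sketch of that derivation — illustrative only, not Zipkin's implementation, which also handles shared spans, CLIENT/SERVER kinds, and streaming aggregation:

```python
from collections import Counter

# Toy trace: one frontend request fanning out to an orders service,
# which in turn queries MySQL. Field names mirror the v2 span model.
spans = [
    {"id": "1", "parentId": None, "service": "frontend"},
    {"id": "2", "parentId": "1",  "service": "orders"},
    {"id": "3", "parentId": "2",  "service": "mysql"},
    {"id": "4", "parentId": "1",  "service": "orders"},
]

by_id = {s["id"]: s for s in spans}
edges = Counter()
for s in spans:
    parent = by_id.get(s["parentId"])
    # An edge exists wherever a parent and child span belong to
    # different services — that is the service dependency link.
    if parent and parent["service"] != s["service"]:
        edges[(parent["service"], s["service"])] += 1

print(dict(edges))  # {('frontend', 'orders'): 2, ('orders', 'mysql'): 1}
```

Aggregating these edge counts over many traces yields the dependency graph the web UI renders.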
🔗Related repos
- jaegertracing/jaeger — Alternative CNCF distributed tracing system with similar goals but a different architecture; Uber's production tracer
- open-telemetry/opentelemetry-java — Standardized instrumentation library that Zipkin integrates with; provides the client-side tracing SDKs that feed data to Zipkin
- openzipkin/zipkin-go — Zipkin tracer implementation for Go services; required for instrumenting Go microservices in a Zipkin-based infrastructure
- openzipkin/zipkin-js — JavaScript/Node.js tracer for browser and Node applications reporting to Zipkin; enables front-end tracing
- elastic/elasticsearch — One of two primary storage backends for Zipkin traces; the most common choice for large-scale deployments
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add comprehensive unit tests for benchmarks module (benchmarks/src/test/java/zipkin2)
The benchmarks directory contains JMH benchmark classes (EndpointBenchmarks.java, SpanBenchmarks.java, CodecBenchmarks.java, etc.) but lacks corresponding *Test.java files for most benchmarks. Only JacksonSpanDecoderTest.java and MoshiSpanDecoderTest.java exist. Adding unit tests would validate benchmark setup, ensure benchmark code correctness, and catch regressions early before they affect performance measurements. This is critical for a performance-sensitive distributed tracing system.
- [ ] Create EndpointBenchmarksTest.java to test endpoint object creation scenarios
- [ ] Create SpanBenchmarksTest.java to validate span serialization/deserialization paths
- [ ] Create ProtoCodecBenchmarksTest.java to test protobuf encoding/decoding
- [ ] Create MetricsBenchmarksTest.java for collector metrics benchmarks
- [ ] Create WriteBufferBenchmarksTest.java and ReadBufferBenchmarksTest.java for buffer operations
Implement GitHub Actions workflow for benchmark regression detection (build-bin/docker-compose config integration)
The repo has test.yml and deploy.yml workflows but no automated benchmark regression detection workflow. Given the benchmarks/ module with JMH setup and docker-compose files in build-bin/, adding a workflow that runs benchmarks on PRs and compares against baseline would prevent performance regressions from being merged. This is critical for maintaining the performance guarantees of a distributed tracing system.
- [ ] Create .github/workflows/benchmark.yml that runs `mvn clean verify -f benchmarks/pom.xml` with JMH
- [ ] Configure the workflow to compare results against the main branch using jmh-results-visualizer or similar
- [ ] Add artifact upload for benchmark results to enable historical tracking
- [ ] Document benchmark expectations and thresholds in benchmarks/README.md
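A minimal sketch of what such a workflow could look like. This file does not exist in the repo; the trigger paths, Java version, and artifact path are assumptions to be adjusted — only the `./mvnw -pl benchmarks` invocation comes from this document's daily commands:

```yaml
# .github/workflows/benchmark.yml — hypothetical sketch, not an existing file.
name: benchmark
on:
  pull_request:
    paths: ["benchmarks/**", "pom.xml"]   # assumed trigger scope
jobs:
  jmh:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-java@v4
        with: {distribution: temurin, java-version: "17"}
      - name: Run JMH benchmarks
        run: ./mvnw -pl benchmarks clean package exec:exec@jmh
      - name: Upload results for historical tracking
        uses: actions/upload-artifact@v4
        with: {name: jmh-results, path: benchmarks/target/**}  # assumed path
```

Baseline comparison against main would bolt on as a further step once a results format is settled.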
Add integration tests for codec implementations (benchmarks/src/test/java/zipkin2/codec/)
The codec directory has multiple decoder implementations (JacksonSpanDecoder, MoshiSpanDecoder, ProtobufSpanDecoder, WireSpanDecoder) but only JacksonSpanDecoderTest.java and MoshiSpanDecoderTest.java have tests. ProtobufSpanDecoder and WireSpanDecoder lack dedicated test files. Given that codecs are critical for correctness in a tracing system, adding comprehensive tests for all decoders would ensure format compatibility, edge case handling, and prevent regressions.
- [ ] Create ProtobufSpanDecoderTest.java with tests for proto message deserialization
- [ ] Create WireSpanDecoderTest.java with tests for Wire format parsing
- [ ] Add tests for malformed input handling and error cases in all codec tests
- [ ] Create CodecBenchmarksTest.java to validate codec benchmark setup and fairness
- [ ] Add test data files (zipkin2-chinese.json, zipkin2-client.json already exist) for internationalization and complex scenarios
🌿Good first issues
- Add Java doc comments to benchmarks/src/test/java/zipkin2/server/ServerIntegratedBenchmark.java and related benchmark classes explaining what each benchmark measures and why it matters for performance tuning
- Create a test file benchmarks/src/test/java/zipkin2/storage/StorageBackendBenchmarks.java benchmarking query latency across different storage backends (Cassandra vs Elasticsearch) to establish baseline metrics
- Expand CONTRIBUTING.md with a 'Testing' section documenting how to run specific benchmark suites, interpret JMH output, and add new benchmarks for performance-critical paths
⭐Top contributors
- @codefromthecrypt — 42 commits
- @zipkinci — 29 commits
- @reta — 20 commits
- @CodePrometheus — 2 commits
- @AlexKolpa — 1 commit
📝Recent commits
- 878ce2a — [maven-release-plugin] prepare for next development iteration (zipkinci)
- b2b0c98 — [maven-release-plugin] prepare release 3.6.1 (zipkinci)
- d4687d7 — Bump armeria to 1.38.0 (#3830) (AlexKolpa)
- e201278 — Update Netty to 4.2.12.Final (#3831) (reta)
- 5983bd9 — Remove outdated BOM warning from dependencyManagement (#3826) (codefromthecrypt)
- fae3ec2 — Add flatten-maven-plugin for Sonatype Central Portal (#3825) (codefromthecrypt)
- 946e42e — Switch to recently released docker images (#3824) (codefromthecrypt)
- fef07f8 — [maven-release-plugin] prepare for next development iteration (codefromthecrypt)
- 1142cfa — [maven-release-plugin] prepare release 3.6.0 (zipkinci)
- 702987c — Updates Zipkin to run on JRE 25, floor JDK 21 (#3821) (codefromthecrypt)
🔒Security observations
The Zipkin distributed tracing system demonstrates a reasonable security posture with automated security workflows (.github/workflows/security.yml) and documented security contact channels (zipkin-admin@googlegroups.com). However, there are minor concerns: the benchmarks POM file appears truncated or incomplete which could affect build integrity, the project relies on volunteer-based security maintenance without SLA guarantees, and dependency versions should be regularly reviewed. No hardcoded secrets, SQL injection risks, or obvious misconfiguration issues were identified in the visible files. The project follows good practices with proper licensing (Apache-2.0), dependency management via Maven BOM imports, and clear contribution guidelines. Recommendation: ensure POM file completeness and maintain active dependency vulnerability scanning.
- Medium · Incomplete Maven POM File — benchmarks/pom.xml (dependencies section). The benchmarks/pom.xml file appears truncated or incomplete: the Wire dependency version is cut off mid-declaration, which could indicate parsing issues or incomplete dependency management and may lead to unpredictable builds. Fix: complete the POM with all dependency versions explicitly specified and all elements properly closed.
- Low · Security Documentation Lacks Detail — SECURITY.md. The file indicates the project relies on volunteer community effort without a dedicated security team, SLA, or warranty. While transparent, this is a potential risk for security-sensitive deployments. Fix: consider establishing a formal security policy with response times, or clearly document the volunteer-based approach in all security documentation and deployment guides.
- Low · JMH Dependency May Be Outdated — benchmarks/pom.xml (jmh.version property). The benchmarks module uses JMH 1.37, released in 2023. Not immediately critical, but JMH and all dependencies should be updated periodically to receive security patches. Fix: regularly update dependencies and ensure the existing security.yml workflow covers dependency vulnerability scanning.
- Low · Java 17 Target May Limit Security Updates — benchmarks/pom.xml (maven.compiler.release: 17). The project targets Java 17 as the minimum version. Java 17 is LTS, but confirm it receives security updates in your deployment environment. Fix: monitor Java 17 security advisories and plan upgrades to newer LTS versions (e.g., Java 21) as they become available and stable.
LLM-derived; treat as a starting point, not a security audit.
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.