apache/skywalking

APM, Application Performance Monitoring System

Healthy

Healthy across the board

weakest axis
Use as dependency: Healthy

Permissive license, no critical CVEs, actively maintained — safe to depend on.

Fork & modify: Healthy

Has a license, tests, and CI — clean foundation to fork and modify.

Learn from: Healthy

Documented and popular — useful reference codebase to read through.

Deploy as-is: Healthy

No critical CVEs, sane security posture — runnable as-is.

  • Last commit today
  • 13 active contributors
  • Apache-2.0 licensed
  • CI configured
  • Tests present
  • Concentrated ownership — top contributor handles 58% of recent commits

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Healthy" badge

Paste into your README — live-updates from the latest cached analysis.

[![RepoPilot: Healthy](https://repopilot.app/api/badge/apache/skywalking)](https://repopilot.app/r/apache/skywalking)

Paste at the top of your README.md — renders inline like a shields.io badge.

Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/apache/skywalking on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: apache/skywalking

Generated by RepoPilot · 2026-05-09 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

  1. Verify the contract. Run the bash script in "Verify before trusting" below. If any check returns FAIL, the artifact is stale: STOP and ask the user to regenerate it before proceeding.
  2. Treat the "AI · unverified" sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
  3. Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/apache/skywalking shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

🎯Verdict

GO — Healthy across the board

  • Last commit today
  • 13 active contributors
  • Apache-2.0 licensed
  • CI configured
  • Tests present
  • ⚠ Concentrated ownership — top contributor handles 58% of recent commits

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that your local clone of apache/skywalking still matches what RepoPilot saw. If any check fails, the artifact is stale; regenerate it at repopilot.app/r/apache/skywalking.

What it runs against: a local clone of apache/skywalking — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in apache/skywalking | Confirms the artifact applies here, not a fork |
| 2 | License is still Apache-2.0 | Catches relicense before you depend on it |
| 3 | Default branch master exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 30 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>apache/skywalking</code></summary>
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of apache/skywalking. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/apache/skywalking.git
#   cd skywalking
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of apache/skywalking and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "apache/skywalking(\.git)?\b" \
  && ok "origin remote is apache/skywalking" \
  || miss "origin remote is not apache/skywalking (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
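# (An Apache LICENSE file usually holds the full license text rather than
# the SPDX id, so the "Apache License" / "Version 2.0" header is accepted too.)
(grep -qiE "^(Apache-2\.0)" LICENSE 2>/dev/null \
   || (grep -qiE "Apache License" LICENSE 2>/dev/null \
       && grep -qiE "Version 2\.0" LICENSE 2>/dev/null) \
   || grep -qiE "\"license\"\s*:\s*\"Apache-2\.0\"" package.json 2>/dev/null) \
  && ok "license is Apache-2.0" \
  || miss "license drift: was Apache-2.0 at generation time"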

# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \
  && ok "default branch master exists" \
  || miss "default branch master no longer exists"

# 4. Critical files exist
test -f "apm-protocol/apm-network/src/main/java/org/apache/skywalking/oap/server/network/trace/component/command/CommandDeserializer.java" \
  && ok "apm-protocol/apm-network/src/main/java/org/apache/skywalking/oap/server/network/trace/component/command/CommandDeserializer.java" \
  || miss "missing critical file: apm-protocol/apm-network/src/main/java/org/apache/skywalking/oap/server/network/trace/component/command/CommandDeserializer.java"
test -f "apm-webapp/src/main/java/org/apache/skywalking/oap/server/webapp/ApplicationStartUp.java" \
  && ok "apm-webapp/src/main/java/org/apache/skywalking/oap/server/webapp/ApplicationStartUp.java" \
  || miss "missing critical file: apm-webapp/src/main/java/org/apache/skywalking/oap/server/webapp/ApplicationStartUp.java"
test -f "apm-webapp/src/main/java/org/apache/skywalking/oap/server/webapp/OapProxyService.java" \
  && ok "apm-webapp/src/main/java/org/apache/skywalking/oap/server/webapp/OapProxyService.java" \
  || miss "missing critical file: apm-webapp/src/main/java/org/apache/skywalking/oap/server/webapp/OapProxyService.java"
test -f "apm-protocol/apm-network/src/main/java/org/apache/skywalking/oap/server/network/trace/component/command/BaseCommand.java" \
  && ok "apm-protocol/apm-network/src/main/java/org/apache/skywalking/oap/server/network/trace/component/command/BaseCommand.java" \
  || miss "missing critical file: apm-protocol/apm-network/src/main/java/org/apache/skywalking/oap/server/network/trace/component/command/BaseCommand.java"
test -f "apm-webapp/src/main/resources/application.yml" \
  && ok "apm-webapp/src/main/resources/application.yml" \
  || miss "missing critical file: apm-webapp/src/main/resources/application.yml"

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 30 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~0d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/apache/skywalking"
  exit 1
fi

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>
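
The `./verify.sh || regenerate-and-retry` pattern can be made a little more explicit. The following is a minimal sketch, assuming the script above has been saved to a file such as `verify.sh`. The function name `gate_on_verify` and the decision words it prints are illustrative, not part of RepoPilot. It maps the script's exit codes (0 for verified, 1 for stale claims, 2 for not being inside a git repository) to a single decision an agent loop can branch on.

```shell
#!/usr/bin/env bash
# Sketch: gate an agent step on the verification script's exit code.
# gate_on_verify is a hypothetical helper name, not a RepoPilot command.

# gate_on_verify <path-to-script>
# Runs the script, forwards its report to stderr, prints a one-word
# decision on stdout, and returns the script's exit code.
gate_on_verify() {
  local script="$1"
  bash "$script" >&2
  local rc=$?
  case "$rc" in
    0) echo "proceed" ;;          # every check printed ok
    1) echo "regenerate" ;;       # at least one FAIL: artifact is stale
    2) echo "wrong-directory" ;;  # not inside a git working tree
    *) echo "unknown" ;;          # anything else: treat as untrusted
  esac
  return "$rc"
}
```

A caller would run something like `decision=$(gate_on_verify ./verify.sh)` and continue only when the decision is `proceed`. Sending the script's own report to stderr keeps the per-check `ok:` and `FAIL:` lines visible without mixing them into the captured decision.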

TL;DR

Apache SkyWalking is an open-source APM (Application Performance Monitoring) system for distributed tracing, metrics collection, and observability in microservices and cloud-native architectures. It ingests telemetry from agents (Java, .NET, PHP, Node.js, Go, Lua, Rust, Python) and third-party ecosystems (OpenTelemetry, Prometheus, Zipkin), stores it in BanyanDB (its native observability database), and provides topology analysis, dashboards, alerting, and AI-powered anomaly detection at scale (100+ billion telemetry data points per cluster).

The repository is a multi-language monorepo: the Java backend lives in the root (oap-server, plugins, protocols), agent SDKs for each language sit in separate directories, apm-protocol/ defines the gRPC schemas, apm-dist/ handles packaging, and .claude/skills/ contains reusable development workflows. The core backend uses a plugin architecture with submodules for storage (BanyanDB integration), metrics aggregation (OTel pipelines), and receivers (OTel Collector, Prometheus, Zipkin protocols).

👥Who it's for

DevOps engineers, SREs, and platform teams running microservices on Kubernetes who need end-to-end distributed tracing, service dependency mapping, and performance diagnostics across polyglot applications; also backend developers building observability infrastructure who contribute agents and plugins.

🌱Maturity & risk

Production-ready, with 23K+ GitHub stars and active Apache Software Foundation governance. The codebase shows enterprise maturity: comprehensive CI/CD workflows (GitHub Actions for CodeQL, E2E testing, Docker publishing), extensive test coverage via E2E frameworks, and an organized monorepo structure. Commits are recent and regular (the last commit landed on the day of analysis); this is an actively maintained flagship observability project.

Low risk for core users but high complexity: the Java monorepo is large (12M+ LOC) with many interdependent modules, and the ecosystem spans 10+ agent languages, which increases the maintenance burden. SkyWalking depends on gRPC, Prometheus, and storage backends (BanyanDB or Elasticsearch), so operational complexity is substantial. Contributing requires an understanding of distributed systems, APM concepts, and multi-language agent integration.

Active areas of work

Active development is indicated by E2E testing workflows in .github/workflows/, AI/ML feature integration (mentioned in the README), adoption of BanyanDB as the native database, maturing OpenTelemetry ecosystem support, and log management pipeline enhancements. The repo also shows early adoption of the eBPF Rover agent for Kubernetes monitoring, and the skill-based development tooling suggests ongoing toolchain modernization.

🚀Get running

Check README for instructions.

Daily commands:

# Build all modules
mvn clean package -DskipTests
# Run OAP server (backend)
java -jar oap-server/server-starter/target/server-*-bin.jar
# Deploy with Docker (see .github/workflows/publish-docker.yaml)
docker-compose up  # if a docker-compose.yml exists in project root

Refer to .claude/skills/compile/SKILL.md and .claude/skills/run-e2e/SKILL.md for specific build and test workflows.

🗺️Map of the codebase

  • apm-protocol/apm-network/src/main/java/org/apache/skywalking/oap/server/network/trace/component/command/CommandDeserializer.java — Central deserialization hub for all agent commands (profiling, configuration, tracing); critical for agent-server communication protocol
  • apm-webapp/src/main/java/org/apache/skywalking/oap/server/webapp/ApplicationStartUp.java — Web application entry point and bootstrap logic; initializes the UI server and OAP proxy services
  • apm-webapp/src/main/java/org/apache/skywalking/oap/server/webapp/OapProxyService.java — Proxy layer connecting webapp to OAP backend; handles metric queries and trace aggregation requests
  • apm-protocol/apm-network/src/main/java/org/apache/skywalking/oap/server/network/trace/component/command/BaseCommand.java — Abstract base for all command types (profiling, configuration); defines serialization contract for agent directives
  • apm-webapp/src/main/resources/application.yml — Core configuration for web application including OAP backend endpoints, storage, and profiling settings
  • .github/workflows/skywalking.yaml — Primary CI/CD pipeline; defines build, test, and packaging stages for all pull requests and releases
  • apm-dist/src/main/assembly/binary.xml — Distribution assembly descriptor; defines the final artifact structure and included modules for production deployment

🛠️How to make changes

Add a New Command Type (e.g., for custom agent directive)

  1. Create a new class extending BaseCommand in apm-protocol/apm-network/src/main/java/org/apache/skywalking/oap/server/network/trace/component/command/ (apm-protocol/apm-network/src/main/java/org/apache/skywalking/oap/server/network/trace/component/command/NewFeatureCommand.java)
  2. Implement serialize() and deserialize() methods following the BaseCommand interface (apm-protocol/apm-network/src/main/java/org/apache/skywalking/oap/server/network/trace/component/command/BaseCommand.java)
  3. Register the new command type in CommandDeserializer's routing logic (add a case in the deserializer switch/if-else) (apm-protocol/apm-network/src/main/java/org/apache/skywalking/oap/server/network/trace/component/command/CommandDeserializer.java)
  4. Add integration test for serialization/deserialization roundtrip (apm-protocol/apm-network/src/test/java/org/apache/skywalking/oap/server/network/trace/proto/GRPCNoServerTest.java)
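
Step 3 is the easiest one to forget, so a quick textual check before opening a PR can help. The sketch below assumes that "registered" shows up as a whole-word reference to the class name inside CommandDeserializer.java. That is an assumption about how the routing is written, not something confirmed here. `NewFeatureCommand` is the placeholder name from step 1, and both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: assumes registration is visible as a whole-word mention of
# the class name in the deserializer source. Adjust if the real routing
# logic registers commands differently.

# is_referenced <ClassName> <file>
# Returns 0 when <file> mentions <ClassName> as a whole word, non-zero otherwise.
is_referenced() {
  local class_name="$1" file="$2"
  grep -qw -- "$class_name" "$file" 2>/dev/null
}

# report_registration <ClassName> <file>
# Prints a one-line status in the same ok:/FAIL: style as the verify script.
report_registration() {
  if is_referenced "$1" "$2"; then
    echo "ok:   $1 is referenced in $2"
  else
    echo "FAIL: $1 is not referenced in $2"
    return 1
  fi
}
```

From the repository root this could be called as `report_registration NewFeatureCommand apm-protocol/apm-network/src/main/java/org/apache/skywalking/oap/server/network/trace/component/command/CommandDeserializer.java`. A textual match only shows the class is mentioned; the roundtrip test in step 4 is still what proves the routing works.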

Add a New UI API Endpoint (e.g., for custom metrics dashboard)

  1. Create a new REST controller in apm-webapp/src/main/java/org/apache/skywalking/oap/server/webapp/ extending Spring's @RestController (apm-webapp/src/main/java/org/apache/skywalking/oap/server/webapp/MetricsController.java)
  2. Inject OapProxyService to forward queries to the backend OAP server (apm-webapp/src/main/java/org/apache/skywalking/oap/server/webapp/OapProxyService.java)
  3. Define the endpoint mapping and response DTOs; follow the proxy pattern (query → OAP → transform → respond) (apm-webapp/src/main/java/org/apache/skywalking/oap/server/webapp/MetricsController.java)
  4. Update application.yml if new backend paths or timeout settings are needed (apm-webapp/src/main/resources/application.yml)

Add Support for a New Protocol Constant or Enum Value

  1. Add the new enum value to ProfileConstants.java (or create a new constants class in apm-protocol/apm-network/src/main/java/org/apache/skywalking/oap/server/network/constants/) (apm-protocol/apm-network/src/main/java/org/apache/skywalking/oap/server/network/constants/ProfileConstants.java)
  2. Update any command classes that reference these constants (e.g., ProfileTaskCommand, ContinuousProfilingPolicyCommand) (apm-protocol/apm-network/src/main/java/org/apache/skywalking/oap/server/network/trace/component/command/ProfileTaskCommand.java)
  3. Add test coverage for the new constant in GRPCNoServerTest (apm-protocol/apm-network/src/test/java/org/apache/skywalking/oap/server/network/trace/proto/GRPCNoServerTest.java)

Update Distribution Packaging (e.g., add new config files or scripts)

  1. Edit the binary.xml assembly descriptor to include new files/directories in the final distribution tarball (apm-dist/src/main/assembly/binary.xml)
  2. Update the webapp POM if new dependencies are required (apm-webapp/pom.xml)
  3. Ensure the new files are referenced in application.yml or other config files if needed (apm-webapp/src/main/resources/application.yml)
  4. Run the 'package' skill or Maven build to generate and test the distribution (Makefile)
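
After step 4, it is worth confirming that the new files actually landed in the distribution archive. The following is a minimal sketch, assuming the build produces a gzip-compressed tarball. Its exact name and location are not stated here, so the path is passed in as an argument, and `dist_contains` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: check that a path fragment appears in a .tar.gz listing.
# The tarball location is an assumption left to the caller.

# dist_contains <tarball> <path-fragment>
# Returns 0 when any entry in the archive contains <path-fragment>.
dist_contains() {
  local tarball="$1" fragment="$2" listing
  # List entries first so a missing or corrupt archive is a plain failure.
  listing=$(tar -tzf "$tarball" 2>/dev/null) || return 1
  # -F matches the fragment literally, so dots in file names are not wildcards.
  grep -qF -- "$fragment" <<<"$listing"
}
```

For example, `dist_contains path/to/distribution.tar.gz config/my-new-file.yml && echo "packaged"` (both arguments are placeholders) prints `packaged` only when the file made it into the archive.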

🪤Traps & gotchas

  • Storage backend requirement: The OAP server requires a configured storage backend (BanyanDB, Elasticsearch, or PostgreSQL); running without one will fail silently.
  • gRPC protocol coupling: Agent protocols are tightly coupled to the apm-protocol definitions; proto changes require regenerating client code in all agent SDKs.
  • Configuration complexity: The backend uses YAML/environment-variable config; incorrect storage paths or receiver ports will cause startup failures with minimal error messages.
  • Multi-protocol reconciliation: Metric names and data types must align across OTel, Prometheus, and native SkyWalking formats to avoid aggregation mismatches.
  • Test environment: E2E tests require Docker and network isolation; running them locally on a system with port conflicts (6379, 9200, 12800) will hang.
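
Because a port conflict shows up as a hang rather than an error, a pre-flight check before starting E2E tests can save time. This is a minimal sketch, assuming bash with its `/dev/tcp` pseudo-device enabled (the default in most builds). It only detects listeners reachable on the loopback address, and the function names are hypothetical. The ports come from the list above.

```shell
#!/usr/bin/env bash
# Sketch: report which of the given TCP ports already have a listener.
# Relies on bash's /dev/tcp redirection; not available in plain sh.

# port_in_use <port> [host]
# Returns 0 if something accepts TCP connections on host:port.
port_in_use() {
  local port="$1" host="${2:-127.0.0.1}"
  # The subshell opens and immediately drops the connection.
  (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null
}

# busy_ports <port>...
# Prints, one per line, each port from the argument list that is taken.
busy_ports() {
  local p
  for p in "$@"; do
    if port_in_use "$p"; then
      echo "$p"
    fi
  done
}
```

Running `busy_ports 6379 9200 12800` before the E2E suite prints nothing when all three ports are free; any output names the ports to free up first.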

🏗️Architecture

💡Concepts to learn

  • open-telemetry/opentelemetry-collector — OTel Collector is the de facto standard for telemetry collection; SkyWalking integrates with it as a receiver and exporter
  • prometheus/prometheus — Prometheus is SkyWalking's primary metrics ecosystem partner; SkyWalking consumes Prometheus scrape formats and exports metrics in OTel/Prometheus formats
  • apache/skywalking-banyandb — BanyanDB is SkyWalking's native observability database and the main storage backend for modern SkyWalking deployments, replacing Elasticsearch
  • jaegertracing/jaeger — Jaeger is an alternative distributed tracing system; SkyWalking implements Zipkin-compatible span ingestion, making it an interoperable counterpart
  • grafana/grafana — Grafana is the preferred visualization partner for SkyWalking; dashboards can query SkyWalking's backend via HTTP API or native Grafana datasource plugins

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add comprehensive unit tests for CommandDeserializer and command subclasses

The apm-protocol/apm-network module contains command deserialization logic (CommandDeserializer.java, AsyncProfilerTaskCommand.java, ContinuousProfilingPolicyCommand.java, etc.) that handles protocol messages from agents. These are critical paths for APM data ingestion, but there's no evidence of dedicated unit test coverage in the file structure. Adding thorough deserialization tests would catch protocol compatibility issues early and ensure robustness of the network communication layer.

  • [ ] Create apm-protocol/apm-network/src/test/java/org/apache/skywalking/oap/server/network/trace/component/command/CommandDeserializerTest.java
  • [ ] Add test cases for each command type (AsyncProfilerTaskCommand, ContinuousProfilingPolicyCommand, ConfigurationDiscoveryCommand, etc.) with valid and invalid payloads
  • [ ] Test edge cases like empty data, malformed bytes, and version mismatches
  • [ ] Add integration tests verifying deserialized commands produce expected object state

Create CI workflow for Protocol Compatibility Testing across agent versions

SkyWalking supports multiple agent implementations (Java, Python, Node.js, etc.) communicating via apm-protocol. The .github/workflows directory shows codeql, dead-link-checker, publish-docker workflows, but no dedicated protocol compatibility testing. This would ensure backward/forward compatibility when protocol changes occur, preventing silent agent-server communication failures.

  • [ ] Create .github/workflows/protocol-compatibility-check.yaml
  • [ ] Add matrix strategy testing apm-protocol against multiple agent SDK versions
  • [ ] Include steps to validate serialization/deserialization round-trips using apm-protocol/apm-network classes
  • [ ] Configure to run on protocol file changes (apm-protocol/apm-network/src/main/java/**)
  • [ ] Add reporting to flag breaking changes in pull requests

Implement integration tests for ProfileConstants and profiling command pipeline

The apm-protocol module defines ProfileConstants.java and multiple profiling-related commands (ContinuousProfilingPolicy, AsyncProfilerTaskCommand, ContinuousProfilingReportCommand), which form the profiling data ingestion pipeline. There's no visible test coverage for the end-to-end profiling command lifecycle. Adding these tests would ensure profiling data flows correctly from agents through the OAP server.

  • [ ] Create apm-protocol/apm-network/src/test/java/org/apache/skywalking/oap/server/network/trace/component/command/ProfilingCommandPipelineTest.java
  • [ ] Test ContinuousProfilingPolicy serialization and application to profiling commands
  • [ ] Add tests for ContinuousProfilingReportCommand deserialization with real profiling data samples
  • [ ] Verify ProfileConstants are correctly referenced throughout command implementations
  • [ ] Add tests validating profiling command state transitions (request → policy → report)

🌿Good first issues

  • Add missing Java documentation in oap-server/modules/core/src/main/java/ for metric aggregation pipeline — many aggregator classes lack Javadoc explaining the time-window bucketing logic: Improves onboarding for contributors; concrete, bounded task
  • Extend apm-checkstyle/checkStyle.xml to enforce javadoc for all public APIs; currently many public methods in storage plugins lack documentation: Prevents documentation debt and enforces standards at build time
  • Create example Lua metric extraction script in apm-server/modules/core/src/main/resources/lua-scripts/ for a common observability pattern (e.g., extracting HTTP status code distribution from OpenTelemetry spans): Closes gap between documentation and runnable examples; helps users leverage the script pipeline feature

📝Recent commits

  • 4890024 — SWIP-13: Live Debugger for MAL / LAL / OAL + admin-server + runtime-rule integration. (#13864) (wu-sheng)
  • 67160de — LAL: unify arithmetic operators (+ - * /) with JLS-style type promotion (#13858) (wu-sheng)
  • 2b745b2 — Layer extension: register custom layers without modifying OAP source (#13856) (wu-sheng)
  • dae1553 — Fix: LAL compiler treated (tag("x") as Integer) + (tag("y") as Integer) as string concatenation instead of numeric add (wankai123)
  • 3064b5f — Fix: remove the redundant tags from the envoy-ai-gateway.yaml LAL configuration. (#13854) (wankai123)
  • 7754e3e — Runtime rule hot-update for MAL and LAL (#13851) (wu-sheng)
  • 36a3f9c — Sync UI (#13848) (Fine0830)
  • a06f2e1 — MAL: add safeDiv(divisor) on SampleFamily that yields 0 when the divisor is 0. (#13846) (wankai123)
  • 2319def — Fix potential unexpected current directory inclusion in Docker OAP classpath (#13844) (weixiang1862)
  • 8242ff6 — Fix: remove the dependency from VirtualServiceAnalysisListener if GenAIAnalyzerModule is disabled. (#13838) (wankai123)

🔒Security observations

The Apache SkyWalking codebase demonstrates a reasonable security posture for a large distributed tracing system, but has notable concerns around deserialization security and gRPC endpoint protection. The main risks are: (1) Potential insecure deserialization in command processing which could be exploited if network input is compromised, (2) Possible missing authentication on gRPC endpoints based on test file patterns, and (3) Inability to verify dependency security without complete dependency tree. Strengths include proper use of Maven version management and an organized project structure. Recommendations focus on implementing strict input validation, enforcing authentication/TLS on network endpoints, and adding automated vulnerability scanning to the CI/CD pipeline.

  • Medium · Potential Insecure Deserialization in Command Classes — apm-protocol/apm-network/src/main/java/org/apache/skywalking/oap/server/network/trace/component/command/. The codebase contains multiple command deserialization classes (CommandDeserializer.java, various *Command.java files) that deserialize untrusted data from network sources. Without proper validation, this could lead to arbitrary code execution if an attacker can control the serialized data. Fix: Implement strict deserialization validation, use allow-lists for permitted classes, and consider using safer serialization formats. Add input validation for all command types before instantiation.
  • Medium · gRPC Server Exposure Without Authentication Context — apm-protocol/apm-network/src/test/java/org/apache/skywalking/oap/server/network/trace/proto/GRPCNoServerTest.java. The test file GRPCNoServerTest.java suggests gRPC endpoints may be exposed without comprehensive authentication mechanisms. The naming pattern indicates testing of scenarios without server validation, which could indicate missing authentication in main codebase. Fix: Implement mutual TLS (mTLS) for gRPC communication, enforce authentication and authorization policies on all gRPC endpoints, and validate all incoming requests.
  • Low · Unclear Dependency Management in POM Files — apm-dist/pom.xml, apm-protocol/pom.xml. The pom.xml file uses a variable ${revision} for versioning, which is good practice, but without access to the parent POM and complete dependency tree, potential vulnerable transitive dependencies cannot be fully assessed. Fix: Run 'mvn dependency:tree' and use OWASP Dependency-Check plugin to identify known vulnerable dependencies. Implement automated dependency scanning in CI/CD pipeline.
  • Low · Missing Security Headers Configuration — apm-webapp/. No evidence of security headers configuration (CSP, X-Frame-Options, etc.) in the visible configuration files for the webapp component (apm-webapp). Fix: Configure HTTP security headers in the web application server configuration. Implement Content Security Policy (CSP), X-Frame-Options, X-Content-Type-Options, and other standard security headers.
  • Low · Profile-Based Configuration May Create Hidden Attack Surfaces — apm-dist/pom.xml (and parent pom.xml). The Maven profile structure with 'backend' as default profile could lead to inconsistent security configurations across different build profiles if not carefully managed. Fix: Document all Maven profiles and ensure security configurations are consistent across profiles. Audit each profile's dependency management and plugin configurations.

LLM-derived; treat as a starting point, not a security audit.


Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.