Netflix/SimianArmy
Tools for keeping your cloud operating in top form. Chaos Monkey is a resiliency tool that helps applications tolerate random instance failures.
Healthy across all four use cases
Permissive license, no critical CVEs, actively maintained; safe to depend on (weakest axis).
Has a license, tests, and CI; clean foundation to fork and modify.
Documented and popular; useful reference codebase to read through.
No critical CVEs, sane security posture; runnable as-is.
- ✓ 16 active contributors
- ✓ Distributed ownership (top contributor 32% of recent commits)
- ✓ Apache-2.0 licensed
- ✓ CI configured
- ✓ Tests present
- ⚠ Stale: last commit 7y ago
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Onboarding doc
Onboarding: Netflix/SimianArmy
Generated by RepoPilot · 2026-05-09 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale: STOP and ask the user to regenerate it before proceeding.
- Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/Netflix/SimianArmy shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across all four use cases
- 16 active contributors
- Distributed ownership (top contributor 32% of recent commits)
- Apache-2.0 licensed
- CI configured
- Tests present
- ⚠ Stale — last commit 7y ago
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live Netflix/SimianArmy
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/Netflix/SimianArmy.
What it runs against: a local clone of Netflix/SimianArmy — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in Netflix/SimianArmy | Confirms the artifact applies here, not a fork |
| 2 | License is still Apache-2.0 | Catches relicense before you depend on it |
| 3 | Default branch master exists | Catches branch renames |
| 4 | Last commit ≤ 2728 days ago | Catches sudden abandonment since generation |
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of Netflix/SimianArmy. If you don't
# have one yet, run these first:
#
# git clone https://github.com/Netflix/SimianArmy.git
# cd SimianArmy
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }
# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
echo "FAIL: not inside a git repository. cd into your clone of Netflix/SimianArmy and re-run."
exit 2
fi
# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "Netflix/SimianArmy(\\.git)?\\b" \
  && ok "origin remote is Netflix/SimianArmy" \
  || miss "origin remote is not Netflix/SimianArmy (artifact may be from a fork)"
# 2. License matches what RepoPilot saw
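# The stock Apache 2.0 LICENSE text opens with an indented "Apache License"
# title rather than the SPDX id, so accept either form.
(grep -qiE "^\\s*(Apache-2\\.0|Apache License)" LICENSE 2>/dev/null \
  || grep -qiE "\"license\"\\s*:\\s*\"Apache-2\\.0\"" package.json 2>/dev/null) \
  && ok "license is Apache-2.0" \
  || miss "license drift: was Apache-2.0 at generation time"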
# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \
  && ok "default branch master exists" \
  || miss "default branch master no longer exists"
# 4. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 2728 ]; then
ok "last commit was $days_since_last days ago (artifact saw ~2698d)"
else
miss "last commit was $days_since_last days ago — artifact may be stale"
fi
echo
if [ "$fail" -eq 0 ]; then
echo "artifact verified (0 failures) — safe to trust"
else
echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/Netflix/SimianArmy"
exit 1
fi
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
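As a minimal sketch of such a loop, the wrapper below turns the script's exit status into a single word that an agent can branch on. The function names `classify_status` and `gate` are hypothetical, and `./verify.sh` assumes you saved the script above under that name. The status meanings (0 when every check passes, 1 when at least one claim is stale, 2 when the precondition fails) are read off the script itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: translate the verification script's exit status
# into a single word an agent loop can branch on.
classify_status() {
  case "$1" in
    0) echo "verified" ;;      # every check printed ok:
    1) echo "stale" ;;         # at least one FAIL: line was printed
    2) echo "not-a-repo" ;;    # precondition failed (no git working tree)
    *) echo "unknown" ;;       # anything else, e.g. script not found
  esac
}

# Run any command quietly and print the classification of its exit status.
gate() {
  "$@" >/dev/null 2>&1
  classify_status "$?"
}

# Example (assumes the script was saved as ./verify.sh):
#   [ "$(gate ./verify.sh)" = "verified" ] || echo "regenerate the artifact first"
```

Keeping the classification separate from the command runner means the mapping can be checked without a clone of the repository.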
⚡TL;DR
SimianArmy is a suite of cloud resilience and compliance tools centered on Chaos Monkey, which injects random instance failures into production AWS environments to test application tolerance. It also includes Janitor Monkey for automated resource cleanup and Conformity Monkey for compliance checking, all deployed as a Java WAR application against EC2, RDS, and SimpleDB.
Architecturally, it is a monolithic Java WAR deployed via Jetty. Core domain logic lives in src/main/java/com/netflix/simianarmy/ and is split into pluggable monkey implementations (ChaosMonkey, JanitorMonkey, ConformityMonkey). AWS integration lives in src/main/java/com/netflix/simianarmy/aws/, using jclouds for EC2/RDS and SimpleDB or RDS for persistence. The build uses Gradle with the nebula.netflixoss plugin.
👥Who it's for
AWS infrastructure operators and platform engineers who need to continuously validate that their production applications survive random failures, and infrastructure teams who want automated cleanup of orphaned cloud resources. Also relevant to compliance teams needing automated conformity monitoring across cloud deployments.
🌱Maturity & risk
RETIRED AND UNMAINTAINED as of 2016. The project is no longer actively developed—Netflix has moved Chaos Monkey to a standalone service (netflix/chaosmonkey), Janitor functionality to Swabbie, and Conformity work into Spinnaker. This is a historical/reference codebase, not suitable for new production deployments.
High risk for production use: project is explicitly retired with no active maintenance. Dependencies are severely outdated (AWS SDK 1.11.28 from 2016, Jackson 1.9.2, Guava 11.0.2). No modern security patches, no active CI/CD beyond Travis, and no path to upgrades. Use only as reference architecture or migrate to netflix/chaosmonkey and spinnaker/swabbie.
Active areas of work
Nothing. The project is retired. Last meaningful commits were in 2016. All active development has moved to standalone services: netflix/chaosmonkey for chaos injection, spinnaker/swabbie for resource cleanup, and Spinnaker backend services for conformity.
🚀Get running
Clone: git clone https://github.com/Netflix/SimianArmy.git && cd SimianArmy. Build: ./gradlew build (requires Java 8). Package: ./gradlew assemble writes the WAR to build/libs/simianarmy-*.war for deployment to an app server. Do not use in production; refer to netflix/chaosmonkey instead.
Daily commands:
Dev: ./gradlew jettyRun starts embedded Jetty on port 8080. WAR deployment: ./gradlew assemble builds the WAR at build/libs/simianarmy-*.war for deployment to an external servlet container. Configuration: via a simianarmy.properties file (not in the repo; a template must be created). Note: actual chaos testing requires AWS credentials and an active AWS account.
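Because the build is stated to require Java 8, a small pre-flight check can catch a mismatched JDK before Gradle does. The sketch below is illustrative only: `java_major` and `detected_java_version` are hypothetical helper names, and the parsing assumes the usual `java -version` output with a quoted version string (legacy `1.8.0_292` style or newer `11.0.2` style).

```shell
#!/usr/bin/env bash
# Hypothetical helper: reduce a Java version string to its major number.
# "1.8.0_292" -> 8 (legacy scheme), "11.0.2" -> 11, "17" -> 17.
java_major() {
  v="${1#1.}"               # drop the legacy "1." prefix if present
  echo "${v%%[._-]*}"       # keep everything before the first ., _ or -
}

# Read the quoted version from `java -version`, which prints to stderr.
detected_java_version() {
  java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Example:
#   [ "$(java_major "$(detected_java_version)")" = "8" ] || echo "build expects Java 8"
```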
🗺️Map of the codebase
- src/main/java/com/netflix/simianarmy/Monkey.java: Core interface all monkey implementations extend; defines the contract for periodic chaos/compliance tools
- src/main/java/com/netflix/simianarmy/MonkeyRunner.java: Bootstrap and orchestration logic that loads and schedules all registered Monkey implementations at startup
- src/main/java/com/netflix/simianarmy/CloudClient.java: Abstract interface for cloud provider interactions; main extension point for multi-cloud support
- src/main/java/com/netflix/simianarmy/aws/AWSResource.java: AWS-specific Resource implementation; wraps EC2 instances, RDS clusters, and other AWS resources for manipulation
- src/main/java/com/netflix/simianarmy/MonkeyRecorder.java: Event persistence abstraction; implementations in SimpleDBRecorder.java and RDSRecorder.java log all chaos events and compliance violations
- build.gradle: Gradle build definition with all dependencies pinned to 2016-era versions; modifying requires understanding WAR plugin and servlet deployment
- src/main/java/com/netflix/simianarmy/MonkeyConfiguration.java: Configuration loading interface; implementations read simianarmy.properties and override behavior per Monkey type
- src/main/java/com/netflix/simianarmy/aws/conformity/: ConformityMonkey implementation tree; contains crawlers and trackers for AWS resource compliance rules
🛠️How to make changes
- New Monkey type: extend com.netflix.simianarmy.Monkey in src/main/java/com/netflix/simianarmy/, then register it in MonkeyRunner.
- AWS integration: add methods to src/main/java/com/netflix/simianarmy/aws/AWSResource.java and CloudClient.java.
- Email alerts: extend AbstractEmailBuilder.java in src/main/java/com/netflix/simianarmy/.
- Persistence: implement the MonkeyRecorder interface (see SimpleDBRecorder.java, RDSRecorder.java).
- Configuration: edit gradle.properties for build-time settings and create simianarmy.properties for runtime settings.
🪤Traps & gotchas
- Credentials & AWS permissions: Requires valid AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY plus EC2/RDS permissions to run; without them it will silently fail to list resources.
- SimpleDB deprecation: Uses AWS SimpleDB (now deprecated) for event storage; switch to the RDS recorder in production or implement a custom persister.
- Servlet container required: Gradle jettyRun works for dev, but production requires deploying the WAR to an external servlet container (Tomcat, Jetty) with CATALINA_OPTS/JAVA_OPTS for JVM tuning.
- Property file not in repo: You must create simianarmy.properties on the classpath with monkey-specific configs (enabled=true, frequency, email addresses, etc.); a missing file silently disables the monkeys.
- Eureka integration: If the Eureka client is enabled, a running Eureka server is required or discovery will fail; disable it in config if you are not using Netflix infrastructure.
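Since the credentials and properties-file traps above both fail silently, it can help to check for them explicitly before starting the application. The following is a rough sketch, not part of SimianArmy: `preflight` is a hypothetical function, and the properties file path is passed in by the caller because its real location on your classpath is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check for the credential and properties-file traps.
# Prints one line per problem and returns the number of problems found.
preflight() {
  props="$1"   # path to simianarmy.properties (location is an assumption)
  problems=0
  for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
    # Indirect lookup via eval keeps this portable to plain sh.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name"
      problems=$((problems + 1))
    fi
  done
  if [ ! -f "$props" ]; then
    echo "missing properties file: $props"
    problems=$((problems + 1))
  fi
  return "$problems"
}

# Example:
#   preflight src/main/resources/simianarmy.properties || echo "fix the items above first"
```

The check only confirms that the variables are set and the file exists; it does not validate the credentials or the property keys.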
💡Concepts to learn
- Chaos Engineering — SimianArmy's core philosophy—proactively injecting failures into production to validate resilience rather than discovering failures reactively; this repo is a seminal reference implementation
- Circuit Breaker Pattern — Applications must implement circuit breakers to tolerate the random instance terminations Chaos Monkey injects; understanding this pattern is essential to using SimianArmy effectively
- AWS SimpleDB & Event Sourcing — SimianArmy uses SimpleDB as immutable event store for all chaos events; understanding this (now-deprecated) service and event sourcing patterns is critical to extending MonkeyRecorder
- Pluggable Architecture via Interface-based Design — SimianArmy uses Java interfaces (Monkey, CloudClient, MonkeyRecorder, MonkeyConfiguration, EmailBuilder) to enable swappable implementations; understanding this pattern is essential to adding new monkeys or cloud providers
- AWS ASG (Auto Scaling Group) Membership — Chaos Monkey targets instances via ASG membership, not individual instance IDs, ensuring replacement; understanding ASG lifecycle is essential to understanding which instances get terminated
- Service Discovery via Eureka — SimianArmy integrates Eureka client (eureka-client:1.4.1) for service registration and discovery in Netflix infrastructure; optional but critical for multi-region deployments
🔗Related repos
- netflix/chaosmonkey: Official successor to SimianArmy's Chaos Monkey; standalone service that supersedes this codebase for chaos engineering
- spinnaker/swabbie: Successor to Janitor Monkey functionality; standalone service for automated cloud resource cleanup and compliance
- spinnaker/spinnaker: Netflix's continuous delivery platform; inherits ConformityMonkey compliance checking as backend services
- gremlin/gremlin-python: Modern alternative chaos engineering framework; provides a programmatic API for multi-cloud failure injection, unlike SimianArmy's AWS-centric design
- powerfulseal/powerfulseal: Cloud-native chaos engineering for Kubernetes; relevant if you are migrating workloads from EC2 to k8s and need equivalent failure testing
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add unit tests for AWS Janitor implementations (ASGJanitor, EBSSnapshotJanitor, ELBJanitor, ImageJanitor)
The janitor modules in src/main/java/com/netflix/simianarmy/aws/janitor/ lack corresponding test files. These are critical components that delete cloud resources, so comprehensive unit tests are essential for preventing accidental resource deletion bugs. This is especially important given the project's retirement status—better test coverage ensures existing deployments remain stable.
- [ ] Create src/test/java/com/netflix/simianarmy/aws/janitor/ASGJanitorTest.java with tests for cleanup logic and tag-based filtering
- [ ] Create src/test/java/com/netflix/simianarmy/aws/janitor/EBSSnapshotJanitorTest.java covering snapshot age validation and deletion rules
- [ ] Create src/test/java/com/netflix/simianarmy/aws/janitor/ELBJanitorTest.java testing load balancer cleanup conditions
- [ ] Create src/test/java/com/netflix/simianarmy/aws/janitor/ImageJanitorTest.java for AMI cleanup scenarios
- [ ] Use Mockito (already a test dependency) to mock AWS SDK calls and verify correct API invocations
Add unit tests for Conformity Monkey rules (CrossZoneLoadBalancing, InstanceHasHealthCheckUrl, InstanceInVPC, InstanceTooOld, etc.)
The conformity rule implementations in src/main/java/com/netflix/simianarmy/aws/conformity/rule/ have no corresponding tests. These rules define cloud infrastructure compliance checks, and without tests, rule behavior changes or regressions go undetected. This is particularly important since conformity checking directly affects deployment validation.
- [ ] Create src/test/java/com/netflix/simianarmy/aws/conformity/rule/CrossZoneLoadBalancingTest.java to verify ELB cross-zone detection
- [ ] Create src/test/java/com/netflix/simianarmy/aws/conformity/rule/InstanceInVPCTest.java testing VPC membership validation
- [ ] Create src/test/java/com/netflix/simianarmy/aws/conformity/rule/InstanceTooOldTest.java for instance age calculation logic
- [ ] Create src/test/java/com/netflix/simianarmy/aws/conformity/rule/InstanceIsHealthyInEurekaTest.java mocking Eureka client responses
- [ ] Mock AWSClusterCrawler and ConformityEurekaClient dependencies appropriately
Add GitHub Actions workflow to replace deprecated Travis CI and improve dependency security scanning
The project uses Travis CI (configured in .travis.yml) which has been acquired and deprecated. Additionally, there's no automated dependency vulnerability scanning. A modern GitHub Actions workflow would provide better integration with GitHub's security features, faster CI feedback, and automatic updates via Dependabot. This is especially valuable for a retired project to ensure security patches are applied to existing deployments.
- [ ] Create .github/workflows/build.yml to run './gradlew build' on push and PRs, replacing Travis functionality
- [ ] Add test execution and code coverage reporting (using the existing cobertura plugin configuration)
- [ ] Create .github/workflows/security.yml or enable Dependabot configuration in dependabot.yml to track vulnerable dependencies in aws-java-sdk, spring-jdbc, and other outdated versions
- [ ] Update README.md to replace Travis badge with GitHub Actions badge
- [ ] Document any secret management needed for AWS credentials or deployment artifacts
🌿Good first issues
- Add test coverage for src/main/java/com/netflix/simianarmy/aws/AWSEmailNotifier.java. The build currently has no testCompile dependency for email testing, so add mock SMTP server tests that follow the existing Mockito patterns in the TestNG suites
- Document the simianarmy.properties schema by creating example-simianarmy.properties in repo root with all available configuration keys for each Monkey type, since the config format is undocumented and users must infer from code
- Implement a MongoDB MonkeyRecorder as an alternative to the deprecated SimpleDB recorder and the RDS recorder: add src/main/java/com/netflix/simianarmy/aws/MongoDBRecorder.java implementing the MonkeyRecorder interface, using the mongo-java-driver dependency
⭐Top contributors
- @jeyrschabu: 32 commits
- @ebukoski: 32 commits
- Lorin Hochstein: 9 commits
- @lorin: 5 commits
- @robfletcher: 5 commits
📝Recent commits
- e8eb9f3: Fix link to NetflixOSS badge (Lorin Hochstein)
- 232d61c: Merge pull request #340 from Netflix/set-lifecycle-archived (lorin)
- b65ded6: bold that project is not actively maintained (Lorin Hochstein)
- cbc9c01: make other titles less prominent (Lorin Hochstein)
- 7486da3: finish incomplete sentence (Lorin Hochstein)
- 81732fa: Make the status more visible (Lorin Hochstein)
- 7e269ac: Mark SimianArmy as archived (Lorin Hochstein)
- 022ad07: Merge pull request #331 from nacx/aws-temp-creds (lorin)
- ecc4147: Merge pull request #314 from Netflix/remove-image-link (lorin)
- 239b1e6: Merge pull request #334 from archthegit/network-script-issue (lorin)
🔒Security observations
- Critical · Outdated AWS SDK with Known Vulnerabilities (build.gradle: com.amazonaws:aws-java-sdk:1.11.28). The project uses AWS SDK version 1.11.28, which is significantly outdated (released in 2016) and contains multiple known security vulnerabilities. This version lacks security patches for credential handling, SSL/TLS validation, and other critical issues. Fix: Update to the latest AWS SDK v2.x (2.20+), which includes security patches, improved credential handling, and modern TLS support. Perform compatibility testing before upgrading.
- Critical · Outdated Jackson Libraries with Deserialization Vulnerabilities (build.gradle: org.codehaus.jackson:jackson-core-asl:1.9.2 and jackson-mapper-asl:1.9.2). Jackson 1.9.2 (from 2012) belongs to the end-of-life Jackson 1.x line and has known deserialization vulnerabilities that can lead to remote code execution when processing untrusted JSON data. Fix: Upgrade to Jackson 2.15+ (com.fasterxml.jackson.*:jackson-*:2.15.2). This is a major version change requiring code updates but is essential for security.
- High · Outdated Jersey with Security Issues (build.gradle: com.sun.jersey:jersey-servlet:1.19). Jersey 1.19 (from 2014) lacks security patches for RESTful endpoint handling and has known vulnerabilities in HTTP request processing. Fix: Migrate to Jersey 2.35+ or consider modern alternatives such as Spring REST or Quarkus REST.
- High · Outdated Apache HttpClient (build.gradle: org.apache.httpcomponents:httpclient:4.3). HttpClient 4.3 (from 2014) contains known vulnerabilities in SSL/TLS handling and request validation. Current version is 4.5.x+. Fix: Update to HttpClient 4.5.13+ (latest 4.5.x) or the 5.x series. Verify compatibility with Jersey and other HTTP-dependent components.
- High · Outdated SLF4J and Log4j Configuration (build.gradle: org.slf4j:slf4j-api:1.7.2 and org.slf4j:slf4j-log4j12:1.6.1). SLF4J 1.6.1 (runtime) and 1.7.2 (compile) are outdated and mismatched. The slf4j-log4j12 binding targets Log4j 1.x, which is end-of-life; Log4Shell (CVE-2021-44228) itself affects Log4j 2.x, so it applies only if Log4j 2.x arrives as an indirect dependency. Fix: Update to SLF4J 2.0+ and ensure any Log4j 2.x is at 2.17.1+. Review all logging configurations for log injection risks.
- High · Outdated Guava Library (build.gradle: com.google.guava:guava:11.0.2). Guava 11.0.2 (from 2012) is severely outdated and may contain resource exhaustion and other vulnerabilities. Current version is 31+. Fix: Update to Guava 31.1+ (or the latest stable version). This is generally a safe dependency upgrade with good backward compatibility.
- High · Outdated JClouds Libraries (build.gradle: org.apache.jclouds.* dependencies, version 1.9.0). JClouds 1.9.0 (from 2016) lacks modern security features and has known vulnerabilities in cloud provider credential handling and API communication. Fix: Update to JClouds 2.4.0+, which includes improved security handling, modern TLS, and better credential management.
- High · Outdated Test Dependencies (build.gradle: org.testng:testng:6.3.1 and org.mock). TestNG 6.3.1 (from 2013) and Mockito 1.8.5 (from 2012) are extremely outdated and may have security issues. Mockito 1.x has known vulnerabilities in test setup.
LLM-derived; treat as a starting point, not a security audit.
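One way to check these observations against a real checkout is to list the coordinates that build.gradle actually pins and compare them with the versions named above. The helpers below are a sketch under assumptions: `list_pinned` and `is_pinned` are hypothetical names, and the pattern only recognises dependencies written as a quoted `group:artifact:version` string, so map-style or variable-driven declarations would be missed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every quoted group:artifact:version
# coordinate found in a Gradle build file, one per line.
list_pinned() {
  grep -oE "['\"][A-Za-z0-9_.-]+:[A-Za-z0-9_.-]+:[0-9][A-Za-z0-9_.-]*['\"]" "$1" \
    | tr -d "'\""
}

# Succeeds when the given coordinate is pinned exactly as written.
is_pinned() {
  list_pinned "$1" | grep -qxF "$2"
}

# Example:
#   is_pinned build.gradle "com.google.guava:guava:11.0.2" && echo "still on Guava 11.0.2"
```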
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.