rancher/os
Tiny Linux distro that runs the entire OS as Docker containers
Healthy across all four use cases
Permissive license, no critical CVEs, actively maintained; safe to depend on. (Weakest axis.)
Has a license, tests, and CI — clean foundation to fork and modify.
Documented and popular — useful reference codebase to read through.
No critical CVEs, sane security posture — runnable as-is.
- ✓ 15 active contributors
- ✓ Apache-2.0 licensed
- ✓ CI configured
- ✓ Tests present
- ⚠ Stale: last commit 2y ago
- ⚠ Concentrated ownership: top contributor handles 66% of recent commits
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Onboarding doc
Onboarding: rancher/os
Generated by RepoPilot · 2026-05-09 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in "Verify before trusting" below. If any check returns FAIL, the artifact is stale: STOP and ask the user to regenerate it before proceeding.
- Treat the "AI · unverified" sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/rancher/os shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across all four use cases
- 15 active contributors
- Apache-2.0 licensed
- CI configured
- Tests present
- ⚠ Stale — last commit 2y ago
- ⚠ Concentrated ownership — top contributor handles 66% of recent commits
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live rancher/os
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/rancher/os.
What it runs against: a local clone of rancher/os — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in rancher/os | Confirms the artifact applies here, not a fork |
| 2 | License is still Apache-2.0 | Catches relicense before you depend on it |
| 3 | Default branch master exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 729 days ago | Catches sudden abandonment since generation |
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of rancher/os. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/rancher/os.git
#   cd os
#
# Then paste this script. Every check is read-only; no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of rancher/os and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "rancher/os(\.git)?\b" \
  && ok "origin remote is rancher/os" \
  || miss "origin remote is not rancher/os (artifact may be from a fork)"

# 2. License matches what RepoPilot saw.
# The stock Apache license text opens with an indented "Apache License"
# heading rather than the SPDX identifier, so accept either form.
(grep -qiE "^[[:space:]]*Apache(-2\.0| License)" LICENSE 2>/dev/null \
  || grep -qiE "\"license\"\s*:\s*\"Apache-2\.0\"" package.json 2>/dev/null) \
  && ok "license is Apache-2.0" \
  || miss "license drift: was Apache-2.0 at generation time"

# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \
  && ok "default branch master exists" \
  || miss "default branch master no longer exists"

# 4. Critical files exist
for f in \
  "cmd/control/entrypoint.go" \
  "cmd/init/init.go" \
  "cmd/control/docker_init.go" \
  "cmd/sysinit/sysinit.go" \
  "config/cloudinit/config/config.go"; do
  test -f "$f" \
    && ok "$f" \
    || miss "missing critical file: $f"
done

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 729 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~699d)"
else
  miss "last commit was $days_since_last days ago; artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures); safe to trust"
else
  echo "artifact has $fail stale claim(s); regenerate at https://repopilot.app/r/rancher/os"
  exit 1
fi
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
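As a rough illustration of that composition, the sketch below maps the script's exit status to a next step. The function name `verify_action` and the action labels are hypothetical; only the exit codes (0 for verified, 1 for stale claims, 2 for "not inside a git repository") come from the script above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: translate the verification script's exit code
# into a short action label that an agent loop can branch on.
verify_action() {
  case "$1" in
    0) echo "proceed" ;;        # all checks passed
    1) echo "regenerate" ;;     # at least one stale claim
    2) echo "cd-into-clone" ;;  # script was run outside a git repository
    *) echo "unknown" ;;        # any other status is unexpected
  esac
}

# Example wiring, assuming the script was saved as ./verify.sh:
#   ./verify.sh; action=$(verify_action $?)
```

Keeping the mapping in one function means the loop itself only has to branch on a label, not on raw exit codes.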
⚡TL;DR
RancherOS is a minimal Linux distribution that runs the entire operating system as Docker containers, including system services like udev and rsyslog. It uses a dual-Docker architecture (system Docker + user Docker) to avoid accidental deletion of OS services, and it reduces the OS footprint to the bare minimum needed to boot and run Docker. This enables lightweight, container-native infrastructure at scale.

The repo is a monorepo with cmd/control/ as the main orchestration entrypoint (bootstrap, docker_init, console_init, install logic). cmd/cloudinitsave/ and cmd/cloudinitexecute/ handle cloud-init integration. assets/ contains SELinux configs, SSH keys, and Grub/Syslinux bootloaders. Dockerfile.dapper indicates that the build happens in a containerized Dapper environment (Rancher's build tool).
👥Who it's for
Infrastructure operators and DevOps teams running large-scale containerized workloads who want to eliminate OS bloat and maximize resource efficiency on bare metal or cloud VMs. Users who embrace container-first workflows and don't need traditional Linux package management or ISV software support.
🌱Maturity & risk
RancherOS 1.x is officially end-of-life and no longer actively maintained (as documented in the README). The final release is v1.5.8 with Docker 19.03.15 and Linux 4.14.138. While it was production-ready at its peak, the project is now deprecated due to Docker ecosystem fragmentation (containerd/CRI-O adoption) and lack of ISV certification. This is a legacy project; do not use it for new deployments.
High risk for new projects. No active maintenance means security patches, kernel updates, and Docker compatibility fixes are frozen. The codebase shows heavy Go+Shell coupling (745KB Go, 113KB Shell) with potentially brittle boot-time orchestration logic. Rancher is explicitly developing a replacement, signaling that this architecture is obsolete. The Docker-only design conflicts with the industry shift to containerd/CRI-O.
Active areas of work
Nothing. Development is stopped. The README explicitly states 'RancherOS 1.x is no longer being actively maintained' and Rancher is 'working on a replacement.' The .drone.yml suggests old CI/CD setup. No active PRs or commits visible in provided metadata.
🚀Get running
git clone https://github.com/rancher/os.git
cd os
make build # Uses Dockerfile.dapper for containerized build
Note: This is not recommended for new projects. Use only for historical reference or maintenance of existing RancherOS 1.x deployments.
Daily commands:
RancherOS doesn't run as a development server; it's a full OS distribution. To build: make build (containerized via Dapper). To test locally: download ISO from releases.rancher.com and boot in QEMU/VirtualBox/KVM. The binary release includes kernel (vmlinuz), initrd, and ISO images for various platforms (VMware, Hyper-V, 4G LTE, AWS).
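For the QEMU route, a minimal sketch of a boot command is shown below. It only prints the command line rather than running it. The ISO file name `rancheros.iso` and the 2048 MB memory size are assumptions for illustration; `-m`, `-boot d`, and `-cdrom` are standard QEMU options.

```shell
#!/usr/bin/env bash
# Print (do not run) a QEMU command line that boots from a downloaded ISO.
# Both defaults are illustrative assumptions, not project requirements.
qemu_boot_cmd() {
  local iso="${1:-rancheros.iso}"   # path to the ISO image
  local mem_mb="${2:-2048}"         # guest memory in MB
  printf 'qemu-system-x86_64 -m %s -boot d -cdrom %s\n' "$mem_mb" "$iso"
}

qemu_boot_cmd rancheros.iso
```

Printing the command first makes it easy to review or adjust before pasting it into a terminal on a machine that has QEMU installed.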
🗺️Map of the codebase
- cmd/control/entrypoint.go: Primary entry point for the RancherOS control daemon; orchestrates boot sequence, cloud-init, and system initialization.
- cmd/init/init.go: Core init system that mounts filesystems, configures networking, and spawns the control daemon as the first user process.
- cmd/control/docker_init.go: Bootstraps the Docker daemon and system containers; critical for understanding how RancherOS runs the entire OS as Docker containers.
- cmd/sysinit/sysinit.go: System initialization module that loads kernel modules, configures hardware, and prepares the container runtime environment.
- config/cloudinit/config/config.go: Cloud-init configuration parser; handles user-data, network setup, and system provisioning directives.
- cmd/control/install/install.go: OS installation logic; manages bootloader setup, filesystem layout, and persistent system deployment.
- Dockerfile.dapper: Build environment definition; demonstrates how RancherOS is containerized and compiled using Dapper.
🛠️How to make changes
Add a new system service
- Create a new Go package under cmd/ (e.g., cmd/myservice/myservice.go) with a Main() function (cmd/myservice/myservice.go)
- Register the service in cmd/control/entrypoint.go by adding it to the service initialization loop (cmd/control/entrypoint.go)
- Define the service spec in config YAML (cloud-config format) handled by config/cloudinit/config/config.go (config/cloudinit/config/config.go)
- Add service container image and environment to cmd/control/service/service.go for Docker container spawning (cmd/control/service/service.go)
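The steps above mention a service spec in cloud-config format. As a sketch only, the helper below prints a cloud-config document with one entry under `rancher.services`, the layout commonly used in RancherOS cloud-config files for extra services. The service name `myservice` and the image `example/myservice:latest` are hypothetical placeholders, and the exact keys should be checked against config/cloudinit/config/config.go before relying on them.

```shell
#!/usr/bin/env bash
# Emit a minimal cloud-config document declaring one extra service.
# The name and image passed in are placeholders chosen by the caller.
render_service_config() {
  local name="$1" image="$2"
  cat <<EOF
#cloud-config
rancher:
  services:
    ${name}:
      image: ${image}
      restart: always
EOF
}

render_service_config myservice example/myservice:latest
```

Generating the YAML from a function keeps indentation consistent, which matters because YAML nesting is whitespace-sensitive.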
Add a new cloud-init configuration directive
- Create a new config module (e.g., config/cloudinit/config/myfeature.go) with parsing logic (config/cloudinit/config/myfeature.go)
- Register the parser in config/cloudinit/config/config.go in the main Parse() function (config/cloudinit/config/config.go)
- Implement execution logic in cmd/cloudinitexecute/cloudinitexecute.go to apply the configuration (cmd/cloudinitexecute/cloudinitexecute.go)
- Add validation rules in config/cloudinit/config/validate/ if the directive requires schema validation (config/cloudinit/config/validate/node.go)
Add support for a new bootloader
- Create a new bootloader module (e.g., cmd/control/install/mybootloader.go) following the grub.go and syslinux.go patterns (cmd/control/install/mybootloader.go)
- Register the bootloader in cmd/control/install/install.go's Install() function with OS/firmware detection (cmd/control/install/install.go)
- Implement Write() and Update() methods to generate bootloader config files on disk (cmd/control/install/install.go)
Add a new control CLI command
- Create a new handler function in cmd/control/ (e.g., cmd/control/mycommand.go) matching existing command patterns (cmd/control/mycommand.go)
- Register the command in cmd/control/cli.go's main command switch or subcommand parser (cmd/control/cli.go)
- Add service interaction logic using cmd/control/engine.go if the command needs to manage Docker containers (cmd/control/engine.go)
🔧Why these technologies
- Go — Compiles to small, fast binaries suitable for minimal OS init and system daemons; excellent concurrency for managing multiple containers
- Docker — Core design principle: run the entire OS (rootfs, system services, console) as Docker containers for immutability and reproducibility
- Cloud-Init — Industry-standard provisioning tool; enables automated system configuration, SSH key injection, and user-data processing across cloud providers
- GRUB/SYSLINUX — Bootloader support for persistent installation and multi-boot scenarios; GRUB for UEFI/legacy, SYSLINUX for ultra-minimal setups
- systemd units (via cloud-config) — Familiar service definition format for users; integrates with existing ecosystem tooling
⚖️Trade-offs already made
- Entire OS as Docker containers (no traditional package manager)
  - Why: Simplicity, immutability, and reproducibility; every component is a container with known dependencies
  - Consequence: Cannot run ISV software designed for general-purpose Linux distros; limited to containerized workloads; no ad-hoc package installation
- Minimal kernel + init (cmd/init) + control daemon (cmd/control)
  - Why: Reduce attack surface and boot time; avoid traditional init systems (systemd) overhead
  - Consequence: Custom init logic increases maintenance burden; less feature-rich than systemd but smaller footprint
- Docker-only container runtime (no containerd/CRI-O support)
  - Why: RancherOS 1.x designed exclusively around Docker Engine
  - Consequence: Cannot pivot to modern container runtimes without major architectural changes; product deprecated in favor of next-gen replacement
- Cloud-init for provisioning rather than custom configuration format
  - Why: Leverage existing ecosystem, cloud provider integrations, and user familiarity
  - Consequence: Tight coupling to cloud-init parser; some features may be overkill for a minimal OS
🚫Non-goals (don't propose these)
- General-purpose Linux distribution—RancherOS is purpose-built for container workloads only
- Traditional package management (apt, yum, pacman)—all software is containerized
- Support for non-containerized applications or ISV software not certified for containers
- Real-time operating system capabilities—no PREEMPT_RT or hard real-time scheduling
- Compatibility with systemd ecosystem expectations
🪤Traps & gotchas
Critical non-obvious issues:
1. Dual-Docker architecture requires careful initialization order: system Docker must come up cleanly before launching the user Docker container, or the entire boot fails silently.
2. Cloud-init integration is non-standard; it uses cloudinitsave as an intermediate step rather than standard cloud-init hooks.
3. SELinux policies in assets/selinux/ must match the kernel version exactly or containers won't start.
4. Bootloader logic (Grub vs Syslinux) varies by platform; the wrong choice in install.go causes unbootable images.
5. No systemd: all service management happens via Docker container orchestration in cmd/control/service/, which is unfamiliar to most Linux engineers.
6. Requires specific Docker versions; Docker 19.03.15 is pinned in v1.5.8, and newer Docker may break compatibility.
7. SSH config (assets/scripts_ssh_config) is statically baked; dynamic SSH keys require cloud-init.
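When debugging the dual-Docker setup, a quick first step is to see which of the two Docker CLIs are reachable. The sketch below assumes the usual RancherOS convention that system services are inspected with `system-docker` and user workloads with `docker`; the function name `docker_cli_status` is hypothetical, and on a non-RancherOS host one or both CLIs may simply be absent.

```shell
#!/usr/bin/env bash
# Report, one line per CLI, whether system-docker and docker are on PATH.
# Read-only: it only looks up the commands and never talks to a daemon.
docker_cli_status() {
  local cli
  for cli in system-docker docker; do
    if command -v "$cli" >/dev/null 2>&1; then
      echo "$cli: available"
    else
      echo "$cli: not found"
    fi
  done
}

docker_cli_status
```

If both are available, running `ps` through each CLI separately shows the system containers and the user containers as two distinct sets.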
🏗️Architecture
💡Concepts to learn
- Dual-Docker architecture (system Docker + user Docker) — This is RancherOS's core innovation—separating system services from user workloads prevents catastrophic OS deletion; understanding this separation is essential to modifying any boot logic
- Container-as-init-system (PID 1 is a container) — RancherOS replaces traditional init (systemd/sysvinit) with Docker itself; this requires careful understanding of signal handling, zombie reaping, and container lifecycle in cmd/control/
- Immutable infrastructure (read-only root filesystem concept) — RancherOS pushes toward immutability—OS changes via image rebuild, not package install; this design philosophy affects how config and updates work in cmd/control/install/
- Cloud-init (VM initialization standard) — RancherOS integrates cloud-init for AWS/Azure/GCP provisioning; cmd/cloudinitsave/ and cmd/cloudinitexecute/ implement non-standard cloud-init hooks that differ from typical Linux distros
- Bootloader abstraction (Grub vs Syslinux) — cmd/control/install/ must support multiple bootloaders for different platforms (VMware, Hyper-V, BIOS); incorrect bootloader choice in grub.go or syslinux.go results in unbootable systems
- SELinux mandatory access control (MAC) — assets/selinux/ defines custom SELinux policies for container-native OS; understanding MAC contexts is critical for hardened deployments and debugging permission denials
- Linux kernel udev subsystem — RancherOS runs udev as a container service (cmd/control/udevsettle.go); understanding udev is essential for device hotplug and hardware initialization
🔗Related repos
- rancher/k3os: Spiritual successor to RancherOS; Rancher's lightweight OS for Kubernetes instead of Docker, addresses the container runtime evolution problem
- opencontainers/runc: Low-level container runtime that RancherOS relies on; understanding runc is essential for debugging container init failures
- moby/moby: Docker Engine source; RancherOS tightly couples to specific Docker versions, making Docker internals relevant to troubleshooting
- rancher/rancher: Rancher Kubernetes platform that originally managed RancherOS clusters; understanding Rancher's needs clarifies why RancherOS was designed this way
- flatcar/flatcar-linux: Modern minimal Linux distro for containers (successor to CoreOS) that addresses similar use cases but with better maintenance and broader container runtime support
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add unit tests for cmd/control/config.go and config_test.go expansion
The repo has cmd/control/config_test.go but it's likely incomplete given the complexity of RancherOS's configuration system. The config.go file handles critical OS configuration logic. Adding comprehensive table-driven tests for config parsing, validation, and merging would catch regressions in this high-risk area.
- [ ] Review existing tests in cmd/control/config_test.go to identify gaps
- [ ] Add test cases for edge cases in config parsing (malformed YAML, missing required fields, type mismatches)
- [ ] Add tests for config merging logic when multiple sources exist
- [ ] Add tests for cloud-init config loading in cmd/cloudinitsave/cloudinitsave.go integration
- [ ] Ensure tests cover the config struct fields used by bootstrap.go and entrypoint.go
Add GitHub Actions CI workflow for multi-platform Docker builds
The repo has .drone.yml for CI but lacks modern GitHub Actions workflows. Given this is a Docker-based distro with Dockerfile.dapper, adding a GitHub Actions workflow to build and test the Dockerfile on both amd64 and arm64 platforms would improve visibility, reduce CI dependency on external services, and catch platform-specific build issues.
- [ ] Create .github/workflows/docker-build.yml
- [ ] Configure matrix strategy to build on ubuntu-latest for amd64 and add arm64 via QEMU or native runners
- [ ] Add build-push step that validates the Dockerfile.dapper builds successfully
- [ ] Add optional push to GitHub Container Registry (ghcr.io) on main branch
- [ ] Reference the maintenance status in workflow documentation comments
Add integration tests for cmd/control service command execution flow
The service subsystem (cmd/control/service/) handles critical application lifecycle management. Currently there are no visible tests for the command execution, service app loading, or the integration between service.go and app.go. This is high-risk code that deserves test coverage to prevent regressions when the project potentially transitions to other container runtimes.
- [ ] Create cmd/control/service/service_test.go with mock implementations of the app interface
- [ ] Add tests for service registration, lookup, and execution flows in cmd/control/service/service.go
- [ ] Add tests for command parsing and routing in cmd/control/service/command/command.go
- [ ] Add tests for app lifecycle (start, stop, restart) in cmd/control/service/app/app.go
- [ ] Test error handling for missing or malformed service configurations
🌿Good first issues
- Add comprehensive unit tests to cmd/control/config_test.go: currently only one test file exists in cmd/control/ despite 40+ Go files; configuration parsing is critical and under-tested.
- Document the dual-Docker initialization sequence in ARCHITECTURE.md with sequence diagrams showing entrypoint.go → docker_init.go → console_init.go transitions; currently undocumented complexity.
- Audit and document all environment variables required at boot time in cmd/control/env.go and create a troubleshooting guide; currently scattered across multiple files with no centralized reference.
⭐Top contributors
- @niusmallnan: 66 commits
- @rancher-sy-bot: 10 commits
- @dweomer: 6 commits
- Morten Møller Riis: 3 commits
- Craig Jellick: 2 commits
📝Recent commits
- 36529b5 Update README.md (#3067) (Craig Jellick)
- 397e2df Update README.md (#3066) (Craig Jellick)
- fcdc9ca Merge pull request #3049 from dweomer/master-v1.5.8-ami (dweomer)
- 719722a v1.5.8: bump AMIs (dweomer)
- 628e6ab Merge pull request #3038 from dweomer/v1.5.7-amis-on-master (dweomer)
- 4b9e8ac v1.5.7: bump AMIs (dweomer)
- b3d0b29 Merge pull request #3015 from dweomer/update-readme-v1.5.6 (dweomer)
- 93431c9 readme: update for v1.5.6 (dweomer)
- 2ec7f4f Bump kernel 4.14.176-rancher (niusmallnan)
- a58d390 Bump up user docker 19.03.8 (jokay)
🔒Security observations
- High · Hardcoded SSH Private Key in Repository (assets/rancher.key). The file 'assets/rancher.key' appears to be a private SSH key committed to the repository. This is a critical security risk as it exposes the private key to anyone with access to the repository, including potential attackers. Private keys should never be committed to version control. Fix: 1. Immediately revoke the exposed private key. 2. Remove the file from git history using 'git filter-branch' or 'BFG Repo-Cleaner'. 3. Add '.key' and 'rancher.key' to .gitignore. 4. Generate new SSH keypairs and distribute securely. 5. Implement pre-commit hooks to prevent accidental commits of sensitive files.
- High · Maintenance Status and Unpatched Dependencies (README.md, config/cloudinit/). RancherOS 1.x is explicitly noted as 'no longer being actively maintained' in the README. This means security vulnerabilities discovered in dependencies will not be patched, creating long-term security risks. The codebase contains vendored/embedded dependencies (e.g., config/cloudinit/) that may contain known vulnerabilities. Fix: 1. Upgrade to the replacement product when available. 2. For existing systems, implement strict network access controls and isolation. 3. Regularly audit dependencies (particularly cloudinit) for known CVEs. 4. Consider using an SBOM tool to track all components. 5. Plan an immediate migration strategy away from RancherOS 1.x.
- Medium · Embedded Cloud-Init Configuration (config/cloudinit/). The codebase includes an embedded/vendored version of cloud-init (config/cloudinit/). Cloud-init has historically had multiple security vulnerabilities. Using an embedded version may delay security updates and patch deployment. Fix: 1. Audit the version of cloud-init included for known CVEs. 2. If possible, use the system-provided cloud-init package instead of the embedded version. 3. Implement a process to regularly update embedded dependencies. 4. Monitor cloud-init security advisories closely.
- Medium · Potential SSH Key Authorization Vulnerability (cmd/cloudinitexecute/authorize_ssh_keys.go). The file 'cmd/cloudinitexecute/authorize_ssh_keys.go' handles SSH key authorization. Without reviewing the implementation, there is a risk of improper validation, injection attacks, or privilege escalation if SSH keys are not properly validated before authorization. Fix: 1. Review the implementation for proper input validation. 2. Ensure SSH keys are cryptographically validated. 3. Implement strict allowlisting of authorized key formats. 4. Add comprehensive logging and audit trails for SSH key operations. 5. Consider using a dedicated SSH key management library.
- Medium · TLS Configuration Handling (cmd/control/tlsconf.go). The file 'cmd/control/tlsconf.go' handles TLS configuration. Improper TLS configuration could lead to man-in-the-middle attacks, weak cipher suites, or certificate validation bypass. Fix: 1. Verify TLS version is 1.2 or higher (preferably 1.3). 2. Ensure strong cipher suites are used. 3. Implement certificate pinning where applicable. 4. Validate certificate chains properly. 5. Use Go's crypto/tls best practices. 6. Add security tests for TLS configuration.
- Medium · Docker Initialization Security (cmd/control/docker_init.go). The file 'cmd/control/docker_init.go' handles Docker daemon initialization. Improper Docker socket permissions, daemon configuration, or authentication could lead to unauthorized container execution or system compromise. Fix: 1. Ensure the Docker socket is only accessible to authorized users. 2. Implement proper authentication/authorization. 3. Run the Docker daemon with minimal required privileges. 4. Validate all configuration parameters. 5. Implement audit logging for Docker operations.
- Low · SELinux Configuration Included (assets/selinux/). While SELinux configuration files are present (assets/selinux/), their proper enforcement status and effectiveness is unclear. If not properly configured, SELinux may provide a false sense of security. Fix: 1. Verify SELinux is running in 'enfor
LLM-derived; treat as a starting point, not a security audit.
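Since these findings are unverified, it helps to confirm what a flagged file actually contains before acting. The sketch below lists files under a directory that contain a PEM private-key header. The function name `find_private_keys` is hypothetical, and the pattern only covers PEM-style "BEGIN ... PRIVATE KEY" headers, so an empty result does not prove that no secrets are present.

```shell
#!/usr/bin/env bash
# List files under a directory that contain a PEM private-key header
# (for example "BEGIN RSA PRIVATE KEY" or "BEGIN OPENSSH PRIVATE KEY").
# Read-only: prints matching paths, or nothing if there are none.
find_private_keys() {
  local dir="${1:-.}"
  # -e is required because the pattern itself starts with a dash.
  grep -rlE -e '-----BEGIN ([A-Z]+ )?PRIVATE KEY-----' "$dir" 2>/dev/null
}

# Example, run from the root of a clone:
#   find_private_keys assets
```

A public key or certificate will not match this pattern, which makes it a cheap way to tell a genuine private key apart from other key-like files.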
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.