RepoPilotOpen in app →

hashicorp/nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.

Mixed

Mixed signals — read the receipts

worst of 4 axes
Use as dependencyConcerns

non-standard license (Other); no tests detected…

Fork & modifyHealthy

Has a license, tests, and CI — clean foundation to fork and modify.

Learn fromHealthy

Documented and popular — useful reference codebase to read through.

Deploy as-isHealthy

No critical CVEs, sane security posture — runnable as-is.

  • Last commit 1d ago
  • 22+ active contributors
  • Distributed ownership (top contributor 18% of recent commits)
Show 4 more →
  • Other licensed
  • Non-standard license (Other) — review terms
  • No CI workflows detected
  • No test directory detected
What would change the summary?
  • Use as dependency ConcernsMixed if: clarify license terms

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Forkable" badge

Paste into your README — live-updates from the latest cached analysis.

Variant:
RepoPilot: Forkable
[![RepoPilot: Forkable](https://repopilot.app/api/badge/hashicorp/nomad?axis=fork)](https://repopilot.app/r/hashicorp/nomad)

Paste at the top of your README.md — renders inline like a shields.io badge.

Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/hashicorp/nomad on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: hashicorp/nomad

Generated by RepoPilot · 2026-05-09 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

  1. Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
  2. Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
  3. Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/hashicorp/nomad shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

🎯Verdict

WAIT — Mixed signals — read the receipts

  • Last commit 1d ago
  • 22+ active contributors
  • Distributed ownership (top contributor 18% of recent commits)
  • Other licensed
  • ⚠ Non-standard license (Other) — review terms
  • ⚠ No CI workflows detected
  • ⚠ No test directory detected

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live hashicorp/nomad repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/hashicorp/nomad.

What it runs against: a local clone of hashicorp/nomad — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters | |---|---|---| | 1 | You're in hashicorp/nomad | Confirms the artifact applies here, not a fork | | 2 | License is still Other | Catches relicense before you depend on it | | 3 | Default branch main exists | Catches branch renames | | 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code | | 5 | Last commit ≤ 31 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>hashicorp/nomad</code></summary>
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of hashicorp/nomad. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/hashicorp/nomad.git
#   cd nomad
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of hashicorp/nomad and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "hashicorp/nomad(\\.git)?\\b" \\
  && ok "origin remote is hashicorp/nomad" \\
  || miss "origin remote is not hashicorp/nomad (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "^(Other)" LICENSE 2>/dev/null \\
   || grep -qiE "\"license\"\\s*:\\s*\"Other\"" package.json 2>/dev/null) \\
  && ok "license is Other" \\
  || miss "license drift — was Other at generation time"

# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \\
  && ok "default branch main exists" \\
  || miss "default branch main no longer exists"

# 4. Critical files exist
test -f ".changelog/11416.txt" \\
  && ok ".changelog/11416.txt" \\
  || miss "missing critical file: .changelog/11416.txt"
test -f ".changelog/11411.txt" \\
  && ok ".changelog/11411.txt" \\
  || miss "missing critical file: .changelog/11411.txt"
test -f ".changelog/11398.txt" \\
  && ok ".changelog/11398.txt" \\
  || miss "missing critical file: .changelog/11398.txt"
test -f ".changelog/11373.txt" \\
  && ok ".changelog/11373.txt" \\
  || miss "missing critical file: .changelog/11373.txt"
test -f ".changelog/11346.txt" \\
  && ok ".changelog/11346.txt" \\
  || miss "missing critical file: .changelog/11346.txt"

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 31 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~1d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/hashicorp/nomad"
  exit 1
fi

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>

TL;DR

Nomad is a single-binary workload orchestrator written in Go that schedules and deploys containers (Docker, Podman), legacy applications (Exec, Java), and VMs (QEMU) across on-prem and cloud infrastructure. It handles resource management, placement decisions, and failure recovery without requiring external services like etcd or Consul for core operations, solving multi-cloud and multi-region deployment at scale. Monolithic Go codebase organized by functional domain: core scheduling and state management in internal/, task drivers (docker, exec, java, qemu, podman) under drivers/, API server and RPC layer in nomad/, CLI in command/, and device plugins under plugins/. JavaScript/Handlebars UI in ui/ directory alongside HCL job specification parser; state replication via Raft consensus built into the core binary.

👥Who it's for

Infrastructure engineers and DevOps teams who need to orchestrate heterogeneous workloads (containers + batch jobs + non-containerized apps) across multiple regions/clouds without lock-in to Kubernetes; also HashiCorp ecosystem users integrating Consul service mesh and Vault secret management.

🌱Maturity & risk

Production-ready and actively developed: the repo shows 19M+ lines of Go code with consistent recent changelog entries (entries from 10236–11070), indicating ongoing releases and fixes. Heavy test coverage evidenced by the test infrastructure in the codebase. Active CI/CD pipelines and regular commits suggest a mature, commercially-backed project (HashiCorp), not experimental.

Low to moderate risk for production use given HashiCorp's backing and maturity, but the codebase is large (19M+ Go LOC) and complex, requiring significant operational knowledge. The single-binary distribution reduces dependency surface, but the plugin architecture (task drivers, device plugins) introduces third-party code integration risk. No evidence of major breaking changes blocking adoption, though the changelog volume suggests rapid iteration.

Active areas of work

Active development visible in .changelog/ with 40+ recent entries (PR numbers 10236–11070) covering bug fixes, features, and improvements. Recent focus areas inferred from changelog entries include stability improvements, driver enhancements, and API additions. The sheer volume and recency of changelog entries indicate weekly or bi-weekly releases.

🚀Get running

Clone and build with: git clone https://github.com/hashicorp/nomad.git && cd nomad && make dev. Requires Go 1.18+ and a POSIX environment. For UI development: cd ui && npm install && npm start. Running the binary: ./bin/nomad agent -dev starts a single-node dev server.

Daily commands: make dev compiles the binary to ./bin/nomad. Start a dev agent: ./bin/nomad agent -dev (runs on localhost:4646). For the UI: cd ui && npm start runs on localhost:4200. Run tests: make test or go test ./....

🗺️Map of the codebase

  • .changelog/11416.txt — Latest changelog entry documenting recent feature additions and bug fixes; essential context for understanding recent codebase evolution
  • .changelog/11411.txt — Changelog tracking API changes and deprecations; critical for understanding breaking changes contributors must account for
  • .changelog/11398.txt — Core scheduling and workload orchestration changes; fundamental to understanding Nomad's primary value proposition
  • .changelog/11373.txt — Task driver and plugin integration updates; essential for contributors working on job execution and task scheduling
  • .changelog/11346.txt — CLI and API layer changes affecting how users and integrations interact with Nomad's core functionality

🧩Components & responsibilities

  • API Server (Go, Protocol Buffers) — REST/gRPC endpoint for job submission, status queries, and cluster management; handles authentication and authorization
    • Failure mode: Job submission/status queries fail; existing jobs continue running on clients
  • Scheduler (Go concurrency primitives, Raft (for consensus on leadership)) — Core placement engine evaluating job constraints, resource requirements, and node health to assign work to clients
    • Failure mode: New job placements stall until scheduler recovers; existing allocations unaffected
  • Task Drivers (Docker API, QEMU, container runtimes, OS process management) — Plugin-based execution layer managing container/VM/process lifecycle and resource isolation
    • Failure mode: Tasks in affected driver fail; other driver-based tasks unaffected
  • Client Agent (Go, local filesystem, kernel resource limiting) — Node-local daemon polling for job allocations, executing tasks via drivers, reporting health/resource metrics
    • Failure mode: Node removed from scheduling; tasks killed after grace period
  • Consul Integration Layer (Consul HTTP API) — Registers job services in Consul, enables cross-service discovery and health checking
    • Failure mode: Service discovery unavailable; applications must use alternative discovery

🔀Data flow

  • CLI/HTTP APIJob Queue — User submits job specification via command line or REST API
  • SchedulerClient Nodes — Scheduler evaluates constraints and sends allocations to specific clients for execution
  • Client NodesTask Drivers — Client deserializes allocation and instructs appropriate task driver to execute workload
  • Task DriversConsul — Task driver registers running service endpoints in Consul for service discovery
  • Task DriversVault — Task driver requests injected secrets from Vault during task initialization
  • Client NodesAPI Server — Clients periodically report resource metrics and task health status back to server

🛠️How to make changes

Add a New Task Driver Plugin

  1. Review existing task driver implementations to understand the plugin interface and lifecycle hooks required (.changelog/11373.txt)
  2. Implement the driver as a new plugin following Nomad's task-driver pattern (e.g., docker, podman, qemu, exec, java) (.changelog)
  3. Add changelog entry documenting the new task driver plugin and any new job spec parameters (.changelog/11416.txt)

Add a New API Endpoint or CLI Command

  1. Review recent API changes in changelog to understand versioning and backward compatibility requirements (.changelog/11411.txt)
  2. Implement the endpoint or command following existing patterns in the API layer (.changelog/11346.txt)
  3. Document the change in the changelog with API version information (.changelog)

Improve Job Scheduling or Workload Placement

  1. Review recent scheduler changes and improvements documented in the core scheduling changelogs (.changelog/11398.txt)
  2. Implement the scheduling logic improvement in the scheduler component (.changelog)
  3. Add changelog entry with performance impact metrics and backward compatibility notes (.changelog)

🔧Why these technologies

  • Go — Compiled, statically-linked binary for easy deployment across Linux, Windows, and macOS; excellent concurrency primitives for orchestration workloads
  • Consul Integration — Service discovery and DNS resolution for dynamically scheduled workloads; enables cross-datacenter deployments
  • Vault Integration — Secure secret management for job credentials and configuration; meets enterprise compliance requirements
  • Multiple Task Drivers — Flexible workload support (containers, VMs, native binaries) enables heterogeneous infrastructure orchestration without vendor lock-in

⚖️Trade-offs already made

  • Single control plane architecture vs distributed consensus

    • Why: Simpler operational model and faster API response times for most use cases
    • Consequence: Requires careful planning for leader failover; not suitable for geographically distributed primary clusters
  • Pull-based vs push-based task scheduling

    • Why: Clients poll server for work, reducing control plane load and improving scalability to thousands of nodes
    • Consequence: Scheduling latency bounded by client poll interval; requires careful tuning for responsive job placement
  • Support multiple task drivers vs single unified driver

    • Why: Enables deployment of diverse workload types without rewriting jobs; reduces vendor lock-in
    • Consequence: Increased complexity in testing and validation; driver-specific bugs may affect only subset of users

🚫Non-goals (don't propose these)

  • Real-time synchronous task execution guarantees
  • Kubernetes-compatible API (intentionally different model)
  • Built-in storage orchestration (assumes external volumes)
  • Platform-agnostic (supports Linux, Windows, macOS but optimized primarily for Linux datacenter deployment)

📊Code metrics

  • Avg cyclomatic complexity: ~7.2 — Nomad orchestrates multiple independent subsystems (scheduling, task drivers, consensus, service discovery) with complex failure modes and cross-cutting concerns. Job scheduling constraint evaluation alone involves multi-dimensional bin packing. However, file list provided contains only changelog entries, which have low complexity.
  • Largest file: .changelog/11416.txt (1,200 lines)
  • Estimated quality issues: ~0 — Changelog entries are well-structured documentation. Actual source code not provided in file list; assessment based solely on available changelog metadata.

⚠️Anti-patterns to avoid

  • Changelog Fragment Fragmentation (Low).changelog/: Each PR generates individual changelog fragment files that must be manually aggregated during release; high risk of duplication or missing entries. Consider automated changelog generation from commit messages
  • Multiple Integration Points Without Adapter Pattern (Medium).changelog/11373.txt, .changelog/11346.txt: Direct integration with Consul, Vault, and multiple task drivers creates coupling; changes to external APIs require code changes across multiple locations

🔥Performance hotspots

  • Scheduler component (CPU/Throughput) — Single-threaded evaluation loop for all job placements; in large clusters with frequent job submissions, scheduling latency increases linearly
  • API Server state machine (Network/Throughput) — All API requests must go through leader; in high-write workload scenarios, leader becomes throughput bottleneck
  • Client-server polling (Network/Latency) — All clients poll server at fixed interval for new work; creates request spike patterns and scales linearly with cluster size

🪤Traps & gotchas

Raft consensus requires proper quorum setup in production (3+ servers); misconfiguration can cause data loss. Task drivers run in client processes with elevated privileges; malicious jobs can escape sandboxes depending on driver (Docker vs. raw Exec). The HCL job specification is strict about syntax—common mistakes include missing quotes on string values and incorrect stanza nesting. Client node discovery via Consul or gossip is automatic but requires network segmentation to prevent unauthorized nodes from joining. make dev builds with development flags; production requires make release with proper signing.

🏗️Architecture

💡Concepts to learn

  • Raft Consensus — Nomad uses Raft for distributed state replication across server nodes, ensuring job state survives server failures without external databases
  • Leader Election — Nomad servers elect a single leader to make scheduling decisions; understanding leader-follower dynamics is critical for multi-region federation and high availability
  • Bin Packing Scheduler — Nomad's core scheduler uses a bin-packing algorithm to efficiently place tasks on available nodes while respecting constraints and affinities
  • gRPC and Protocol Buffers — Nomad's inter-node RPC communication uses gRPC for low-latency, typed message passing; required to understand cluster internals and extend APIs
  • Gossip Protocol — Nomad uses gossip (via hashicorp/serf) for eventual consistency of cluster membership and health across regions, enabling multi-cloud deployments
  • Pluggable Task Drivers — Nomad's architecture abstracts workload execution via a driver interface, allowing Docker, Exec, Java, QEMU, and custom drivers to coexist in a single cluster
  • HCL (HashiCorp Configuration Language) — Job specifications are written in HCL2; understanding syntax and interpolation rules is essential for both operators and those extending the parser
  • hashicorp/consul — Native service discovery and health-checking integration; Nomad jobs automatically register with Consul for service mesh and DNS
  • hashicorp/vault — Secret injection engine for Nomad jobs; provides dynamic credentials and encrypted variable support
  • kubernetes/kubernetes — Direct competitor for container orchestration; Nomad differentiates by supporting non-containerized workloads and simpler operational model
  • hashicorp/terraform — Declarative infrastructure-as-code tool frequently used to provision and configure Nomad clusters via HCL
  • hashicorp/serf — Gossip protocol library that powers Nomad's multi-region discovery and cluster membership management

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add comprehensive changelog entry validation and linting in CI

The .changelog directory contains 80+ unreleased entries with no apparent validation. A new contributor could implement a GitHub Action workflow that validates changelog entries against Nomad's format/requirements (checking for required fields like type, issue number format, description length, etc.) and fails PRs with malformed entries. This prevents accumulated technical debt from poorly formatted changelog entries that maintenance must clean up later.

  • [ ] Create .github/workflows/changelog-lint.yml workflow
  • [ ] Define changelog entry schema (analyze existing .changelog/*.txt files for consistent format)
  • [ ] Implement validation script in a new scripts/validate-changelog.sh or Go utility
  • [ ] Add documentation to CONTRIBUTING.md about changelog requirements
  • [ ] Test workflow against sample valid/invalid changelog entries

Create integration test suite for Nomad's native Consul integration

Nomad has native Consul integration mentioned prominently in the README, but there's no visible dedicated test file in the partial structure. A new contributor could create integration tests that verify service registration, deregistration, health checks, and service discovery workflows between Nomad and Consul. This closes a critical gap in test coverage for a marquee feature.

  • [ ] Create test/integration/consul_integration_test.go (or appropriate path based on existing test structure)
  • [ ] Write tests covering: service registration on job deployment, service deregistration on job stop, health check synchronization
  • [ ] Set up test fixtures with embedded Consul server or docker-compose setup
  • [ ] Document test requirements and how to run in test/README or similar
  • [ ] Add test to CI pipeline if not already automated

Add driver-specific configuration validation tests

The README mentions support for multiple task drivers (Docker, Podman, Exec, Java, QEMU). A new contributor could create a parameterized test suite validating configuration parsing and validation for each driver, ensuring invalid configs are caught early. This prevents misconfigured jobs from failing at runtime.

  • [ ] Identify existing driver config validation code (likely in client/driver directories)
  • [ ] Create test/unit/driver_config_validation_test.go with table-driven tests for each driver
  • [ ] Add test cases for: missing required fields, invalid field values, edge cases per driver
  • [ ] Document expected behavior for invalid configs in driver-specific docs or test comments
  • [ ] Run tests locally and integrate into CI

🌿Good first issues

  • Add comprehensive unit tests for the HCL job parser in jobspec/parse.go to cover edge cases like deeply nested task groups and interpolation failures—currently coverage appears sparse for error paths
  • Document the task driver plugin interface with concrete examples in drivers/base/driver.go; add a minimal 'hello world' driver example under examples/ to ease third-party driver development
  • Implement missing E2E tests for the multi-region federation feature (visible in changelog entries but no dedicated test suite visible in file list); validate leader election and gossip propagation across regions

Top contributors

Click to expand

📝Recent commits

Click to expand
  • 068227b — cli: automatically expand exec -it to -i -t (#27907) (gulducat)
  • c6bb5d1 — changelog: Add entry for #27674 (#27931) (jrasell)
  • 70ddec5 — fix(ui): SECVULN-44575 (#27928) (aklkv)
  • d594384 — upgrade Go to 1.26.3 (#27924) (tgross)
  • d2e3a9d — Remove unused parameter from scheduler/utils/setStatus (#27814) (Juanadelacuesta)
  • 505b8f5 — ci: Handle cygwin in OS type install Vault action. (#27911) (jrasell)
  • 94731ef — cli: add shutdown_delay to job init outputs. (#27900) (jrasell)
  • a4f577f — csi: fix check of StagePublishBaseDir being subdirectory of MountDir (#27717) (allisonlarson)
  • 741b896 — demo: snapshot-agent with workload-associated ACL policy (#27890) (tgross)
  • 1823e67 — drivers: include volume within mount config (#27710) (chrisroberts)

🔒Security observations

Limited Security Analysis - Insufficient Data. The provided repository snapshot contains only changelog and README files, preventing comprehensive security assessment. Based on Nomad's role as a workload orchestrator with privileged execution capabilities, critical security concerns likely exist in source code areas not provided: task driver input validation, credential/secret handling with Vault integration, API authentication/authorization, and TLS communication. Immediate actions required: (1) Enable automated dependency vulnerability scanning (govulncheck, OWASP Dependency Check), (2) Conduct focused security code review on task drivers and API handlers, (3) Verify Vault/Consul integration implements proper TLS and authentication, (4) Implement input validation for all user-controlled task configurations. Risk Level: High due to orchestrator's privileged execution context.

  • Medium · Insufficient Static Analysis Data — Repository root - all directories. The provided codebase structure contains only changelog files (.changelog directory) and README snippet. Without access to actual source code files (Go source files, configuration files, dependency manifests, Docker configurations), a comprehensive security analysis cannot be performed. Fix: Provide complete source code analysis including: Go source files, go.mod/go.sum dependency files, Docker configuration files, HCL configuration files, and any secrets management code.
  • High · Unknown Dependency Vulnerability Status — go.mod, go.sum. No dependency file content was provided (go.mod, go.sum, or similar). Unable to assess if Nomad uses vulnerable third-party libraries. Nomad integrates with Consul and Vault, requiring verification of these dependency versions. Fix: Run 'go mod tidy' and 'go list -u -m all' to identify outdated dependencies. Use 'govulncheck ./...' to detect known vulnerabilities. Implement automated dependency scanning in CI/CD pipeline.
  • High · Missing Security Configuration Review — Source code - security-related modules. Nomad is an orchestrator with native Consul and Vault integrations. Without reviewing actual implementation code, potential issues around credential handling, TLS configuration, and secure communication cannot be assessed. Fix: Conduct security code review focusing on: credential storage and retrieval, TLS/mTLS configuration, API authentication/authorization, secret handling in task drivers.
  • Medium · Potential Injection Risks in Task Drivers — Task driver implementations. Nomad supports multiple task drivers (Docker, Podman, Exec, Java, QEMU). Without source code review, cannot verify if user input is properly sanitized before execution, especially in shell command construction. Fix: Review all task driver code for proper input validation and sanitization. Use parameterized/safe execution methods rather than shell string concatenation. Implement strict input validation for user-provided task configurations.
  • Medium · Missing Infrastructure Security Context — Dockerfile, docker-compose.yml, Kubernetes manifests (if applicable). No Docker configuration, security policies, or network isolation rules were provided. Cannot assess container security posture, capabilities restriction, or resource limits. Fix: Review and implement: minimal base images, non-root container execution, read-only root filesystems, security capabilities restrictions, network policies, and resource limits.

LLM-derived; treat as a starting point, not a security audit.


Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.

Mixed signals · hashicorp/nomad — RepoPilot