rui314/mold

Item: rui314/mold
Rating: 5
Author: RepoPilot

mold: A Modern Linker 🦠

Healthy

Healthy across all four use cases

Use as dependencyHealthy

Permissive license, no critical CVEs, actively maintained — safe to depend on.

Fork & modifyHealthy

Has a license, tests, and CI — clean foundation to fork and modify.

Learn fromHealthy

Documented and popular — useful reference codebase to read through.

Deploy as-isHealthy

No critical CVEs, sane security posture — runnable as-is.

✓Last commit 1d ago
✓11 active contributors
✓MIT licensed

Show 3 more →

✓CI configured
✓Tests present
⚠Single-maintainer risk — top contributor 82% of recent commits

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Healthy" badge

Paste into your README — live-updates from the latest cached analysis.

Variant:

[![RepoPilot: Healthy](https://repopilot.app/api/badge/rui314/mold)](https://repopilot.app/r/rui314/mold)

Paste at the top of your README.md — renders inline like a shields.io badge.

▸Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/rui314/mold on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: rui314/mold

Generated by RepoPilot · 2026-05-09 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/rui314/mold shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

🎯Verdict

GO — Healthy across all four use cases

Last commit 1d ago
11 active contributors
MIT licensed
CI configured
Tests present
⚠ Single-maintainer risk — top contributor 82% of recent commits

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

✅Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live rui314/mold repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/rui314/mold.

What it runs against: a local clone of rui314/mold — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters | |---|---|---| | 1 | You're in rui314/mold | Confirms the artifact applies here, not a fork | | 2 | License is still MIT | Catches relicense before you depend on it | | 3 | Default branch main exists | Catches branch renames | | 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code | | 5 | Last commit ≤ 31 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>rui314/mold</code></summary>

#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of rui314/mold. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/rui314/mold.git
#   cd mold
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of rui314/mold and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "rui314/mold(\\.git)?\\b" \\
  && ok "origin remote is rui314/mold" \\
  || miss "origin remote is not rui314/mold (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "^(MIT)" LICENSE 2>/dev/null \\
   || grep -qiE "\"license\"\\s*:\\s*\"MIT\"" package.json 2>/dev/null) \\
  && ok "license is MIT" \\
  || miss "license drift — was MIT at generation time"

# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \\
  && ok "default branch main exists" \\
  || miss "default branch main no longer exists"

# 4. Critical files exist
test -f "src/main.cc" \\
  && ok "src/main.cc" \\
  || miss "missing critical file: src/main.cc"
test -f "src/mold.h" \\
  && ok "src/mold.h" \\
  || miss "missing critical file: src/mold.h"
test -f "src/elf.cc" \\
  && ok "src/elf.cc" \\
  || miss "missing critical file: src/elf.cc"
test -f "src/passes.cc" \\
  && ok "src/passes.cc" \\
  || miss "missing critical file: src/passes.cc"
test -f "CMakeLists.txt" \\
  && ok "CMakeLists.txt" \\
  || miss "missing critical file: CMakeLists.txt"

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 31 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~1d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/rui314/mold"
  exit 1
fi

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>

⚡TL;DR

mold is a high-performance ELF linker written in C++20 that replaces GNU ld, GNU gold, and LLVM lld. It combines object files into executables and shared libraries 3-10× faster than alternatives (MySQL 8.3: 0.46s vs lld's 1.64s, Clang 19: 1.35s vs lld's 5.20s) while maintaining full ELF specification compliance across 16+ CPU architectures (x86-64, ARM64, ARM32, RISC-V, PowerPC, LoongArch, m68k, SPARC64, SH-4, s390x). Monolithic C++ binary: src/arch-*.cc contains architecture-specific code generation (relocations, calling conventions), lib/ contains reusable utilities (compression, hashing, globbing), and the main linker logic sits in src/ alongside CMakeLists.txt. Build system is CMake-based with cross-compilation support via install-cross-tools.sh. GitHub Actions workflows in .github/workflows/ automate testing for native and cross-compilation targets.

👥Who it's for

Build system maintainers, C/C++/Rust compiler users, and embedded systems developers who need to minimize debug-edit-rebuild cycles on large codebases (Chromium, Clang, MySQL scale). Contributors are typically systems programmers familiar with ELF format, binary generation, and multi-architecture compilation.

🌱Maturity & risk

Production-ready and actively maintained: the repo has comprehensive CI/CD via GitHub Actions (build-all.yml, ci.yml, release-assets.yml), extensive architecture-specific implementations (arch-*.cc files for every major ISA), memory sanitizer testing, and active releases. The codebase is ~1.1M lines of C++ with established packaging across major Linux distributions.

Low risk for linker functionality, but high risk of introducing subtle bugs: linker correctness is binary (works or breaks linking entirely), and testing must cover 16+ architectures. Single primary maintainer (rui314) is a concentration risk. Dependencies appear minimal (lib/ contains mostly self-contained utilities), reducing supply-chain risk. Watch for ABI-breaking changes across major versions given the nature of object file format handling.

Active areas of work

Active development visible in workflows and architecture support: the repo includes build automation for multiple platforms (build-all.yml, build-native.yml, run-msan.sh), ongoing manpage updates (update-manpage.yml), and asset releases (release-assets.yml). The presence of install-cross-tools.sh and multiple arch-*.cc files suggests ongoing work on architecture support parity.

🚀Get running

Clone, install build deps, compile with CMake:

git clone --branch stable https://github.com/rui314/mold.git
cd mold
./install-build-deps.sh
cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER=c++ -B build
cmake --build build -j$(nproc)
sudo cmake --build build --target install

Requires GCC 10.2+ or Clang 16.0.0+, and libstdc++10 or libc++7.

Daily commands: mold is a command-line tool, not a service:

mold [options] file.o ...
# Example: link a binary
mold -o executable main.o lib.o -lc
# Use as drop-in replacement
CC=mold gcc -o program main.c
# Or via LD environment variable
LD=mold cc -o program main.c

See docs/mold.md for full CLI reference.

🗺️Map of the codebase

src/main.cc — Entry point for mold linker; all linking workflows start here and route to architecture-specific implementations
src/mold.h — Core header defining Context and Symbol abstractions; every linker pass depends on these definitions
src/elf.cc — ELF file parsing and writing logic; handles input/output format conversions critical to correctness
src/passes.cc — Main linker passes orchestration (resolve symbols, apply relocations, GC); the heart of link-time transformations
CMakeLists.txt — Build configuration supporting multi-architecture cross-compilation; required to onboard new platforms or features
src/input-sections.cc — Input section merging and layout logic; critical for output generation and performance optimization
lib/lib.h — Utility library header (compression, hashing, data structures); foundational for performance-critical code

🛠️How to make changes

Add support for a new CPU architecture

Create new architecture file src/arch-YOURARCH.cc with relocation handlers, ABI rules, and thunk generation following the pattern in src/arch-arm64.cc (src/arch-YOURARCH.cc)
Define architecture class inheriting from Target in src/mold.h (Context constructor expects target to be set via -m YOURARCH) (src/mold.h)
Register new architecture in src/main.cc by adding condition in target initialization logic to instantiate your Target subclass (src/main.cc)
Add CMake build target and test suite configuration in CMakeLists.txt (CMakeLists.txt)
Add test cases in test/ directory following naming pattern test/arch-YOURARCH-*.sh (test/CMakeLists.txt)

Add a new linker optimization pass

Implement optimization as new function in src/passes.cc or separate .cc file (e.g., src/my-optimization.cc) following the pattern of gc_sections() or icf() (src/passes.cc)
Define command-line flag in src/cmdline.cc to control the optimization (e.g., --my-optimization) (src/cmdline.cc)
Call your optimization function from main linker pipeline in src/main.cc or src/passes.cc in the appropriate phase order (src/main.cc)
Add test cases in test/ to validate correctness across architectures (test/)

Extend ELF output format support (e.g., new section type)

Define output chunk class in src/output-chunks.cc, inheriting from OutputChunk and implementing write() for your new section format (src/output-chunks.cc)
Update ELF header/section table logic in src/elf.cc to parse and emit the new section type correctly (src/elf.cc)
Modify src/input-sections.cc to recognize and collect input sections of the new type during merge phase (src/input-sections.cc)
Add integration test in test/ to verify round-trip reading and writing of the new section (test/)

🔧Why these technologies

C++17 with parallel for loops (pthreads/OpenMP) — Enables multi-threaded input parsing and section merging; mold's speed advantage comes from parallelizing work across CPU cores during merge and GC phases
Memory-mapped I/O (mmap) for input and output — Minimizes data copies; sections can be read and written directly from/to file without intermediate buffers, critical for multi-gigabyte linking
Separate architecture modules (arch-*.cc) — Isolates target-specific relocation logic and ABI rules; allows independent testing and porting to new ISAs without destabilizing core linker
Single-pass symbol resolution with lazy binding — Avoids multiple passes over symbol tables; undefined symbols are only resolved at final symbol table emission, reducing overhead

⚖️Trade-offs already made

Aggressive multi-threading during input parsing and merging
- Why: Parallelism scales link time with core count (16-32 cores typical on modern systems)
- Consequence: Requires thread-safe hash tables and atomic operations; synchronization overhead minimal due to coarse-grained locking on distinct symbol/section groups
In-memory symbol and section representation before output serialization
- Why: Allows multiple
- Consequence: undefined

🪤Traps & gotchas

Several non-obvious gotchas: (1) Cross-compilation requires install-cross-tools.sh; simply setting CMAKE_CXX_COMPILER may not pull in correct target libraries. (2) Memory safety tested via run-msan.sh; undefined behavior in linking edge cases can cause silent mislinks, not crashes — requires rigorous testing. (3) ELF format variants (ABI, little/big-endian, PIE vs static) have subtle interactions in relocation logic; arch-specific code must handle all combinations. (4) CMake config sets C++20 features; older toolchains may silently fail to build architecture files. (5) GitHub Actions workflows use custom install-extras.sh; local build may need manual dependency setup on non-standard distros.

🏗️Architecture

💡Concepts to learn

ELF (Executable and Linkable Format) — mold's entire purpose is generating correct ELF files; understanding sections (.text, .data, .symtab), segments (PT_LOAD, PT_DYNAMIC), and symbol resolution is fundamental to contributing
Relocation Records — Each arch-*.cc file implements relocation handling (R_X86_64_PC32, R_AARCH64_ABS64, etc.); understanding how the linker patches addresses in object files is core to any linking algorithm
Position Independent Code (PIE) and ASLR — Modern binaries are PIE by default; mold must emit relocations and GOT/PLT entries compatible with kernel address space layout randomization, affecting relocation strategies per arch
Symbol Interposition and Weak Symbols — mold's symbol resolution must handle weak symbols, global overrides, and library symbol precedence; incorrect resolution causes silent functional bugs in linked binaries
Global Offset Table (GOT) and Procedure Linkage Table (PLT) — Dynamic linking requires GOT (indirect data access) and PLT (indirect function calls); each architecture has different GOT/PLT layouts that arch-*.cc must generate correctly
Link-Time Optimization (LTO) and Thin-LTO — mold must consume LLVM IR objects and coordinate with compiler; understanding LTO plugin protocol is needed for full compiler integration
Memory-Mapped I/O and Parallel Processing — mold's speed derives from parallelizing independent linking phases and memory-mapping large object files; lib/ utilities (atomics.h, bitvector.h) enable lock-free data structures

llvm/llvm-project — LLVM's lld is the second-fastest linker and closest direct competitor; mold frequently benchmarked against it; developers often switch between both
bminor/binutils-gdb — GNU ld and gold are the legacy baseline linkers; mold maintains compatibility with their command-line interface and ELF output format
rui314/elf2hashes — By same author; complementary tool for analyzing and optimizing ELF binaries produced by mold
torvalds/linux — mold must understand Linux kernel's ELF loading expectations; kernel source defines ABI contracts that mold implements
gcc-mirror/gcc — Primary compiler frontend that invokes mold; GCC's driver orchestrates mold's execution and defines linker script expectations

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add comprehensive unit tests for architecture-specific relocation handling (src/arch-*.cc)

The repo supports 10+ architectures (ARM32, ARM64, i386, LoongArch, M68k, PPC32, PPC64v1, PPC64v2, RISC-V, s390x, SH4, Sparc64, x86-64) with separate relocation logic in src/arch-*.cc files. There is no visible test coverage for relocation edge cases across these architectures. Adding targeted tests would catch regressions early, especially for lesser-tested architectures like LoongArch and M68k where contributors have limited hardware access.

[ ] Create a test directory structure (e.g., test/relocation/) with architecture-specific test cases
[ ] Implement tests for each arch-*.cc file focusing on: GOT relocations, PLT entries, TLS relocations, and PC-relative addressing
[ ] Add CI workflow step in .github/workflows/ci.yml to run architecture-specific relocation tests with QEMU for cross-arch validation
[ ] Document test patterns in docs/coding-guidelines.md for future contributors

Implement link-time optimization (LTO) coverage tracking and CI validation for src/lto-unix.cc

The repo has src/lto-unix.cc and references to LTO in workflows (run-msan.sh), but there's no dedicated CI job that validates LTO builds across different scenarios. Given mold's focus on speed, demonstrating that LTO integration works reliably and measures performance impact would be valuable. Current .github/workflows/build-all.yml and ci.yml don't explicitly test LTO linking scenarios.

[ ] Add a new GitHub Actions workflow .github/workflows/test-lto.yml that builds test programs with -flto and links them with mold
[ ] Create test cases in a new test/lto/ directory covering: LTO+mold linking, mixed LTO/non-LTO object files, and LTO with different optimization levels
[ ] Extend src/lto-unix.cc with detailed logging/metrics to report plugin interaction performance
[ ] Document LTO usage patterns and limitations in docs/design.md or a new docs/lto.md file

Add cross-compilation validation tests for install-cross-tools.sh and src/filetype.cc

The repo has install-cross-tools.sh and multiple architecture support files, but there's no CI job that validates cross-compilation scenarios (e.g., building x86-64 mold binaries for ARM64 targets). The .github/workflows/ directory lacks a dedicated cross-compilation test. This is critical since mold targets multiple architectures and contributors need confidence that cross-compilation works correctly.

[ ] Create a new GitHub Actions workflow .github/workflows/test-cross-compile.yml that runs on a large runner
[ ] Add test cases in test/cross-compile/ validating: native→ARM64, native→RISC-V, native→PPC64, linking simple test programs with cross-compiled mold
[ ] Extend install-cross-tools.sh with validation checks and error handling, document expected output in README.md's setup section
[ ] Verify src/filetype.cc correctly identifies target ELF headers during cross-compilation by adding unit tests

🌿Good first issues

Add comprehensive link-time diagnostics for debug builds: extend docs/bugs.md with reproducible minimal examples (1KB .o files) for each common mislink scenario, then add --verbose-relocs flag to src/ to dump relocation calculations for debugging: Users debugging linker failures need concrete examples; currently gaps exist for edge cases like weak symbol resolution
Benchmark and optimize hot path in one arch-*.cc file: profile existing arch-arm64.cc or arch-x86-64.cc with perf, identify relocation handler bottleneck (likely hash table lookups in symbol table), add fast path for common cases (function relocations): mold's selling point is speed; micro-optimizations in per-relocation code paths directly improve real link times by 5-15%
Expand test coverage for LoongArch in lib/gentoo-test.sh and add corresponding CI step in .github/workflows/ci.yml: create test suite for 64-bit/32-bit and little/big-endian LoongArch variants (currently arch-loongarch.cc exists but tests are sparse): LoongArch is emerging ISA in mold's multi-arch support; test blind spots risk silent regressions when dependencies update

⭐Top contributors

Click to expand

@rui314 — 82 commits
@tobim — 6 commits
@koachan — 3 commits
@pierluigilenoci — 2 commits
@bamo — 1 commits

📝Recent commits