RepoPilot

huggingface/candle

Minimalist ML framework for Rust

Healthy

Healthy across the board

Use as dependency: Healthy

Permissive license, no critical CVEs, actively maintained — safe to depend on.

Fork & modify: Healthy

Has a license, tests, and CI — clean foundation to fork and modify.

Learn from: Healthy

Documented and popular — useful reference codebase to read through.

Deploy as-is: Healthy

No critical CVEs, sane security posture — runnable as-is.

  • Last commit today
  • 46+ active contributors
  • Distributed ownership (top contributor 22% of recent commits)
  • Apache-2.0 licensed
  • CI configured
  • Tests present

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Healthy" badge

Paste into your README — the badge updates live from the latest cached analysis.

Variant:
RepoPilot: Healthy
[![RepoPilot: Healthy](https://repopilot.app/api/badge/huggingface/candle)](https://repopilot.app/r/huggingface/candle)

Paste at the top of your README.md — renders inline like a shields.io badge.

Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/huggingface/candle on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: huggingface/candle

Generated by RepoPilot · 2026-05-09 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

  1. Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
  2. Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
  3. Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/huggingface/candle shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

🎯Verdict

GO — Healthy across the board

  • Last commit today
  • 46+ active contributors
  • Distributed ownership (top contributor 22% of recent commits)
  • Apache-2.0 licensed
  • CI configured
  • Tests present

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live huggingface/candle repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/huggingface/candle.

What it runs against: a local clone of huggingface/candle — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in huggingface/candle | Confirms the artifact applies here, not a fork |
| 2 | License is still Apache-2.0 | Catches a relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | Last commit ≤ 30 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>huggingface/candle</code></summary>
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of huggingface/candle. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/huggingface/candle.git
#   cd candle
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of huggingface/candle and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "huggingface/candle(\.git)?\b" \
  && ok "origin remote is huggingface/candle" \
  || miss "origin remote is not huggingface/candle (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "Apache-2\.0|Apache License" LICENSE LICENSE-APACHE 2>/dev/null \
   || grep -qiE "\"license\"\s*:\s*\"Apache-2\.0\"" package.json 2>/dev/null) \
  && ok "license is Apache-2.0" \
  || miss "license drift — was Apache-2.0 at generation time"

# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
  && ok "default branch main exists" \
  || miss "default branch main no longer exists"

# 4. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 30 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~0d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/huggingface/candle"
  exit 1
fi

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>

TL;DR

Candle is a minimalist ML inference and training framework written in Rust, designed for high performance on both CPU and GPU (CUDA, Metal) with minimal dependencies. It provides tensor operations, neural network layers, and pre-built model implementations (LLaMA, Whisper, YOLO, T5) that run natively in Rust without Python overhead, enabling deployment in memory-constrained or production environments. Monorepo with 8 active workspace members: candle-core (tensor/device layer), candle-nn (neural network layers), candle-transformers (pretrained models), candle-examples (CLI demos), candle-wasm-examples/* (browser deployments), plus candle-book/ for mdBook documentation. Each crate is independently versioned but shares workspace dependencies defined in root Cargo.toml.
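A minimal sketch of the API surface the TL;DR describes, modeled on candle's README-style usage. It assumes `candle-core` is added as a dependency and is illustrative rather than compiled here:

```rust
// Sketch only: requires `candle-core` in Cargo.toml; not compiled as part
// of this artifact.
use candle_core::{Device, Result, Tensor};

fn main() -> Result<()> {
    let device = Device::Cpu; // or a CUDA/Metal device when built with those features
    let a = Tensor::randn(0f32, 1.0, (2, 3), &device)?;
    let b = Tensor::randn(0f32, 1.0, (3, 4), &device)?;
    let c = a.matmul(&b)?; // shape (2, 4)
    println!("{:?}", c.shape());
    Ok(())
}
```

The same code runs unchanged on GPU by swapping the `Device`, which is the "minimal dependencies, native Rust" selling point the TL;DR refers to.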

👥Who it's for

Rust developers and ML engineers building inference pipelines who want GPU-accelerated ML without Python/PyTorch complexity, plus teams deploying models to embedded systems, servers, or browsers (via WebAssembly). Contributors are typically systems-level Rust developers comfortable with CUDA/Metal kernel programming.

🌱Maturity & risk

Actively developed and production-ready: v0.10.2 with dual MIT/Apache-2.0 licensing, comprehensive CI (rust-ci.yml, ci_cuda.yaml, python.yml workflows), formal documentation in candle-book/, and published crate on crates.io. The monorepo shows steady maintenance with organized workspace structure and multiple integration examples.

Relatively low risk for a systems library: extensive CI coverage across platforms (CUDA, Metal, CPU), but relies on external dependencies like cudarc (0.19.4) for GPU bindings and complex metal-kernels/flash-attn submodules that are partially excluded from workspace. Rust's type system mitigates memory safety issues, though GPU kernel bugs could be hard to debug. No visible major breaking-change concerns across recent versions.

Active areas of work

Active development on GPU support (CUDA and Metal kernels, exercised by the .github/workflows CI), Python bindings via candle-pyo3, WebAssembly support, and model zoo expansion (LLaMA variants, Phi, BLIP). CI workflows validate across CUDA and standard Rust targets; pre-commit hooks suggest quality-first development. Trufflehog scanning indicates security consciousness.

🚀Get running

Clone and build: git clone https://github.com/huggingface/candle && cd candle && cargo build. For CPU-only: cargo run --example simple_tensor. For CUDA: cargo build --features cuda. Check Makefile and .cargo/config.toml for build customizations; see candle-book/src/guide/installation.md for detailed setup.

Daily commands: CPU inference: cargo run --example mnist --release (from candle-examples). LLaMA: cargo run --example llama --release -- --model 7b-chat. CUDA: set CUDA_VERSION env var and build with --features cuda. See individual example crates and Makefile for model download and setup details.
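The commands from the two paragraphs above, collected as a copy-paste block. The example names mirror the prose; check candle-examples/ for the current set before relying on them:

```shell
git clone https://github.com/huggingface/candle
cd candle
cargo build                                      # CPU-only debug build
cargo run --example mnist --release              # CPU inference demo
cargo build --features cuda                      # requires a local CUDA toolkit
cargo run --example llama --release -- --model 7b-chat
```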

🗺️Map of the codebase

🛠️How to make changes

Core tensor ops: modify candle-core/src/ (device.rs, tensor.rs). Add new layers: candle-nn/src/lib.rs. Add pretrained models: candle-transformers/src/models/. Add examples: candle-examples/examples/. GPU kernels: candle-kernels/src/lib.rs (CUDA) or candle-metal-kernels/src/ (Metal). Documentation: candle-book/src/ uses mdBook format.
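For the "add new layers" path, a hedged sketch of how a layer typically plugs into candle-nn via its `Module` trait (`forward(&self, &Tensor) -> Result<Tensor>`). The layer name and behavior are hypothetical; it requires the candle-core and candle-nn crates and is not compiled here:

```rust
// Sketch only: requires candle-core and candle-nn; not compiled as part
// of this artifact.
use candle_core::{Result, Tensor};
use candle_nn::Module;

// Hypothetical layer: scales its input by a learned per-channel weight.
struct Scale {
    weight: Tensor,
}

impl Module for Scale {
    fn forward(&self, xs: &Tensor) -> Result<Tensor> {
        xs.broadcast_mul(&self.weight)
    }
}
```

Existing layers in candle-nn/src/ (e.g. linear or layer-norm implementations) are the authoritative patterns to copy.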

🪤Traps & gotchas

CUDA_VERSION env var must match system CUDA installation (see cudarc dynamic-linking feature). GPU kernels require nvrtc runtime compiler; offline builds may fail. candle-metal-kernels and candle-flash-attn are excluded from workspace but required for Metal/attention optimization; building with those features requires separate Cargo.toml edits. candle-onnx is also excluded and partially unmaintained. F16/F8 types require specific hardware support. WebAssembly examples need wasm32-unknown-unknown target and wasm-pack toolchain.
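A preflight sketch for the traps above. It is read-only and degrades gracefully: the `nvcc` and `rustup` probes are assumptions about your local toolchain, not requirements:

```shell
#!/usr/bin/env bash
# Preflight sketch for candle GPU/wasm builds. Read-only: inspects the
# local toolchain, mutates nothing.
preflight() {
  if command -v nvcc >/dev/null 2>&1; then
    # nvcc prints e.g. "release 12.4, V12.4.131"; extract "12.4".
    sys_cuda=$(nvcc --version | grep -oE 'release [0-9]+\.[0-9]+' | cut -d' ' -f2)
    echo "system CUDA: ${sys_cuda:-unknown}"
    if [ -n "${CUDA_VERSION:-}" ] && [ "$CUDA_VERSION" != "$sys_cuda" ]; then
      echo "warning: CUDA_VERSION=$CUDA_VERSION does not match system $sys_cuda"
    fi
  else
    echo "nvcc not found: build CPU-only (omit --features cuda)"
  fi
  if command -v rustup >/dev/null 2>&1; then
    rustup target list --installed 2>/dev/null | grep -q wasm32-unknown-unknown \
      || echo "note: wasm examples need: rustup target add wasm32-unknown-unknown"
  fi
}
out=$(preflight)
echo "$out"
```

Like the verify script earlier in this artifact, it prints findings rather than failing, so it composes into agent loops without blocking.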

💡Concepts to learn

  • PyO3/pyo3 — Powers candle-pyo3 Python bindings; understanding FFI is essential for extending Python integration
  • huggingface/safetensors — Tensor serialization format used throughout Candle for model weights; cross-referenced in dependencies
  • huggingface/transformers — Inspiration for model architectures and tokenizer APIs that Candle reimplements in Rust
  • tinygrad/tinygrad — Spiritual alternative: minimalist ML framework in Python with similar philosophy; useful for comparing design decisions
  • NVIDIA/cutlass — Underlying GPU kernel patterns and GEMM optimizations that candle-kernels build upon

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add comprehensive CUDA kernel documentation and examples in candle-book

The repo has candle-book/src/cuda/writing.md for writing CUDA kernels, but there's no corresponding documentation for the candle-kernels and candle-metal-kernels excluded packages. New contributors struggle to understand how to contribute optimized kernels. Add a detailed guide with concrete examples showing how to write, test, and benchmark custom CUDA kernels against the existing kernel implementations in candle-core/benches/benchmarks/.

  • [ ] Create candle-book/src/cuda/custom_kernels.md with step-by-step kernel development workflow
  • [ ] Add code examples showing before/after kernel optimization using benchmarks from candle-core/benches/benchmarks/
  • [ ] Document the build process for candle-kernels (currently excluded from workspace)
  • [ ] Add reference to existing kernel implementations in candle-core for pattern matching
  • [ ] Include debugging tips using CUDA compute-sanitizer and profiling tools

Implement missing type-specific operation tests for candle-core

The candle-core/benches/ directory has performance benchmarks for operations like matmul, conv_transpose2d, and qmatmul, but there's no evidence of comprehensive unit tests covering f16, f8, and quantized dtypes (referenced in cudarc features). Add systematic unit tests that verify numerical correctness across all supported dtypes to catch regressions before they reach production.

  • [ ] Create candle-core/tests/dtype_ops_test.rs with parametric tests for f32, f16, f8, bf16 across basic ops
  • [ ] Add quantization-specific tests (qmatmul, qlinear) validating precision loss is within acceptable bounds
  • [ ] Test broadcasting behavior consistency across all dtypes (referenced in candle-core/benches/benchmarks/broadcast.rs)
  • [ ] Integrate tests into .github/workflows/rust-ci.yml to run on each commit
  • [ ] Document expected precision tolerances per dtype in test comments for maintainability

Add Python API integration tests and examples in candle-pyo3

The candle-pyo3 package exists in the workspace but has no visible examples or integration tests in candle-examples/. Python users are a significant audience (evidenced by the maturin.yml workflow), yet there's no clear pattern showing how to use Candle from Python for common ML workflows. Add concrete end-to-end examples that bridge the gap between Python users and Rust internals.

  • [ ] Create candle-examples/python_integration/ with examples: tensor_creation.py, model_inference.py, training_loop.py
  • [ ] Add candle-pyo3 tests in .github/workflows/python.yml validating Python<->Rust interop for tensor operations
  • [ ] Document dtype conversions and memory layout assumptions in candle-book/src/apps/
  • [ ] Add performance comparison notebook showing when to use Python bindings vs pure Rust
  • [ ] Include troubleshooting guide for common PyO3 issues (reference counting, GIL interactions)

🌿Good first issues

  • Add comprehensive unit tests for candle-core/src/tensor.rs reshape and broadcasting edge cases (currently examples test these, but no explicit test suite visible)
  • Complete documentation in candle-book/src/guide/ for the Metal backend setup and performance tuning (currently CUDA-focused; Metal is under-documented relative to its code size)
  • Implement missing operators (e.g., scatter, gather, advanced slicing) in candle-core and add examples in candle-examples/ to match PyTorch API coverage

Top contributors


📝Recent commits

  • 5447a87 — Optimization for CPU Causal Flash Attention (integrated into Qwen3) (#3254) (DrJesseGlass)
  • 5bd5618 — Remove unwrap()s from candle-metal-kernels/src/metal/device.rs (#3382) (jacobgorm)
  • b43326e — fix: candle-book paths (#3386) (IamPhytan)
  • 7fa19d2 — Remove unnecessary task (#1925) (kejcao)
  • 9e71760 — Extend GradStore public functionality (#1483) (agerasev)
  • 2bbbb4e — Add rustup wasm doc to wasm example (#1438) (jk2K)
  • c8c7663 — update erf::polynomial (#1413) (chris-ha458)
  • bb2d400 — fix clippy lints surfaced by Rust 1.95 (#3481) (tomsanbear)
  • c5d7d49 — Fix: use MTLCopyAllDevices() for reliable Metal device enumeration (#3449) (romnn)
  • cce7901 — Fix fmt and clippy remnants (#3472) (ivarflakstad)

🔒Security observations

The Candle ML framework demonstrates generally good security practices: dual MIT/Apache-2.0 licensing, an organized workspace structure, and secret scanning in CI (trufflehog workflow). Notable concerns remain: incomplete dependency specifications that could lead to version drift, dynamic CUDA linking that introduces runtime library-substitution risk, and build-system-dependent version detection that could be manipulated. No hardcoded secrets or obvious injection vulnerabilities were visible in the examined file structure. Primary recommendations: complete all dependency specifications, validate CUDA libraries more strictly for production deployments, and consolidate workspace management for better dependency governance. Regular runs of cargo audit and security scanning tools are essential given the framework's role in ML/AI applications that may handle sensitive data.

  • Medium · Incomplete Dependency Version Specification — Cargo.toml - [workspace.dependencies] section. The workspace dependencies in Cargo.toml show an incomplete entry for the 'half' crate with version specification cut off ('version'). This could lead to unpinned or unexpected versions being pulled, potentially including versions with known vulnerabilities. Fix: Complete and verify all dependency version specifications. Ensure all dependencies have explicit version constraints and run 'cargo audit' regularly to detect known vulnerabilities.
  • Medium · Dynamic CUDA Linking Enabled — Cargo.toml - cudarc dependency configuration. The cudarc dependency is configured with 'dynamic-linking' feature enabled. This allows the framework to dynamically link to CUDA libraries at runtime, which could expose the application to library substitution attacks or version mismatch vulnerabilities if CUDA libraries are not properly secured on the deployment system. Fix: Document CUDA library security requirements for production deployments. Verify CUDA library integrity through checksums and ensure deployment systems maintain secure CUDA library paths. Consider static linking for security-critical deployments.
  • Low · Cryptographic Feature Detection from Build System — Cargo.toml - cudarc dependency with 'cuda-version-from-build-system' feature. The cudarc dependency uses 'cuda-version-from-build-system' feature, which determines CUDA version from build-time environment. Mismatched or manipulated build-time environment could lead to runtime incompatibilities or exploitation of version-specific vulnerabilities. Fix: Implement build environment verification and validation. Document required CUDA versions clearly. Consider pinning CUDA versions explicitly rather than detecting from build system.
  • Low · Workspace Excluded Directories Not in Cargo Lockfile Management — Cargo.toml - 'exclude' section and workspace structure. Several packages are excluded from the workspace (candle-flash-attn, candle-metal-kernels, candle-onnx, etc.) but are included as dependencies. This creates potential for version inconsistencies and makes dependency auditing more complex. Fix: Either include excluded packages in the workspace for unified version management, or maintain separate security governance for excluded packages. Ensure all excluded dependencies are regularly audited.
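The recurring cargo audit recommendation can be run as follows. cargo-audit is the RustSec advisory checker; the install is one-time and network-bound, so this is a sketch rather than something to wire into a hot loop:

```shell
cargo install cargo-audit --locked   # one-time install of the RustSec tooling
cd candle                            # run from the repo root
cargo audit                          # checks Cargo.lock against the advisory DB
```

For the excluded packages flagged above (candle-flash-attn, candle-metal-kernels, candle-onnx), run cargo audit inside each directory separately, since they are outside the workspace lockfile.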

LLM-derived; treat as a starting point, not a security audit.


Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.
