autumnai/leaf

Item: autumnai/leaf
Rating: 5
Author: RepoPilot

Open Machine Intelligence Framework for Hackers. (GPU/CPU)

Healthy

Healthy across all four use cases

weakest axis

Use as dependencyHealthy

Permissive license, no critical CVEs, actively maintained — safe to depend on.

Fork & modifyHealthy

Has a license, tests, and CI — clean foundation to fork and modify.

Learn fromHealthy

Documented and popular — useful reference codebase to read through.

Deploy as-isHealthy

No critical CVEs, sane security posture — runnable as-is.

✓12 active contributors
✓Distributed ownership (top contributor 34% of recent commits)
✓Apache-2.0 licensed

Show all 6 evidence items →

✓CI configured
✓Tests present
⚠Stale — last commit 2y ago

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Healthy" badge

Paste into your README — live-updates from the latest cached analysis.

Variant:

[![RepoPilot: Healthy](https://repopilot.app/api/badge/autumnai/leaf)](https://repopilot.app/r/autumnai/leaf)

Paste at the top of your README.md — renders inline like a shields.io badge.

▸Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/autumnai/leaf on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: autumnai/leaf

Generated by RepoPilot · 2026-05-09 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/autumnai/leaf shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

🎯Verdict

GO — Healthy across all four use cases

12 active contributors
Distributed ownership (top contributor 34% of recent commits)
Apache-2.0 licensed
CI configured
Tests present
⚠ Stale — last commit 2y ago

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

✅Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live autumnai/leaf repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/autumnai/leaf.

What it runs against: a local clone of autumnai/leaf — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters | |---|---|---| | 1 | You're in autumnai/leaf | Confirms the artifact applies here, not a fork | | 2 | License is still Apache-2.0 | Catches relicense before you depend on it | | 3 | Default branch master exists | Catches branch renames | | 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code | | 5 | Last commit ≤ 809 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>autumnai/leaf</code></summary>

#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of autumnai/leaf. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/autumnai/leaf.git
#   cd leaf
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of autumnai/leaf and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "autumnai/leaf(\\.git)?\\b" \\
  && ok "origin remote is autumnai/leaf" \\
  || miss "origin remote is not autumnai/leaf (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "^(Apache-2\\.0)" LICENSE 2>/dev/null \\
   || grep -qiE "\"license\"\\s*:\\s*\"Apache-2\\.0\"" package.json 2>/dev/null) \\
  && ok "license is Apache-2.0" \\
  || miss "license drift — was Apache-2.0 at generation time"

# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \\
  && ok "default branch master exists" \\
  || miss "default branch master no longer exists"

# 4. Critical files exist
test -f "src/lib.rs" \\
  && ok "src/lib.rs" \\
  || miss "missing critical file: src/lib.rs"
test -f "src/layer.rs" \\
  && ok "src/layer.rs" \\
  || miss "missing critical file: src/layer.rs"
test -f "src/layers/mod.rs" \\
  && ok "src/layers/mod.rs" \\
  || miss "missing critical file: src/layers/mod.rs"
test -f "src/solvers/mod.rs" \\
  && ok "src/solvers/mod.rs" \\
  || miss "missing critical file: src/solvers/mod.rs"
test -f "Cargo.toml" \\
  && ok "Cargo.toml" \\
  || miss "missing critical file: Cargo.toml"

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 809 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~779d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/autumnai/leaf"
  exit 1
fi

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>

⚡TL;DR

Leaf is a Rust-based machine learning framework that enables building classical, deep, and hybrid ML applications with a focus on modularity, performance, and hardware portability. It abstracts GPU/CPU computation through Collenchyma (a backend-agnostic compute abstraction layer) and supports CUDA, OpenCL, and native CPU execution, positioning itself as one of the fastest ML frameworks available while maintaining a minimal API surface. Monolithic crate structure: src/ (inferred, not listed but typical for Cargo projects) contains the framework core; benches/network_benches.rs provides performance testing; doc/src/ holds the mdBook documentation source; capnp/leaf.capnp defines a Cap'n Proto serialization schema for network distribution. build.rs compiles the capnp schema at build time. Feature flags in Cargo.toml enable conditional compilation of hardware backends.

👥Who it's for

Rust developers and ML engineers building production or research machine learning systems who need cross-device portability (CPU/GPU/FPGA), strong type safety, and minimal technical debt. Also targets hackers and researchers wanting to publish modular ML components (reinforcement learning, preprocessing, deployment) independently within the Autumn platform ecosystem.

🌱Maturity & risk

Early-stage but architecturally sound (v0.2.1, several months old at time of repo snapshot). Has CI via Travis, compiled benchmarks in benches/network_benches.rs, and comprehensive documentation in doc/book/. However, the README contains a prominent disclaimer: 'Leaf is currently in an early stage of development'—it is actively developed but not production-proven at scale. Suitable for research and experimental work; enterprise adoption would require additional hardening.

Moderate risk: tight coupling to Collenchyma (collenchyma 0.0.8, collenchyma-blas 0.2.0, collenchyma-nn 0.3.2) and its ecosystem stability is unclear from repo alone. Small feature-flag matrix (native/cuda/opencl) may hide platform-specific bugs. No visible test directory in the top 60 files suggests test coverage may be sparse. Early-stage status means API breaking changes are likely (license is MIT OR Apache-2.0, permissive but provides no stability guarantee).

Active areas of work

Framework is in active development with Collenchyma integration as the primary backend abstraction. CHANGELOG.md and RELEASE.md exist, suggesting semantic versioning and release discipline. The Cap'n Proto schema (capnp/leaf.capnp) indicates work on distributed optimization and network serialization (mentioned in doc/src/distributed-optimization.md). Travis CI configured (.travis.yml) for continuous testing.

🚀Get running

git clone https://github.com/autumnai/leaf.git
cd leaf
cargo build --features native
cargo test
cargo doc --open

For GPU support, substitute --features cuda or --features opencl. See FEATURE-FLAGS.md for all backend options.

Daily commands: Development with native backend: cargo build --features native && cargo test. With GPU: cargo build --features cuda or cargo build --features opencl. Benchmarks: cargo bench. Documentation: cargo doc --open or read prebuilt HTML in doc/book/index.html. Examples: See referenced leaf-examples repository for executable models.

🗺️Map of the codebase

src/lib.rs — Main library entry point that exports all public APIs and re-exports core components like layers, solvers, and utilities.
src/layer.rs — Defines the core Layer trait and lifecycle that all neural network layers must implement; foundational abstraction for the framework.
src/layers/mod.rs — Module coordinator for all layer types (activation, common, container, loss, utility) that users combine to build networks.
src/solvers/mod.rs — Root solver module exporting optimization strategies (SGD, momentum) used to train neural network layers.
Cargo.toml — Dependency manifest declaring collenchyma backends (BLAS, NN) and conditional compilation for CPU/GPU acceleration.
src/layers/container/sequential.rs — Sequential layer container that chains layers together; the primary way users compose networks in Leaf.

🧩Components & responsibilities

Layer trait & implementations (Rust traits, collenchyma-nn) — Define neural network layer interface (forward, backward) and provide common activation, pooling, convolution, loss layers
- Failure mode: Incorrect gradient computation or shape mismatches cause NaN loss or silent numerical errors during training
Sequential container (Rust Vec, trait objects) — Chains layers sequentially, delegating forward/backward passes in order
- Failure mode: Output shape mismatch between consecutive layers causes panic or silent incorrect computation if shapes bypass checks
Solver (SGD/Momentum) (Collenchyma BLAS (axpy, scal)) — Updates network weights via gradient descent using learning rate and optional momentum
- Failure mode: Wrong learning rate or momentum coefficient causes divergence, oscillation, or stagnation
Collenchyma backend (BLAS (OpenBLAS, Intel MKL), CUDA, OpenCL) — Abstracts CPU (BLAS) and GPU (CUDA/OpenCL) tensor operations; manages device memory allocation and transfer
- Failure mode: Out-of-memory, device allocation failure,

🛠️How to make changes

Add a new activation layer

Create new file in src/layers/activation/ (e.g., elu.rs) with struct implementing the Layer trait (src/layers/activation/elu.rs)
Implement Layer::forward() and Layer::backward() using collenchyma-nn operations (src/layers/activation/elu.rs)
Export the new layer in src/layers/activation/mod.rs via pub mod elu; pub use self::elu::*; (src/layers/activation/mod.rs)
Add unit tests to tests/layer_specs.rs to verify forward/backward pass gradients (tests/layer_specs.rs)

Add a new optimizer (solver)

Create solver struct in src/solvers/sgd/adagrad.rs (or new file) implementing required update rules (src/solvers/sgd/adagrad.rs)
Integrate with training loop by updating solver configuration in user code (no framework changes needed if using existing trait) (examples/benchmarks.rs)
Add benchmark in benches/network_benches.rs to measure convergence and wall-clock time vs. SGD (benches/network_benches.rs)

Build and train a multi-layer neural network

Create sequential container by importing Sequential from src/layers/container/sequential.rs (examples/benchmarks.rs)
Stack layers (Linear, Activation, etc.) into the Sequential container (examples/benchmarks.rs)
Instantiate an SGD or Momentum solver from src/solvers/sgd/ and call forward/backward passes on batched data (examples/benchmarks.rs)
Optionally serialize trained weights using Cap'n Proto schema in capnp/leaf.capnp (capnp/leaf.capnp)

🔧Why these technologies

Rust — Memory safety without garbage collection enables fast numerical compute; Rust's type system prevents common deep-learning bugs (NaN propagation, shape mismatches)
Collenchyma + Collenchyma-BLAS/NN — Hardware-agnostic abstraction layer providing unified CPU (native BLAS) and GPU (CUDA/OpenCL via collenchyma) execution without framework rewrite
Cap'n Proto (capnp) — Efficient binary serialization for trained model weights with minimal deserialization overhead; supports zero-copy reading
Trait-based architecture (Layer trait) — Composable abstractions allowing users to implement custom layers without modifying core framework

⚖️Trade-offs already made

Monolithic Sequential container over dynamic computation graph (DAG)
- Why: Simpler API for linear topologies; easier to reason about memory and execution order
- Consequence: Cannot express residual connections, multi-input layers, or arbitrary DAGs without wrapper structs; less flexible than TensorFlow/PyTorch
Eager execution (immediate forward/backward) vs. lazy graph building
- Why: Reduces abstraction overhead and enables straightforward debugging
- Consequence: Higher per-operation dispatch cost; less opportunity for graph-level optimizations (fusion, reordering)
Immutable Layer trait with owned tensor passing
- Why: Prevents aliasing bugs and simplifies gradient computation reasoning
- Consequence: More allocations; higher memory churn than in-place operations

🚫Non-goals (don't propose these)

Distributed training across multiple machines (only single-device or multi-GPU on one host via solver configuration)
Dynamic shape inference or symbolic differentiation (shapes must be known at layer instantiation)
High-level APIs for common tasks like image classification pipelines (no pre-built ResNet, VGG; users compose manually)
Production serving / model deployment tooling (focus is research/experimentation)

🪤Traps & gotchas

Collenchyma version pinning: Using collenchyma 0.0.8 (pre-1.0, unstable API); breaking changes in upstream may require immediate crate updates. 2. No visible src/ directory in file list: Standard Cargo project structure means source is in src/ at root, but the listing may be incomplete—verify locally. 3. Cap'n Proto schema regeneration: If you modify capnp/leaf.capnp, build.rs must run successfully; missing capnpc or build-time failures can leave stale generated code. 4. Feature flag interaction: native/cuda/opencl are mutually exclusive in common use; building with multiple enabled may cause linking conflicts. 5. GPU backend requirements: CUDA/OpenCL builds require system libraries (CUDA Toolkit, GPU drivers, OpenCL headers) not installed by Cargo—external setup required.

🏗️Architecture

💡Concepts to learn

Backend abstraction layer (Collenchyma) — Leaf's core innovation: Collenchyma abstracts GPU/CPU/FPGA compute so you write once and run on any hardware (CUDA, OpenCL, native)—critical for understanding Leaf's portability promise.
Cap'n Proto zero-copy serialization — capnp/leaf.capnp schema enables fast, zero-copy network transfer of models and tensors for distributed training—essential for multi-device optimization (doc/src/distributed-optimization.md).
Modular layer system with lifecycle management — Leaf's architecture emphasizes independent layer modules with well-defined lifespans (doc/src/layer-lifecycle.md); understanding layer trait design is crucial for extending the framework.
Feature-gated conditional compilation (Rust) — Leaf uses Cargo feature flags (native/cuda/opencl) to compile different backend code paths; you must understand Rust's #[cfg] and feature mechanics to enable GPU support.
Computational graph / dataflow model — Though not explicitly named in the file list, Leaf's 'building-networks.md' doc suggests a dataflow graph execution model common to all modern ML frameworks—networks define computation DAGs.
Solver / optimization algorithm abstraction — doc/src/solvers.md indicates Leaf abstracts gradient descent variants (SGD, Adam, etc.) as pluggable solvers—understanding this separation enables custom optimizers.
Type-safe tensor and shape handling (Rust type system) — Leaf leverages Rust's compile-time type checking to prevent shape/dtype mismatches at build time rather than runtime—critical advantage over dynamically-typed frameworks.

autumnai/collenchyma — Direct upstream dependency providing the GPU/CPU compute abstraction layer that Leaf builds on—understanding Collenchyma's backend system is essential.
autumnai/cuticula — Companion project mentioned in README for automated preprocessing pipelines; integrates with Leaf for end-to-end ML workflows.
tensorflow/tensorflow — Primary inspiration and alternative framework; TensorFlow's design influenced Leaf's API simplicity and modularity goals.
pytorch/pytorch — Competing dynamic ML framework in Rust ecosystem; PyTorch's eager execution and layer abstractions informed Leaf's architecture.
autumnai/leaf-examples — Official examples repository referenced in README containing runnable ML models and CLI tooling demonstrating Leaf usage patterns.

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add comprehensive unit tests for layer lifecycle and layer.rs module

The repo has extensive documentation in doc/src/layer-lifecycle.md and doc/src/layers.md describing layer abstractions, but there are no visible test files in src/ for the core layer.rs module. Given that layers are fundamental to the framework, adding unit tests would improve reliability and serve as executable documentation. This is critical for a ML framework where correctness is paramount.

[ ] Create src/layer_tests.rs or add #[cfg(test)] mod tests to src/layer.rs
[ ] Write tests covering layer initialization, forward/backward passes, and parameter updates as described in doc/src/layer-lifecycle.md
[ ] Add integration tests in tests/ directory that validate layer composition in networks (referenced in doc/src/building-networks.md)
[ ] Ensure tests cover both native and GPU backends using feature flags

Add feature-gated backend tests to validate CUDA/OpenCL compilation and runtime

The Cargo.toml defines cuda and opencl features, but there's no visible CI configuration validating these backends actually compile and run correctly. The .travis.yml exists but specifics aren't shown. Adding backend-specific tests in examples/benchmarks.rs or a new tests/backend_tests.rs would catch regressions and help contributors test GPU features locally.

[ ] Create tests/backend_tests.rs with #[cfg(feature = "cuda")] and #[cfg(feature = "opencl")] gated tests
[ ] Add smoke tests that allocate tensors and run basic operations on each backend
[ ] Update .travis.yml to run tests with --features cuda and --features opencl (if hardware available, or skip gracefully)
[ ] Document in CONTRIBUTING.md how to run backend-specific tests locally

Create comprehensive capnp schema documentation and serialization round-trip tests

The repo uses Cap'n Proto (capnp/leaf.capnp) for serialization but there's no visible documentation or tests validating the schema. The src/capnp_util.rs exists but without tests. Adding schema documentation and round-trip serialization tests would ensure data integrity for distributed optimization (mentioned in doc/src/distributed-optimization.md) and model persistence.

[ ] Add comments and documentation to capnp/leaf.capnp explaining each message type and field purpose
[ ] Create tests/capnp_serialization_tests.rs with round-trip tests (serialize→deserialize→verify) for all major types (Layers, Networks, Solvers)
[ ] Add version compatibility tests to capnp_util.rs to catch breaking schema changes
[ ] Document serialization format in doc/src or as doc comments in capnp_util.rs for users building custom serialization

🌿Good first issues

Add integration tests for layer-lifecycle.md workflow: Create a test in tests/ that instantiates a network as documented, trains it, and serializes it via capnp/leaf.capnp. Currently no tests/ directory visible.
Expand FEATURE-FLAGS.md with build troubleshooting section: Document common CUDA/OpenCL linking errors and their fixes (e.g., CUDA_PATH, LD_LIBRARY_PATH). Likely many users struggle with GPU setup.
Convert doc/book/ prebuilt HTML to generated output: doc/book/ is checked in as static HTML, but doc/src/ has mdBook markdown. Set up CI to auto-generate and verify consistency between source and output, or clarify the intended documentation flow.

⭐Top contributors

Click to expand

@homu — 34 commits
@hobofan — 33 commits
@MichaelHirn — 20 commits
@alexandermorozov — 5 commits
@radarhere — 1 commits

📝Recent commits