microsoft/onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Healthy across the board
Permissive license, no critical CVEs, actively maintained — safe to depend on.
Has a license, tests, and CI — clean foundation to fork and modify.
Documented and popular — useful reference codebase to read through.
No critical CVEs, sane security posture — runnable as-is.
- ✓Last commit today
- ✓38+ active contributors
- ✓Distributed ownership (top contributor 14% of recent commits)
- ✓MIT licensed
- ✓CI configured
- ✓Tests present
Computed from maintenance signals — commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Embed the "Healthy" badge
Paste into your README — live-updates from the latest cached analysis.
[](https://repopilot.app/r/microsoft/onnxruntime)Paste at the top of your README.md — renders inline like a shields.io badge.
▸Preview social card
This card auto-renders when someone shares https://repopilot.app/r/microsoft/onnxruntime on X, Slack, or LinkedIn.
Ask AI about microsoft/onnxruntime
Grounded in the actual source code. Pick a starter question or write your own.
Onboarding doc
Onboarding: microsoft/onnxruntime
Generated by RepoPilot · 2026-06-24 · Source
🎯Verdict
GO — Healthy across the board
- Last commit today
- 38+ active contributors
- Distributed ownership (top contributor 14% of recent commits)
- MIT licensed
- CI configured
- Tests present
<sub>Computed from maintenance signals — commit recency, contributor breadth, bus factor, license, CI, tests</sub>
⚡TL;DR
ONNX Runtime is a cross-platform inference and training accelerator that executes machine learning models in ONNX format across CPUs, GPUs (CUDA, ROCm), and specialized hardware. It optimizes model execution through graph transformations, operator fusion, and hardware-specific kernels—enabling models from PyTorch, TensorFlow, scikit-learn, and XGBoost to run faster and cheaper in production. Monorepo organized by execution provider: onnxruntime/core/ contains the graph optimizer and interpreter, onnxruntime/providers/ has hardware-specific backends (cuda/, cpu/, rocm/, webgpu/), onnxruntime/python/ wraps Python bindings, and cmake/ orchestrates multi-platform builds. Test suites in test/ mirror provider structure; examples in samples/ and partner repos.
👥Who it's for
ML engineers and data scientists deploying models to production who need to run ONNX models efficiently across heterogeneous hardware (x86, ARM, NVIDIA GPUs, WebGPU browsers); framework library maintainers (PyTorch, TensorFlow) integrating ONNX export; and systems engineers optimizing inference latency and cost at scale.
🌱Maturity & risk
Production-ready and actively maintained. The repo shows 254M lines of C++, comprehensive CI/CD in .github/actions/, and established release cadence at releases. The .agents/skills/ directory and pipeline automation configs (.config/1espt/PipelineAutobaseliningConfig.yml) indicate mature enterprise governance. This is a flagship Microsoft OSS project with consistent investment.
Standard open source risks apply.
Active areas of work
Active development on: WebGPU inference (.github/actions/webgpu-validate-shader-key/), CUDA attention kernels (.agents/skills/cuda-attention-kernel-patterns/), model training optimization for transformers, and multi-node GPU training. Recent focus on build system modernization and linting (.agents/skills/ort-lint/, .agents/skills/ort-build/).
🚀Get running
git clone https://github.com/microsoft/onnxruntime.git
cd onnxruntime
python -m pip install -e .
# Or for development with CUDA:
./build.sh --config Release --build_shared_lib --use_cuda --cuda_home /usr/local/cuda
Refer to BUILD.md (inferred from structure) for platform-specific flags.
Daily commands:
CPU inference: python -c "import onnxruntime as ort; sess = ort.InferenceSession('model.onnx'); output = sess.run(None, input_dict)". GPU inference: set providers=['CUDAExecutionProvider']. For training: use onnxruntime.training module with PyTorch models. Development server: ./build.sh outputs binaries; run tests via pytest test/python/ (inferred).
🗺️Map of the codebase
.github/workflows/linux_ci.yml— Primary Linux CI/CD pipeline that validates core functionality across platforms; understand this to debug build failures.github/workflows/windows_x64_release_build_x64_release.yml— Windows release build pipeline defining optimization flags and packaging; critical for platform-specific compilation issues.lintrunner.toml— Code quality enforcement configuration for pre-commit checks; all contributors must comply with these linting standards.clang-format— C++ code style enforcement; all native code contributions must follow this formatting standard.github/pull_request_template.md— PR submission checklist defining testing and documentation requirements for all contributionsCODEOWNERS— Ownership matrix for code review routing; identifies who approves changes to each subsystem.agents/skills/ort-build/SKILL.md— Documents the build system structure and common build issues; essential for compilation and integration work
🛠️How to make changes
Add Support for a New Hardware Backend (e.g., new GPU accelerator)
- Create new CI workflow in .github/workflows/ following naming pattern {platform}_{backend}.yml (e.g., linux_rocm_ci.yml) (
.github/workflows/linux_tensorrt_ci.yml (use as template)) - Define build matrix with target platform, backend SDK version, and validation test suite (
new file: .github/workflows/{platform}_{backend}_ci.yml) - Add skill documentation explaining kernel patterns and optimization guidelines for this backend (
.agents/skills/{backend}-kernel-patterns/SKILL.md) - Update CODEOWNERS to assign code review approval for backend-specific paths (
CODEOWNERS) - Integrate linting rules for backend-specific code in .lintrunner.toml if language-specific checks needed (
.lintrunner.toml)
Enforce a New Code Quality Rule (linting, formatting, type checking)
- Add tool configuration to centralized linting config (e.g., new Ruff rule, mypy strict mode) (
.lintrunner.toml) - For C++ rules, update clang-format or clang-tidy configuration (
.clang-tidy or .clang-format) - Create or update GitHub Actions workflow to run linter in CI on every PR (
.github/workflows/lint.yml) - Document the new rule and rationale in agent skill documentation (
.agents/skills/ort-lint/SKILL.md) - Add rule to pre-commit hook if it should block local commits (
.githooks/pre-commit)
Add a New Platform Variant (mobile, web, edge device)
- Create platform-specific workflow file with initialization and validation steps (
.github/workflows/{platform}.yml (e.g., .github/workflows/ios.yml as template)) - Define issue template for platform-specific bug reports (
.github/ISSUE_TEMPLATE/{platform}-issue.yml) - Add custom GitHub Action for platform setup if needed (NDK, SDK, emulator) (
.github/actions/{platform}-setup/action.yml) - Document build and test procedures specific to this platform (
.agents/skills/ort-build/SKILL.md (add platform section)) - Update CODEOWNERS with platform maintainers (
CODEOWNERS)
Create a New Automated Policy (issue triage, labeling, stale cleanup)
- Define policy workflow using GitHub Actions or scheduled job (
new file: .github/policies/policy-name.yml) - Configure event triggers (schedule, pull_request, issues) and action steps (
.github/policies/policy-name.yml) - If using labels, reference existing label definitions in labeler workflow (
.github/labeler.yml or .github/title-only-labeler.yml) - Document the policy purpose and maintenance in root-level governance documentation (
AGENTS.md or README.md)
🔧Why these technologies
- GitHub Actions (15+ workflow files) — Native integration with repository; enables matrix builds across OS/architecture/backend combinations without external infrastructure
- Docker & devcontainer (Dockerfile, devcontainer.json) — Standardizes build environment for contributors and CI runners; ensures reproducible builds across local/cloud scenarios
- YAML-based workflow definitions — Human-readable, version-controlled CI/CD configuration; enables community auditing and contribution guidelines enforcement
🪤Traps & gotchas
CUDA dependency version: cuDNN and cuBLAS versions must match TensorRT if enabled; mismatches cause silent type casting errors. Graph serialization: ONNX model opset version must be supported by the runtime version—no clear runtime error for unsupported ops until execution. Multi-provider execution: if multiple providers are enabled (e.g., CUDA + CPU), the fallback logic may mask slow-path behavior—enable verbose logging with loglevel=1 in SessionOptions. Python GIL: inference calls release the GIL but some post-processing doesn't, causing thread contention in multi-threaded serving. WebGPU shaders: .github/actions/webgpu-validate-shader-key/ expects specific browser/GPU combos; shader compilation errors are deferred to runtime. Pre-commit hooks: .githooks/pre-commit enforces clang-format; failure will block commits unless configured in git config.
🏗️Architecture
💡Concepts to learn
- Graph Optimization / Operator Fusion — ONNX Runtime's core performance advantage comes from fusing multiple ONNX operators into single optimized kernels (e.g., BatchNorm + ReLU + Conv → single GPU kernel); understanding the transformer pipeline in
onnxruntime/core/optimizer/is essential to diagnosing slow models. - Execution Providers — ONNX Runtime abstracts hardware via pluggable providers (CPU, CUDA, ROCm, CoreML, WebGPU); knowing which provider your model will use and its kernel coverage is critical for deployment—provider selection determines performance and compatibility.
- ONNX Opset Versioning — ONNX models declare an opset version (e.g., opset 16, 17); ONNX Runtime supports a specific opset range per version. Model exported with opset 18 may fail silently on Runtime compiled for opset ≤17, making version alignment a hidden deployment trap.
- Memory Arena / Session State — ONNX Runtime pre-allocates memory arenas per session to avoid fragmentation; understanding SessionOptions.graph_optimization_level and arena allocation is crucial for production serving where memory footprint or latency spikes matter.
- CUDA Kernel Fusion & JIT Compilation — The CUDA provider uses custom kernels (in
onnxruntime/providers/cuda/) and sometimes runtime JIT compilation for dynamic shapes; unoptimized kernels or shape mismatches can cause 10x slowdowns—examining kernel launch configs in NVPROF is often necessary. - WebGPU Shader Validation — WebGPU support (
.github/actions/webgpu-validate-shader-key/) is emerging; understanding WGSL (WebGPU Shading Language) and browser GPU compatibility matrices is essential for inference in browser environments—shader compilation failures are deferred to runtime. - Distributed Training with NVIDIA Collective Communications (NCCL) — ONNX Runtime's training module uses NCCL for multi-GPU synchronization in transformer training; understanding gradient synchronization patterns and AllReduce operations is critical for scaling training across nodes without bottlenecks.
🔗Related repos
onnx/onnx— The ONNX format specification and serialization library that ONNX Runtime consumes; fundamental dependency for model parsing and validationmicrosoft/onnxruntime-inference-examples— Official companion repo with end-to-end inference examples (mobile, web, cloud) showing how to use ONNX Runtime in production scenariosmicrosoft/onnxruntime-training-examples— Partner repo demonstrating distributed training on multi-GPU setups using ONNX Runtime's training APIs with PyTorch modelspytorch/pytorch— Primary upstream consumer—PyTorch's ONNX export targets ONNX Runtime for optimized inference; tight integration throughtorch.onnx.export()tensorflow/tensorflow— Second major upstream consumer—TensorFlow/Keras models export to ONNX format which ONNX Runtime executes; ecosystem interop point
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add comprehensive CI workflow for WebGPU shader validation and cross-platform testing
The repo has .github/workflows/linux_webgpu.yml but lacks dedicated shader key validation across macOS and Windows platforms. The WebGPU shader validation action exists (.github/actions/webgpu-validate-shader-key/) but isn't fully integrated into cross-platform CI. This would catch shader compatibility issues earlier and reduce runtime failures on different OS/GPU combinations.
- [ ] Create
.github/workflows/windows_webgpu.ymlmirroring linux_webgpu.yml structure with Windows-specific setup - [ ] Create
.github/workflows/macos_webgpu.ymlusing.github/actions/macos-ci-setupfor macOS GPU testing - [ ] Update
.github/actions/webgpu-validate-shader-key/action.ymlto support cross-platform shader compilation validation - [ ] Reference
.github/actions/webgpu-validate-shader-key/validate-shader-key.jsto ensure it handles platform-specific paths and shader compilation flags
Create missing integration tests for ONNX Runtime Build Skills in CI pipeline
The .agents/skills/ directory contains build and lint skill definitions (.agents/skills/ort-build/SKILL.md, ort-lint/SKILL.md, ort-test/SKILL.md) but there's no CI workflow validating that these agent skills execute correctly. This causes latent issues in agent-based build orchestration to go undetected until manual testing.
- [ ] Create
.github/workflows/agent-skills-validation.ymlthat invokes each skill from.agents/skills/*/SKILL.md - [ ] Add test matrix covering the CUDA attention kernel patterns skill (
.agents/skills/cuda-attention-kernel-patterns/) with actual CUDA compilation - [ ] Validate security-relevant skill execution:
.agents/skills/python-kwargs-setattr-security/SKILL.mdagainst test Python code - [ ] Reference existing Linux CUDA CI (
.github/workflows/linux_cuda_ci.yml) for GPU resource allocation patterns
Add linting and validation CI for build configuration files across platforms
The repo contains multiple configuration files (.github/workflows/, .config/1espt/PipelineAutobaseliningConfig.yml, .clang-format, .clang-tidy) but lacks validation that these configs are syntactically correct and internally consistent across platform-specific builds. This prevents config drift between Linux, Windows, and macOS builds that use different toolchains.
- [ ] Create
.github/workflows/config-validation.ymlthat runs before primary CI workflows - [ ] Add YAML schema validation for
.config/1espt/PipelineAutobaseliningConfig.ymland.config/tsaoptions.jsonusing dedicated linting tools - [ ] Validate
.clang-formatand.clang-tidyconsistency by testing against sample source files across different C++ standards - [ ] Add cross-platform path resolution checks (reference
.github/actions/locate-vcvarsall-and-setup-env/locate_vcvarsall.batpatterns) to ensure build scripts work on Windows CI agents
🌿Good first issues
- Add missing operator documentation to the operator registry: scan
onnxruntime/core/providers/cpu/for kernels without docstrings in their.ccfiles and add structured comments mapping ONNX spec to implementation details. - Extend ONNX Runtime Python type hints:
onnxruntime/python/binding files lack complete PEP 484 type stubs; generate.pyifiles forInferenceSession,SessionOptions, andIOBindingclasses to enable IDE autocomplete. - Build system modernization: the CMakeLists.txt uses legacy
find_package()patterns; refactor CUDA/cuDNN/TensorRT discovery to use modernfind_package()config modules, improving Windows/macOS support where these are not on system PATH.
⭐Top contributors
Click to expand
Top contributors
- @Copilot — 14 commits
- @tianleiwu — 11 commits
- @edgchen1 — 8 commits
- @sanaa-hamel-microsoft — 7 commits
- @Rishi-Dave — 6 commits
📝Recent commits
Click to expand
Recent commits
b8ba448— [C#] Add EP tests for Cuda Plugin (#28375) (yuslepukhin)245118a— [CUDA Plugin EP] Add NuGet packaging pipeline (#28378) (tianleiwu)058ce95— Fix CPU Attention softcap/attn_mask ordering (onnx#7867, #7913) + consolidate CUDA spec coverage tests from #28371 (#283 (titaiwangms)505e0c3— Suppress test warnings in transformers tests and fix CUDA CI (#28391) (tianleiwu)ec55d3c— Fix Subgraph_t issues with TRT RTX ver 1.5.x (#28361) (umangb-09)0a341b0— [WebGPU plugin EP packaging] Remove explicit ORT package dependency (#28384) (edgchen1)bf76a0b— feat(quantization): add calibration cache to quantize_static (#28221) (Rishi-Dave)e3c34da— Refactor and modernize StringNormalizer. (#28320) (yuslepukhin)19738c5— [Plugin EP] Add OrtEp::OnSessionInitializationEnd() (#28319) (adrianlizarraga)470977a— [CoreML EP] Support pre-opset-13 Split via 'split' attribute (#28270) (maxwbuckley)
🔒Security observations
ONNX Runtime demonstrates reasonable security practices with established vulnerability reporting procedures and some security tooling integration (Guardian, CodeQL, clang-tidy). However, several areas require attention: Python deserialization patterns (setattr with kwargs), CUDA kernel memory safety, custom pre-commit hook verification, and Docker/container hardening. The codebase appears to handle C++, Python, and CUDA code with varying complexity levels, each requiring different security considerations. No obvious hardcoded secrets detected in file structure, but deeper code analysis is needed. Recommend enhancing SCA, SAST coverage, and implementing security-focused code review processes for complex components (kernels, Python reflection/introspection).
- Medium · Potential Insecure Pre-Commit Hooks —
.githooks/pre-commit. Pre-commit hooks are present in .githooks/pre-commit, which could be modified to execute arbitrary code. If repository is cloned without proper verification or if hooks are automatically installed, this could be exploited. Fix: Ensure pre-commit hooks are properly signed/verified, document hook installation requirements, and require explicit user consent before installing hooks. Consider using a tool like husky with verification. - Medium · Insecure Deserialization Risk in Python Code —
.agents/skills/python-kwargs-setattr-security/SKILL.md. The skill file '.agents/skills/python-kwargs-setattr-security/SKILL.md' suggests there may be security patterns related to unsafe setattr() usage with kwargs in the codebase. Unsafe use of setattr() with user-controlled input can lead to arbitrary attribute modification. Fix: Audit all uses of setattr() with user-controlled input. Implement strict whitelist validation for attribute names. Consider using dataclass or pydantic models with controlled field assignment instead of dynamic setattr(). - Medium · Potential CUDA Kernel Vulnerabilities —
.agents/skills/cuda-attention-kernel-patterns/SKILL.md. The presence of '.agents/skills/cuda-attention-kernel-patterns/SKILL.md' indicates custom CUDA kernel code. CUDA kernels can have memory safety issues, buffer overflows, or race conditions that could lead to information disclosure or DoS. Fix: Implement rigorous code review process for CUDA kernels. Use NVIDIA's cuda-memcheck and sanitizers. Ensure bounds checking on all memory accesses. Test with various input sizes and data types. - Low · Missing CODEOWNERS File —
.github/. No CODEOWNERS file detected in .github/ directory. This could make it harder to enforce security reviews on sensitive code changes. Fix: Create a .github/CODEOWNERS file to ensure security-sensitive code (crypto, authentication, kernel code) requires review from designated security reviewers. - Low · No Evidence of Security Configuration Baseline —
.config/guardian/.gdnbaselines. While .config/guardian/.gdnbaselines exists for Guardian scanning, comprehensive security baselines and checks may not be fully enforced in the CI/CD pipeline. Fix: Ensure SAST tools are integrated into all CI/CD workflows. Add security scanning for: SCA (Software Composition Analysis), secret scanning, SAST, and container scanning in addition to existing linting. - Low · Docker Image Build Security —
.devcontainer/Dockerfile. Dockerfile present in .devcontainer/ but without evidence of multi-stage builds, minimal base images, or security hardening patterns from file structure analysis. Fix: Use minimal base images (alpine/distroless). Implement multi-stage builds. Run containers as non-root user. Scan Docker images with Trivy or similar tools. Use specific base image versions (not 'latest').
LLM-derived; treat as a starting point, not a security audit.
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in Verify before trusting
below. If any check returns
FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding. - Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/microsoft/onnxruntime shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live microsoft/onnxruntime
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/microsoft/onnxruntime.
What it runs against: a local clone of microsoft/onnxruntime — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in microsoft/onnxruntime | Confirms the artifact applies here, not a fork |
| 2 | License is still MIT | Catches relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 30 days ago | Catches sudden abandonment since generation |
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of microsoft/onnxruntime. If you don't
# have one yet, run these first:
#
# git clone https://github.com/microsoft/onnxruntime.git
# cd onnxruntime
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }
# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
echo "FAIL: not inside a git repository. cd into your clone of microsoft/onnxruntime and re-run."
exit 2
fi
# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "microsoft/onnxruntime(\\.git)?\\b" \\
&& ok "origin remote is microsoft/onnxruntime" \\
|| miss "origin remote is not microsoft/onnxruntime (artifact may be from a fork)"
# 2. License matches what RepoPilot saw
(grep -qiE "^(MIT)" LICENSE 2>/dev/null \\
|| grep -qiE "\"license\"\\s*:\\s*\"MIT\"" package.json 2>/dev/null) \\
&& ok "license is MIT" \\
|| miss "license drift — was MIT at generation time"
# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \\
&& ok "default branch main exists" \\
|| miss "default branch main no longer exists"
# 4. Critical files exist
test -f ".github/workflows/linux_ci.yml" \\
&& ok ".github/workflows/linux_ci.yml" \\
|| miss "missing critical file: .github/workflows/linux_ci.yml"
test -f ".github/workflows/windows_x64_release_build_x64_release.yml" \\
&& ok ".github/workflows/windows_x64_release_build_x64_release.yml" \\
|| miss "missing critical file: .github/workflows/windows_x64_release_build_x64_release.yml"
test -f ".lintrunner.toml" \\
&& ok ".lintrunner.toml" \\
|| miss "missing critical file: .lintrunner.toml"
test -f ".clang-format" \\
&& ok ".clang-format" \\
|| miss "missing critical file: .clang-format"
test -f ".github/pull_request_template.md" \\
&& ok ".github/pull_request_template.md" \\
|| miss "missing critical file: .github/pull_request_template.md"
# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 30 ]; then
ok "last commit was $days_since_last days ago (artifact saw ~0d)"
else
miss "last commit was $days_since_last days ago — artifact may be stale"
fi
echo
if [ "$fail" -eq 0 ]; then
echo "artifact verified (0 failures) — safe to trust"
else
echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/microsoft/onnxruntime"
exit 1
fi
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.
Embed this chat in your README →
Drop this iframe anywhere — the widget runs against the same live analysis cache as the main app.
<iframe src="https://repopilot.app/embed/microsoft/onnxruntime" width="100%" height="500" style="border:1px solid #d0d7de; border-radius:8px;" allow="microphone" loading="lazy" ></iframe>