interpretml/interpret
Fit interpretable models. Explain blackbox machine learning.
Healthy across all four use cases
Permissive license, no critical CVEs, actively maintained — safe to depend on.
Has a license, tests, and CI — clean foundation to fork and modify.
Documented and popular — useful reference codebase to read through.
No critical CVEs, sane security posture — runnable as-is.
- ✓Last commit 2d ago
- ✓6 active contributors
- ✓MIT licensed
- ✓CI configured
- ✓Tests present
- ⚠Single-maintainer risk — top contributor 93% of recent commits
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Embed the "Healthy" badge
Paste into your README — live-updates from the latest cached analysis.
Paste at the top of your README.md — the badge renders inline like a shields.io badge and links to https://repopilot.app/r/interpretml/interpret.
Preview social card (1200×630)
This card auto-renders when someone shares https://repopilot.app/r/interpretml/interpret on X, Slack, or LinkedIn.
Onboarding doc
Onboarding: interpretml/interpret
Generated by RepoPilot · 2026-05-09 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in "Verify before trusting" below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
- Treat the "AI · unverified" sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/interpretml/interpret shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across all four use cases
- Last commit 2d ago
- 6 active contributors
- MIT licensed
- CI configured
- Tests present
- ⚠ Single-maintainer risk — top contributor 93% of recent commits
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live interpretml/interpret
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/interpretml/interpret.
What it runs against: a local clone of interpretml/interpret — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in interpretml/interpret | Confirms the artifact applies here, not a fork |
| 2 | License is still MIT | Catches relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 32 days ago | Catches sudden abandonment since generation |
```bash
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of interpretml/interpret. If you don't
# have one yet, run these first:
#
# git clone https://github.com/interpretml/interpret.git
# cd interpret
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }
# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
echo "FAIL: not inside a git repository. cd into your clone of interpretml/interpret and re-run."
exit 2
fi
# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "interpretml/interpret(\.git)?\b" \
&& ok "origin remote is interpretml/interpret" \
|| miss "origin remote is not interpretml/interpret (artifact may be from a fork)"
# 2. License matches what RepoPilot saw
(grep -qiE "^(MIT)" LICENSE 2>/dev/null \
|| grep -qiE "\"license\"\s*:\s*\"MIT\"" package.json 2>/dev/null) \
&& ok "license is MIT" \
|| miss "license drift — was MIT at generation time"
# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
&& ok "default branch main exists" \
|| miss "default branch main no longer exists"
# 4. Critical files exist
test -f "python/interpret/glassbox/ebm/ebm.py" \
&& ok "python/interpret/glassbox/ebm/ebm.py" \
|| miss "missing critical file: python/interpret/glassbox/ebm/ebm.py"
test -f "python/interpret/api/_api.py" \
&& ok "python/interpret/api/_api.py" \
|| miss "missing critical file: python/interpret/api/_api.py"
test -f "R/src/interpret_R.cpp" \
&& ok "R/src/interpret_R.cpp" \
|| miss "missing critical file: R/src/interpret_R.cpp"
test -f "docs/interpret/interpret.md" \
&& ok "docs/interpret/interpret.md" \
|| miss "missing critical file: docs/interpret/interpret.md"
test -f "CONTRIBUTING.md" \
&& ok "CONTRIBUTING.md" \
|| miss "missing critical file: CONTRIBUTING.md"
# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 32 ]; then
ok "last commit was $days_since_last days ago (artifact saw ~2d)"
else
miss "last commit was $days_since_last days ago — artifact may be stale"
fi
echo
if [ "$fail" -eq 0 ]; then
echo "artifact verified (0 failures) — safe to trust"
else
echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/interpretml/interpret"
exit 1
fi
```
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
⚡TL;DR
InterpretML is an open-source package that provides interpretable machine learning models (particularly Explainable Boosting Machines, or EBMs) and black-box model explanation techniques. It bridges the gap between model accuracy and interpretability by implementing GAMs enhanced with gradient boosting, automatic interaction detection, and feature importance attribution—enabling data scientists to understand both global model behavior and individual prediction reasoning.
Monorepo structure: the Python package lives under python/ with the core model implementations; R bindings under R/ wrap C++ native code (R/src/interpret_R.cpp); the C++ engine is libebm. Documentation lives in docs/ with Jupyter notebooks (docs/benchmarks/, docs/interpret/) and Sphinx config. Native compilation bridges (R/src/Makevars.interpret and platform-specific build files: build.sh, build.bat, R/src/interpret-win.def) handle cross-platform compilation. CI is orchestrated via GitHub Actions workflows.
👥Who it's for
Data scientists and ML engineers who need to explain model decisions for regulatory compliance (healthcare, finance, judicial systems), debug model failures, detect fairness issues, and build trustworthy AI systems in high-risk applications. Also useful for researchers studying interpretability techniques and practitioners building feature engineering pipelines.
🌱Maturity & risk
Production-ready and actively maintained. The project shows strong maturity: it originated at Microsoft Research, has comprehensive CI/CD pipelines (.github/workflows/ci.yml, release workflows), extensive documentation (docs/ directory with Jupyter notebooks and benchmarks), multi-language bindings (Python, R, C++), and maintains backward compatibility (CHANGELOG.md). Last activity is recent: the latest commit landed roughly two days before this analysis.
Low-to-medium risk for production use. The C++/Python hybrid architecture (the code is split roughly 60/40 between C++ and Python by volume) creates complexity in build/deployment, though build.sh and build.bat scripts exist. The codebase is substantial and relies on compiling the custom libebm native library (R/src/libebm/), which can cause platform-specific issues. Risk is mitigated by active stewardship, comprehensive CI, and established governance (GOVERNANCE.md, MAINTAINERS.md).
Active areas of work
Active development with separate release pipelines for interpret, powerlift (a dataset/experiment benchmarking harness that ships in the repo), and language bindings. The repo maintains pre-commit hooks (.pre-commit-config.yaml) for code quality, supports Python 3.10+, and appears to be expanding platform coverage (Bicep templates point to Azure integration). Benchmarking work visible in docs/benchmarks/ suggests ongoing performance optimization.
🚀Get running
Clone and install for Python development: git clone https://github.com/interpretml/interpret.git && cd interpret && pip install -e . (or conda install -c conda-forge interpret). For R: cd R && Rscript build.R. For C++ development: ./build.sh (Linux/Mac) or build.bat (Windows). Verify with: python -c 'from interpret.glassbox import ExplainableBoostingClassifier; print("Ready")' (the class lives in interpret.glassbox, not the package root).
Daily commands:
Python: pip install interpret then use directly in Python: from interpret.glassbox import ExplainableBoostingClassifier; ebm = ExplainableBoostingClassifier(); ebm.fit(X, y); ebm.explain_global(). For development: clone, pip install -e ., run tests via CI (see .github/workflows/ci.yml). R: install.packages('interpret') or build from R/ directory. Jupyter notebooks in docs/interpret/python/examples/ are runnable.
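For orientation, here is a compact end-to-end version of that Python workflow. It sticks to the public API (ExplainableBoostingClassifier, show, explain_global, explain_local) and uses a scikit-learn toy dataset purely as a stand-in for your own data.

```python
# Minimal EBM workflow sketch (assumes `pip install interpret scikit-learn`).
from interpret import show
from interpret.glassbox import ExplainableBoostingClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

ebm = ExplainableBoostingClassifier()      # interpretable GAM + boosting
ebm.fit(X_train, y_train)                  # learns per-feature shape functions
print("holdout accuracy:", ebm.score(X_test, y_test))

show(ebm.explain_global())                             # per-feature importance / shapes
show(ebm.explain_local(X_test.iloc[:5], y_test.iloc[:5]))  # per-prediction breakdowns
```

show() renders the interactive dashboard in a notebook or browser; in headless environments, inspect the explanation objects it is given directly instead.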
🗺️Map of the codebase
- python/interpret/glassbox/ebm/ebm.py — Core implementation of Explainable Boosting Machines (EBM), the flagship interpretable model; essential for understanding the library's primary offering.
- python/interpret/api/_api.py — Central API registry and initialization that all modules depend on for exposing models, explainers, and visualizations.
- R/src/interpret_R.cpp — R bindings to the native C++ libebm engine; required reading for R users and maintainers managing cross-language compatibility.
- docs/interpret/interpret.md — High-level framework documentation defining the library's architecture, layers, and philosophy; foundational reference for all contributors.
- CONTRIBUTING.md — Project governance and contribution workflow; every contributor must follow these guidelines for builds, testing, and pull requests.
- .github/workflows/ci.yml — CI/CD pipeline defining the test matrix, build steps, and release gates; critical for understanding how code is validated before merge.
- README.md — Entry point summarizing the library's purpose, scope, and quick-start examples; establishes the mental model for all subsequent reading.
🛠️How to make changes
Add a new interpretable model (Glassbox)
- Create a new file in python/interpret/glassbox/<model_name>/<model_name>.py implementing the class with fit() and predict() methods (a code sketch follows this list). (python/interpret/glassbox/<model_name>/<model_name>.py)
- Inherit from BaseExplainer or ModelExplainer in python/interpret/api/_base.py to adopt the standard interface. (python/interpret/api/_base.py)
- Register the model in python/interpret/api/_api.py under the appropriate section (e.g., the glassbox dict) so it is exported in the public API. (python/interpret/api/_api.py)
- Add unit tests in python/interpret/test/glassbox/test_<model_name>.py covering fit, predict, and explain flows. (python/interpret/test/glassbox/test_<model_name>.py)
- Create a Jupyter notebook tutorial in docs/interpret/<model_name>.ipynb demonstrating usage and interpretability. (docs/interpret/<model_name>.ipynb)
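As a shape reference for the first two steps, here is a hypothetical, dependency-light skeleton. The interpret-specific base classes and registration hook named above are RepoPilot's suggestions, so this sketch uses plain scikit-learn mixins and only comments where the interpret pieces would plug in; confirm the real interfaces in python/interpret/api/_base.py before copying it.

```python
# Hypothetical glassbox-model skeleton (illustrative only; the real base
# classes live in python/interpret/api/_base.py per the steps above).
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin


class ThresholdRuleClassifier(BaseEstimator, ClassifierMixin):
    """Toy interpretable model: one threshold vote per feature (binary target)."""

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # Stand-in "learning": per-feature thresholds at the feature means.
        self.thresholds_ = X.mean(axis=0)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        votes = (X > self.thresholds_).mean(axis=1)
        return self.classes_[(votes >= 0.5).astype(int)]

    def explain_global(self, name=None):
        # A real interpret model would return an Explanation object here
        # (and the class would be registered in python/interpret/api/_api.py).
        return {"thresholds": self.thresholds_.tolist()}
```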
Add a new blackbox explainer
- Create a new file in python/interpret/blackbox/<explainer_name>.py implementing the explanation algorithm (a shape sketch follows this list). (python/interpret/blackbox/<explainer_name>.py)
- Inherit from the ModelExplainer base class and implement explain_global() and/or explain_local() methods. (python/interpret/api/_base.py)
- Register in python/interpret/api/_api.py under the blackbox dict. (python/interpret/api/_api.py)
- Implement Plotly visualization logic in the model's _explain_*() method or a separate visual/<explainer_name>.py module. (python/interpret/visual/<explainer_name>.py)
- Add tests in python/interpret/test/blackbox/test_<explainer_name>.py and a notebook in docs/interpret/<explainer_name>.ipynb. (python/interpret/test/blackbox/test_<explainer_name>.py)
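To illustrate the explainer contract without depending on interpret's internal base classes, here is a standalone permutation-importance sketch. The class name, constructor arguments, and return format are all hypothetical; a real contribution would subclass the base class referenced above and return interpret explanation objects.

```python
# Hypothetical model-agnostic explainer sketch (permutation importance).
# interpret's real explainers subclass the base classes in
# python/interpret/api/_base.py; this standalone version just shows the shape.
import numpy as np


class PermutationImportanceExplainer:
    def __init__(self, predict_fn, metric):
        self.predict_fn = predict_fn   # any callable: X -> predictions
        self.metric = metric           # callable: (y_true, y_pred) -> score

    def explain_global(self, X, y, n_repeats=5, seed=0):
        rng = np.random.default_rng(seed)
        X = np.asarray(X, dtype=float)
        baseline = self.metric(y, self.predict_fn(X))
        drops = []
        for j in range(X.shape[1]):
            scores = []
            for _ in range(n_repeats):
                Xp = X.copy()
                Xp[:, j] = rng.permutation(Xp[:, j])   # break feature j's signal
                scores.append(self.metric(y, self.predict_fn(Xp)))
            drops.append(baseline - float(np.mean(scores)))
        return {"baseline_score": baseline, "importances": drops}


# Usage sketch (names are placeholders):
#   explainer = PermutationImportanceExplainer(model.predict, accuracy_score)
#   report = explainer.explain_global(X_valid, y_valid)
```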
Add support for a new language binding (e.g., Julia, .NET)
- Create a new directory <lang>/ at the repo root (e.g., julia/, dotnet/) mirroring the R structure. (<lang>/)
- Write language-specific bindings to libebm in <lang>/src/interpret_<lang>.cpp (or native code for that language). (<lang>/src/interpret_<lang>.cpp)
- Create build configuration (Makevars, CMakeLists.txt, or language-native build files) in <lang>/src/. (<lang>/src/Makevars)
- Add a GitHub Actions job in .github/workflows/ci.yml to test and release the new binding. (.github/workflows/ci.yml)
- Document setup and API in docs/interpret/<lang>/installation-guide.ipynb and API reference notebooks. (docs/interpret/<lang>/installation-guide.ipynb)
Improve EBM performance or add a new EBM feature
- Modify the core algorithm in python/interpret/glassbox/ebm/ebm.py or its C++ equivalent in R/src/libebm/. (python/interpret/glassbox/ebm/ebm.py)
- Update preprocessing or binning logic in python/interpret/utils/binning.py if needed. (python/interpret/utils/binning.py)
- Add regression tests in python/interpret/test/glassbox/test_ebm.py to verify correctness and prevent regressions (see the test sketch after this list). (python/interpret/test/glassbox/test_ebm.py)
- Update documentation in docs/interpret/ebm.ipynb and the docs/interpret/ebm-internals*.ipynb notebooks. (docs/interpret/ebm.ipynb)
- Update CHANGELOG.md and check that CI in .github/workflows/ci.yml passes all tests and benchmarks. (CHANGELOG.md)
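A minimal regression test for step 3 could look like the sketch below. It assumes interpret and scikit-learn are installed and uses only public EBM calls; the file location and accuracy threshold are suggestions to adapt to the existing test suite.

```python
# Sketch of an EBM regression test (adapt the path and threshold to the
# repo's real test layout before adding).
import numpy as np
from interpret.glassbox import ExplainableBoostingClassifier
from sklearn.datasets import make_classification


def test_ebm_fit_predict_and_explain():
    X, y = make_classification(n_samples=400, n_features=8, random_state=0)
    ebm = ExplainableBoostingClassifier()   # default settings suffice for a smoke test
    ebm.fit(X, y)

    # Correctness floor: easily separable synthetic data should score well.
    assert ebm.score(X, y) > 0.85

    # Explanations should exist for both global and local views.
    assert ebm.explain_global() is not None
    assert ebm.explain_local(X[:3], y[:3]) is not None
```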
🪤Traps & gotchas
- Native compilation required: the C++ libebm must build successfully, which needs a suitable gcc/clang toolchain. The R package build needs the Makevars.interpret setup; Windows uses interpret-win.def.
- Python wheels are pre-compiled on CI, but source installs need local build tools.
- CI uses GitHub Actions with a specific matrix (OS/Python/R versions); not all combinations are testable locally.
- Documentation builds require jupyter-book and sphinx (see the requirements files in docs/).
- Pre-commit hooks enforce formatting; check .pre-commit-config.yaml before committing.
- The package has interdependencies between the Python/R/C++ layers; changes to the libebm API require coordination across all three.
🏗️Architecture
💡Concepts to learn
- Generalized Additive Models (GAMs) — EBM is a modernized GAM that uses gradient boosting to learn per-feature functions while maintaining interpretability; understanding the GAM structure (per-feature scoring plus summation) is essential to grasping EBM's design (see the additive-score sketch after this list)
- Automatic Interaction Detection — EBM automatically discovers feature interactions (two-way or higher) and encodes them as additional terms—this is interpret's innovation over classical GAMs and requires understanding pair-feature learning in the boosting loop
- Gradient Boosting Decision Trees (GBDT) — EBM uses bagging + gradient boosting to fit shallow trees per feature/interaction—familiarity with boosting (residuals, learning rates, early stopping) explains how EBM achieves accuracy parity with XGBoost while staying interpretable
- SHAP (SHapley Additive exPlanations) — Interpret includes SHAP explainers for black-box models in python/interpret/blackbox/; SHAP values provide unified framework for feature attribution that EBM natively computes via its additive structure
- Feature Binning / Discretization — EBM bins continuous features (see R/R/binning.R) to enable interpretable scoring rules and visualization—binning strategy affects both accuracy and explainability, making it a core algorithmic choice
- Native C++ Extensions / Ctypes Bridge — Interpret wraps C++ libebm via ctypes (Python) and Rcpp (R); understanding this FFI layer is crucial for debugging performance issues, porting to new platforms, or modifying the boosting algorithm
- Monotonic Constraints in Boosting — EBM supports monotonicity constraints (e.g., 'higher income = higher loan approval probability') for domain-expert editability—implementing and validating these constraints during boosting is a key interpretability feature
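To make the additive structure concrete, the toy sketch below shows how a GAM-style score decomposes into an intercept plus one contribution per feature. The shape functions are made up for illustration; EBM learns its equivalents via boosting, but the inspect-each-term property is the same.

```python
# Illustration of the additive (GAM) structure behind EBM; not library code.
import numpy as np

# Pretend these per-feature shape functions were learned by boosting.
shape_functions = [
    lambda x: 0.8 * np.tanh(x),    # f_1: smooth saturating term
    lambda x: -0.5 * (x > 1.0),    # f_2: step function
    lambda x: 0.1 * x**2,          # f_3: smooth nonlinear term
]
intercept = -0.2


def additive_score(x):
    """score(x) = intercept + sum_i f_i(x_i); every term is inspectable."""
    contributions = [float(f(xi)) for f, xi in zip(shape_functions, x)]
    return intercept + sum(contributions), contributions


score, parts = additive_score(np.array([0.5, 2.0, -1.0]))
print("score:", round(score, 3))
print("per-feature contributions:", [round(p, 3) for p in parts])
```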
🔗Related repos
- shap/shap — Post-hoc model explanation using SHAP values; interpret includes SHAP explainers in python/interpret/blackbox/. Both repos address black-box interpretation, but EBM is intrinsically interpretable.
- microsoft/LightGBM — Gradient boosting baseline that interpret's EBM is compared against; LightGBM is fast but black-box, while EBM trades some speed for interpretability using similar boosting mechanics.
- interpretml/ebm-core — Core C++ EBM algorithm library extracted from interpret; if it exists as a separate repo, this is the low-level kernel that interpret wraps.
- slundberg/shap — Canonical SHAP implementation (the earlier home of the same project as shap/shap) referenced in InterpretML; represents the post-hoc explanation paradigm that interpret's glassbox models try to replace.
- christophM/interpretable-ml-book — Educational resource on interpretability techniques (PDP, LIME, SHAP); conceptual foundation for understanding why interpret's approach differs from post-hoc methods.
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add comprehensive unit tests for R bindings in R/src/interpret_R.cpp
The R package (R/src/interpret_R.cpp) lacks visible test coverage. Given that this repo supports both Python and R interfaces, adding R-specific unit tests would ensure the C++ bindings work correctly across platforms. This is critical for R users and reduces bugs in the native code layer.
- [ ] Create R/tests directory structure mirroring standard R testing conventions (testthat or similar)
- [ ] Write tests for ebm_classify, ebm_predict_proba, and ebm_show functions documented in R/man/
- [ ] Test Makevars compilation on both Windows (R/src/Makevars.interpret and interpret-win.def) and Unix systems
- [ ] Add R test execution to .github/workflows/ci.yml to ensure tests run on PR submissions
- [ ] Document how to run R tests in CONTRIBUTING.md with specific commands
Add missing CI workflow for R package validation and CRAN compliance checks
The repo has ci.yml and release workflows for Python, but no dedicated R validation pipeline. The R package has DESCRIPTION, NAMESPACE, and CRAN formatting concerns (R/cran_formatted_licence.txt exists), suggesting CRAN submission is a goal. A workflow would validate R CMD CHECK, test across R versions, and catch CRAN compliance issues early.
- [ ] Create .github/workflows/r-check.yml with r-lib/actions/setup-r, r-lib/actions/setup-r-dependencies, and r-lib/actions/check-r-package
- [ ] Configure matrix testing for multiple R versions (4.0+) and platforms (ubuntu, windows, macos)
- [ ] Add CRAN checks using rcmdcheck with allow-warnings: false for stricter validation
- [ ] Integrate with existing ci.yml or keep separate but ensure both trigger on R/* file changes
- [ ] Document R build prerequisites in CONTRIBUTING.md (e.g., R-devel, Rtools for Windows)
Create missing documentation for Python/R API parity in docs/interpret/
The docs folder has extensive Python notebooks (ebm.ipynb, ebm-internals*.ipynb, etc.) but no equivalent R documentation. New contributors and R users need parity docs showing how to call EBM, interpreters, and blackbox explainers in R with side-by-side Python examples. This reduces support burden and increases R adoption.
- [ ] Create docs/interpret/r-quickstart.md mirroring Python getting-started content with R code examples
- [ ] Create docs/interpret/r-ebm.Rmd or .md with R-specific EBM usage (referencing R/man/ebm_classify.Rd)
- [ ] Add a comparison table in docs/interpret/framework.md showing feature availability across Python and R
- [ ] Update docs/interpret/_toc.yml to include new R documentation sections in navigation
- [ ] Reference R examples in CONTRIBUTING.md under 'Examples' section with link to R docs
🌿Good first issues
- Add comprehensive docstring examples to python/interpret/blackbox/ explainer classes (LIME, SHAP, etc.) showing real dataset usage—currently missing runnable code snippets that would help new users understand alternative explanation techniques beyond EBM
- Expand test coverage for R bindings (R/R/ functions like ebm_classify, ebm_predict_proba)—create test_*.R files in R/tests/ directory validating edge cases (missing values, factor ordering, probability thresholds) that Python tests already cover
- Build automated benchmark comparison notebook: create docs/benchmarks/ebm-runtime-scaling.ipynb comparing EBM inference speed vs XGBoost/LightGBM across dataset sizes (10K→10M rows) to validate performance claims in README—currently only classification accuracy is benchmarked
⭐Top contributors
- @paulbkoch — 93 commits
- @ugbotueferhire — 3 commits
- @ecederstrand — 1 commit
- @Erik-BM — 1 commit
- @mathias-von-ottenbreit — 1 commit
📝Recent commits
- 6c79a67 — add interaction detection callback (paulbkoch)
- 3495683 — move the measure_interactions function and it's dependencies into the _ebm_core module (paulbkoch)
- 2c68999 — change callback functions to allow stopping processing of the single current outer bag instead of terminating all boosti… (paulbkoch)
- 9ab5089 — improve consistency of reporting of gain and boosting metric and allow additional parameters with defaults in the callba… (paulbkoch)
- 59f7388 — feat: add tuple support and exam_cb for callbacks (#665) (ugbotueferhire)
- e6d20f9 — resolve mypy warnings in _ebm.py (paulbkoch)
- 7ad1c39 — clear some mypy warnings (paulbkoch)
- d6f7260 — add mypy (paulbkoch)
- 67563da — update type hints to python 3.10+ (paulbkoch)
- af6bce0 — update to python 3.10+ consistently throughout project (paulbkoch)
🔒Security observations
The InterpretML repository demonstrates a reasonable security posture for an open-source ML library. No critical vulnerabilities were identified in the visible configuration and structure. Key concerns include: (1) Several outdated dependencies (IPython, debugpy) requiring updates, (2) absence of a security policy documentation, (3) potential information disclosure risk in Jupyter notebooks containing execution outputs. The codebase structure itself (R bindings, Python ML package) doesn't reveal obvious injection vectors, hardcoded secrets, or infrastructure misconfigurations. Regular dependency updates and the addition of security documentation would improve the overall posture.
- Medium · Outdated IPython Dependency — requirements.txt pins ipython==8.38.0. A fixed pin falls behind as patches ship, and older IPython releases have had security issues in Jupyter kernel execution and code introspection. Fix: confirm the pin matches the latest patched IPython release and bump it if any advisories apply.
- Medium · Outdated Debugpy Dependency — requirements.txt pins debugpy==1.8.20. The debugger may have unpatched vulnerabilities in remote debugging functionality that could be exploited if debug ports are exposed. Fix: update debugpy to the latest patched release (1.8.21 or later) so debug-protocol security fixes are applied.
- Low · Missing Security Policy — repository root. No SECURITY.md file was found, which makes it difficult for security researchers to report vulnerabilities responsibly. Fix: create a SECURITY.md documenting the vulnerability disclosure process and contact information for security reports.
- Low · Potential Information Disclosure in Documentation — docs/interpret/ contains multiple .ipynb files. Notebooks can inadvertently retain sensitive information such as API keys, credentials, or internal paths in their outputs if not sanitized. Fix: add a pre-commit hook (nbstripout or similar) to strip execution results and metadata from notebooks.
- Low · Missing Dependency Pinning in Primary Package — requirements.txt and setup.py (not shown). The provided requirements.txt appears to cover documentation/development builds; primary package dependencies should be explicitly pinned to known-safe versions. Fix: maintain a separate requirements-lock.txt with exact versions for production deployments, and audit and update dependencies regularly.
LLM-derived; treat as a starting point, not a security audit.
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.