interpretml/interpret
Fit interpretable models. Explain blackbox machine learning.
Healthy across all four use cases
Permissive license, no critical CVEs, actively maintained — safe to depend on.
Has a license, tests, and CI — clean foundation to fork and modify.
Documented and popular — useful reference codebase to read through.
No critical CVEs, sane security posture — runnable as-is.
- ✓Last commit 2d ago
- ✓6 active contributors
- ✓MIT licensed
- ✓CI configured
- ✓Tests present
- ⚠Single-maintainer risk — top contributor 93% of recent commits
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Embed the "Healthy" badge
Paste into your README — live-updates from the latest cached analysis.
Paste at the top of your README.md — the badge renders inline like a shields.io badge and links to https://repopilot.app/r/interpretml/interpret.
Preview social card (1200×630)
This card auto-renders when someone shares https://repopilot.app/r/interpretml/interpret on X, Slack, or LinkedIn.
Onboarding doc
Onboarding: interpretml/interpret
Generated by RepoPilot · 2026-05-09 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in "Verify before trusting" below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
- Treat the "AI · unverified" sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/interpretml/interpret shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across all four use cases
- Last commit 2d ago
- 6 active contributors
- MIT licensed
- CI configured
- Tests present
- ⚠ Single-maintainer risk — top contributor 93% of recent commits
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live interpretml/interpret
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/interpretml/interpret.
What it runs against: a local clone of interpretml/interpret — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in interpretml/interpret | Confirms the artifact applies here, not a fork |
| 2 | License is still MIT | Catches relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 32 days ago | Catches sudden abandonment since generation |
```bash
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of interpretml/interpret. If you don't
# have one yet, run these first:
#
# git clone https://github.com/interpretml/interpret.git
# cd interpret
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }
# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
echo "FAIL: not inside a git repository. cd into your clone of interpretml/interpret and re-run."
exit 2
fi
# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "interpretml/interpret(\.git)?\b" \
&& ok "origin remote is interpretml/interpret" \
|| miss "origin remote is not interpretml/interpret (artifact may be from a fork)"
# 2. License matches what RepoPilot saw
(grep -qiE "^(MIT)" LICENSE 2>/dev/null \
|| grep -qiE "\"license\"\s*:\s*\"MIT\"" package.json 2>/dev/null) \
&& ok "license is MIT" \
|| miss "license drift — was MIT at generation time"
# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
&& ok "default branch main exists" \
|| miss "default branch main no longer exists"
# 4. Critical files exist
test -f "python/interpret/glassbox/ebm/ebm.py" \
&& ok "python/interpret/glassbox/ebm/ebm.py" \
|| miss "missing critical file: python/interpret/glassbox/ebm/ebm.py"
test -f "python/interpret/api/_api.py" \
&& ok "python/interpret/api/_api.py" \
|| miss "missing critical file: python/interpret/api/_api.py"
test -f "R/src/interpret_R.cpp" \
&& ok "R/src/interpret_R.cpp" \
|| miss "missing critical file: R/src/interpret_R.cpp"
test -f "docs/interpret/interpret.md" \
&& ok "docs/interpret/interpret.md" \
|| miss "missing critical file: docs/interpret/interpret.md"
test -f "CONTRIBUTING.md" \
&& ok "CONTRIBUTING.md" \
|| miss "missing critical file: CONTRIBUTING.md"
# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 32 ]; then
ok "last commit was $days_since_last days ago (artifact saw ~2d)"
else
miss "last commit was $days_since_last days ago — artifact may be stale"
fi
echo
if [ "$fail" -eq 0 ]; then
echo "artifact verified (0 failures) — safe to trust"
else
echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/interpretml/interpret"
exit 1
fi
```
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
⚡TL;DR
InterpretML is an open-source package that provides interpretable machine learning models (particularly Explainable Boosting Machines, or EBMs) and black-box model explanation techniques. It bridges the gap between model accuracy and interpretability by implementing GAMs enhanced with gradient boosting, automatic interaction detection, and feature importance attribution—enabling data scientists to understand both global model behavior and individual prediction reasoning.
Monorepo structure: the Python package lives under python/ with the core model implementations; R bindings under R/ wrap C++ native code (R/src/interpret_R.cpp); the C++ engine is libebm. Documentation lives in docs/ with Jupyter notebooks (docs/benchmarks/, docs/interpret/) and Sphinx config. Native compilation bridges (R/src/Makevars.interpret and platform-specific build files: build.sh, build.bat, R/src/interpret-win.def) handle cross-platform compilation. CI is orchestrated via GitHub Actions workflows.
👥Who it's for
Data scientists and ML engineers who need to explain model decisions for regulatory compliance (healthcare, finance, judicial systems), debug model failures, detect fairness issues, and build trustworthy AI systems in high-risk applications. Also useful for researchers studying interpretability techniques and practitioners building feature engineering pipelines.
🌱Maturity & risk
Production-ready and actively maintained. The project shows strong maturity: it originated at Microsoft Research, has comprehensive CI/CD pipelines (.github/workflows/ci.yml, release workflows), extensive documentation (docs/ directory with Jupyter notebooks and benchmarks), multi-language bindings (Python, R, C++), and maintains backward compatibility (CHANGELOG.md). Last activity is recent: the latest commit landed roughly two days before this analysis.
Low-to-medium risk for production use. The C++/Python hybrid architecture (the code is split roughly 60/40 between C++ and Python by volume) creates complexity in build/deployment, though build.sh and build.bat scripts exist. The codebase is substantial and relies on compiling the custom libebm native library (R/src/libebm/), which can cause platform-specific issues. Risk is mitigated by active stewardship, comprehensive CI, and established governance (GOVERNANCE.md, MAINTAINERS.md).
Active areas of work
Active development with separate release pipelines for interpret, powerlift (a dataset/experiment benchmarking harness that ships in the repo), and language bindings. The repo maintains pre-commit hooks (.pre-commit-config.yaml) for code quality, supports Python 3.10+, and appears to be expanding platform coverage (Bicep templates point to Azure integration). Benchmarking work visible in docs/benchmarks/ suggests ongoing performance optimization.
🚀Get running
Clone and install for Python development: git clone https://github.com/interpretml/interpret.git && cd interpret && pip install -e . (or conda install -c conda-forge interpret). For R: cd R && Rscript build.R. For C++ development: ./build.sh (Linux/Mac) or build.bat (Windows). Verify with: python -c 'from interpret.glassbox import ExplainableBoostingClassifier; print("Ready")' (the class lives in interpret.glassbox, not the package root).
Daily commands:
Python: pip install interpret then use directly in Python: from interpret.glassbox import ExplainableBoostingClassifier; ebm = ExplainableBoostingClassifier(); ebm.fit(X, y); ebm.explain_global(). For development: clone, pip install -e ., run tests via CI (see .github/workflows/ci.yml). R: install.packages('interpret') or build from R/ directory. Jupyter notebooks in docs/interpret/python/examples/ are runnable.
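For orientation, here is a compact end-to-end version of that Python workflow. It sticks to the public API (ExplainableBoostingClassifier, show, explain_global, explain_local) and uses a scikit-learn toy dataset purely as a stand-in for your own data.

```python
# Minimal EBM workflow sketch (assumes `pip install interpret scikit-learn`).
from interpret import show
from interpret.glassbox import ExplainableBoostingClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

ebm = ExplainableBoostingClassifier()      # interpretable GAM + boosting
ebm.fit(X_train, y_train)                  # learns per-feature shape functions
print("holdout accuracy:", ebm.score(X_test, y_test))

show(ebm.explain_global())                             # per-feature importance / shapes
show(ebm.explain_local(X_test.iloc[:5], y_test.iloc[:5]))  # per-prediction breakdowns
```

show() renders the interactive dashboard in a notebook or browser; in headless environments, inspect the explanation objects it is given directly instead.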
🗺️Map of the codebase
- python/interpret/glassbox/ebm/ebm.py — Core implementation of Explainable Boosting Machines (EBM), the flagship interpretable model; essential for understanding the library's primary offering.
- python/interpret/api/_api.py — Central API registry and initialization that all modules depend on for exposing models, explainers, and visualizations.
- R/src/interpret_R.cpp — R bindings to the native C++ libebm engine; required reading for R users and maintainers managing cross-language compatibility.
- docs/interpret/interpret.md — High-level framework documentation defining the library's architecture, layers, and philosophy; foundational reference for all contributors.
- CONTRIBUTING.md — Project governance and contribution workflow; every contributor must follow these guidelines for builds, testing, and pull requests.
- .github/workflows/ci.yml — CI/CD pipeline defining the test matrix, build steps, and release gates; critical for understanding how code is validated before merge.
- README.md — Entry point summarizing the library's purpose, scope, and quick-start examples; establishes the mental model for all subsequent reading.
🛠️How to make changes
Add a new interpretable model (Glassbox)
- Create a new file in python/interpret/glassbox/<model_name>/<model_name>.py implementing the class with fit() and predict() methods (a code sketch follows this list). (python/interpret/glassbox/<model_name>/<model_name>.py)
- Inherit from BaseExplainer or ModelExplainer in python/interpret/api/_base.py to adopt the standard interface. (python/interpret/api/_base.py)
- Register the model in python/interpret/api/_api.py under the appropriate section (e.g., the glassbox dict) so it is exported in the public API. (python/interpret/api/_api.py)
- Add unit tests in python/interpret/test/glassbox/test_<model_name>.py covering fit, predict, and explain flows. (python/interpret/test/glassbox/test_<model_name>.py)
- Create a Jupyter notebook tutorial in docs/interpret/<model_name>.ipynb demonstrating usage and interpretability. (docs/interpret/<model_name>.ipynb)
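As a shape reference for the first two steps, here is a hypothetical, dependency-light skeleton. The interpret-specific base classes and registration hook named above are RepoPilot's suggestions, so this sketch uses plain scikit-learn mixins and only comments where the interpret pieces would plug in; confirm the real interfaces in python/interpret/api/_base.py before copying it.

```python
# Hypothetical glassbox-model skeleton (illustrative only; the real base
# classes live in python/interpret/api/_base.py per the steps above).
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin


class ThresholdRuleClassifier(BaseEstimator, ClassifierMixin):
    """Toy interpretable model: one threshold vote per feature (binary target)."""

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # Stand-in "learning": per-feature thresholds at the feature means.
        self.thresholds_ = X.mean(axis=0)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        votes = (X > self.thresholds_).mean(axis=1)
        return self.classes_[(votes >= 0.5).astype(int)]

    def explain_global(self, name=None):
        # A real interpret model would return an Explanation object here
        # (and the class would be registered in python/interpret/api/_api.py).
        return {"thresholds": self.thresholds_.tolist()}
```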
Add a new blackbox explainer
- Create a new file in python/interpret/blackbox/<explainer_name>.py implementing the explanation algorithm (a shape sketch follows this list). (python/interpret/blackbox/<explainer_name>.py)
- Inherit from the ModelExplainer base class and implement explain_global() and/or explain_local() methods. (python/interpret/api/_base.py)
- Register in python/interpret/api/_api.py under the blackbox dict. (python/interpret/api/_api.py)
- Implement Plotly visualization logic in the model's _explain_*() method or a separate visual/<explainer_name>.py module. (python/interpret/visual/<explainer_name>.py)
- Add tests in python/interpret/test/blackbox/test_<explainer_name>.py and a notebook in docs/interpret/<explainer_name>.ipynb. (python/interpret/test/blackbox/test_<explainer_name>.py)
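To illustrate the explainer contract without depending on interpret's internal base classes, here is a standalone permutation-importance sketch. The class name, constructor arguments, and return format are all hypothetical; a real contribution would subclass the base class referenced above and return interpret explanation objects.

```python
# Hypothetical model-agnostic explainer sketch (permutation importance).
# interpret's real explainers subclass the base classes in
# python/interpret/api/_base.py; this standalone version just shows the shape.
import numpy as np


class PermutationImportanceExplainer:
    def __init__(self, predict_fn, metric):
        self.predict_fn = predict_fn   # any callable: X -> predictions
        self.metric = metric           # callable: (y_true, y_pred) -> score

    def explain_global(self, X, y, n_repeats=5, seed=0):
        rng = np.random.default_rng(seed)
        X = np.asarray(X, dtype=float)
        baseline = self.metric(y, self.predict_fn(X))
        drops = []
        for j in range(X.shape[1]):
            scores = []
            for _ in range(n_repeats):
                Xp = X.copy()
                Xp[:, j] = rng.permutation(Xp[:, j])   # break feature j's signal
                scores.append(self.metric(y, self.predict_fn(Xp)))
            drops.append(baseline - float(np.mean(scores)))
        return {"baseline_score": baseline, "importances": drops}


# Usage sketch (names are placeholders):
#   explainer = PermutationImportanceExplainer(model.predict, accuracy_score)
#   report = explainer.explain_global(X_valid, y_valid)
```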
Add support for a new language binding (e.g., Julia, .NET)
- Create a new directory <lang>/ at the repo root (e.g., julia/, dotnet/) mirroring the R structure. (<lang>/)
- Write language-specific bindings to libebm in <lang>/src/interpret_<lang>.cpp (or native code for that language). (<lang>/src/interpret_<lang>.cpp)
- Create build configuration (Makevars, CMakeLists.txt, or language-native build files) in <lang>/src/. (<lang>/src/Makevars)
- Add a GitHub Actions job in .github/workflows/ci.yml to test and release the new binding. (.github/workflows/ci.yml)
- Document setup and API in docs/interpret/<lang>/installation-guide.ipynb and API reference notebooks. (docs/interpret/<lang>/installation-guide.ipynb)
Improve EBM performance or add a new EBM feature
- Modify the core algorithm in python/interpret/glassbox/ebm/ebm.py or its C++ equivalent in R/src/libebm/. (python/interpret/glassbox/ebm/ebm.py)
- Update preprocessing or binning logic in python/interpret/utils/binning.py if needed. (python/interpret/utils/binning.py)
- Add regression tests in python/interpret/test/glassbox/test_ebm.py to verify correctness and prevent regressions (see the test sketch after this list). (python/interpret/test/glassbox/test_ebm.py)
- Update documentation in docs/interpret/ebm.ipynb and the docs/interpret/ebm-internals*.ipynb notebooks. (docs/interpret/ebm.ipynb)
- Update CHANGELOG.md and check that CI in .github/workflows/ci.yml passes all tests and benchmarks. (CHANGELOG.md)
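A minimal regression test for step 3 could look like the sketch below. It assumes interpret and scikit-learn are installed and uses only public EBM calls; the file location and accuracy threshold are suggestions to adapt to the existing test suite.

```python
# Sketch of an EBM regression test (adapt the path and threshold to the
# repo's real test layout before adding).
import numpy as np
from interpret.glassbox import ExplainableBoostingClassifier
from sklearn.datasets import make_classification


def test_ebm_fit_predict_and_explain():
    X, y = make_classification(n_samples=400, n_features=8, random_state=0)
    ebm = ExplainableBoostingClassifier()   # default settings suffice for a smoke test
    ebm.fit(X, y)

    # Correctness floor: easily separable synthetic data should score well.
    assert ebm.score(X, y) > 0.85

    # Explanations should exist for both global and local views.
    assert ebm.explain_global() is not None
    assert ebm.explain_local(X[:3], y[:3]) is not None
```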
🪤Traps & gotchas
- Native compilation required: the C++ libebm must build successfully, which needs a suitable gcc/clang toolchain. The R package build needs the Makevars.interpret setup; Windows uses interpret-win.def.
- Python wheels are pre-compiled on CI, but source installs need local build tools.
- CI uses GitHub Actions with a specific matrix (OS/Python/R versions); not all combinations are testable locally.
- Documentation builds require jupyter-book and sphinx (see the requirements files in docs/).
- Pre-commit hooks enforce formatting; check .pre-commit-config.yaml before committing.
- The package has interdependencies between the Python/R/C++ layers; changes to the libebm API require coordination across all three.
🏗️Architecture
💡Concepts to learn
- Generalized Additive Models (GAMs) — EBM is a modernized GAM that uses gradient boosting to learn per-feature functions while maintaining interpretability; understanding the GAM structure (per-feature scoring plus summation) is essential to grasping EBM's design (see the additive-score sketch after this list)
- Automatic Interaction Detection — EBM automatically discovers feature interactions (two-way or higher) and encodes them as additional terms—this is interpret's innovation over classical GAMs and requires understanding pair-feature learning in the boosting loop
- Gradient Boosting Decision Trees (GBDT) — EBM uses bagging + gradient boosting to fit shallow trees per feature/interaction—familiarity with boosting (residuals, learning rates, early stopping) explains how EBM achieves accuracy parity with XGBoost while staying interpretable
- SHAP (SHapley Additive exPlanations) — Interpret includes SHAP explainers for black-box models in python/interpret/blackbox/; SHAP values provide unified framework for feature attribution that EBM natively computes via its additive structure
- Feature Binning / Discretization — EBM bins continuous features (see R/R/binning.R) to enable interpretable scoring rules and visualization—binning strategy affects both accuracy and explainability, making it a core algorithmic choice
- Native C++ Extensions / Ctypes Bridge — Interpret wraps C++ libebm via ctypes (Python) and Rcpp (R); understanding this FFI layer is crucial for debugging performance issues, porting to new platforms, or modifying the boosting algorithm
- Monotonic Constraints in Boosting — EBM supports monotonicity constraints (e.g., 'higher income = higher loan approval probability') for domain-expert editability—implementing and validating these constraints during boosting is a key interpretability feature
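To make the additive structure concrete, the toy sketch below shows how a GAM-style score decomposes into an intercept plus one contribution per feature. The shape functions are made up for illustration; EBM learns its equivalents via boosting, but the inspect-each-term property is the same.

```python
# Illustration of the additive (GAM) structure behind EBM; not library code.
import numpy as np

# Pretend these per-feature shape functions were learned by boosting.
shape_functions = [
    lambda x: 0.8 * np.tanh(x),    # f_1: smooth saturating term
    lambda x: -0.5 * (x > 1.0),    # f_2: step function
    lambda x: 0.1 * x**2,          # f_3: smooth nonlinear term
]
intercept = -0.2


def additive_score(x):
    """score(x) = intercept + sum_i f_i(x_i); every term is inspectable."""
    contributions = [float(f(xi)) for f, xi in zip(shape_functions, x)]
    return intercept + sum(contributions), contributions


score, parts = additive_score(np.array([0.5, 2.0, -1.0]))
print("score:", round(score, 3))
print("per-feature contributions:", [round(p, 3) for p in parts])
```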
🔗Related repos
- shap/shap — Post-hoc model explanation using SHAP values; interpret includes SHAP explainers in python/interpret/blackbox/. Both repos address black-box interpretation, but EBM is intrinsically interpretable.
- microsoft/LightGBM — Gradient boosting baseline that interpret's EBM is compared against; LightGBM is fast but black-box, while EBM trades some speed for interpretability using similar boosting mechanics.
- interpretml/ebm-core — Core C++ EBM algorithm library extracted from interpret; if it exists as a separate repo, this is the low-level kernel that interpret wraps.
- slundberg/shap — Canonical SHAP implementation (the earlier home of the same project as shap/shap) referenced in InterpretML; represents the post-hoc explanation paradigm that interpret's glassbox models try to replace.
- christophM/interpretable-ml-book — Educational resource on interpretability techniques (PDP, LIME, SHAP); conceptual foundation for understanding why interpret's approach differs from post-hoc methods.
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add comprehensive unit tests for R bindings in R/src/interpret_R.cpp
The R package (R/src/interpret_R.cpp) lacks visible test coverage. Given that this repo supports both Python and R interfaces, adding R-specific unit tests would ensure the C++ bindings work correctly across platforms. This is critical for R users and reduces bugs in the native code layer.
- [ ] Create R/tests directory structure mirroring standard R testing conventions (testthat or similar)
- [ ] Write tests for ebm_classify, ebm_predict_proba, and ebm_show functions documented in R/man/
- [ ] Test Makevars compilation on both Windows (R/src/Makevars.interpret and interpret-win.def) and Unix systems
- [ ] Add R test execution to .github/workflows/ci.yml to ensure tests run on PR submissions
- [ ] Document how to run R tests in CONTRIBUTING.md with specific commands
Add missing CI workflow for R package validation and CRAN compliance checks
The repo has ci.yml and release workflows for Python, but no dedicated R validation pipeline. The R package has DESCRIPTION, NAMESPACE, and CRAN formatting concerns (R/cran_formatted_licence.txt exists), suggesting CRAN submission is a goal. A workflow would validate R CMD CHECK, test across R versions, and catch CRAN compliance issues early.
- [ ] Create .github/workflows/r-check.yml with r-lib/actions/setup-r, r-lib/actions/setup-r-dependencies, and r-lib/actions/check-r-package
- [ ] Configure matrix testing for multiple R versions (4.0+) and platforms (ubuntu, windows, macos)
- [ ] Add CRAN checks using rcmdcheck with allow-warnings: false for stricter validation
- [ ] Integrate with existing ci.yml or keep separate but ensure both trigger on R/* file changes
- [ ] Document R build prerequisites in CONTRIBUTING.md (e.g., R-devel, Rtools for Windows)
Create missing documentation for Python/R API parity in docs/interpret/
The docs folder has extensive Python notebooks (ebm.ipynb, ebm-internals*.ipynb, etc.) but no equivalent R documentation. New contributors and R users need parity docs showing how to call EBM, interpreters, and blackbox explainers in R with side-by-side Python examples. This reduces support burden and increases R adoption.
- [ ] Create docs/interpret/r-quickstart.md mirroring Python getting-started content with R code examples
- [ ] Create docs/interpret/r-ebm.Rmd or .md with R-specific EBM usage (referencing R/man/ebm_classify.Rd)
- [ ] Add a comparison table in docs/interpret/framework.md showing feature availability across Python and R
- [ ] Update docs/interpret/_toc.yml to include new R documentation sections in navigation
- [ ] Reference R examples in CONTRIBUTING.md under 'Examples' section with link to R docs
🌿Good first issues
- Add comprehensive docstring examples to python/interpret/blackbox/ explainer classes (LIME, SHAP, etc.) showing real dataset usage—currently missing runnable code snippets that would help new users understand alternative explanation techniques beyond EBM
- Expand test coverage for R bindings (R/R/ functions like ebm_classify, ebm_predict_proba)—create test_*.R files in R/tests/ directory validating edge cases (missing values, factor ordering, probability thresholds) that Python tests already cover
- Build automated benchmark comparison notebook: create docs/benchmarks/ebm-runtime-scaling.ipynb comparing EBM inference speed vs XGBoost/LightGBM across dataset sizes (10K→10M rows) to validate performance claims in README—currently only classification accuracy is benchmarked
⭐Top contributors
- @paulbkoch — 93 commits
- @ugbotueferhire — 3 commits
- @ecederstrand — 1 commit
- @Erik-BM — 1 commit
- @mathias-von-ottenbreit — 1 commit
📝Recent commits
- 6c79a67 — add interaction detection callback (paulbkoch)
- 3495683 — move the measure_interactions function and it's dependencies into the _ebm_core module (paulbkoch)
- 2c68999 — change callback functions to allow stopping processing of the single current outer bag instead of terminating all boosti… (paulbkoch)
- 9ab5089 — improve consistency of reporting of gain and boosting metric and allow additional parameters with defaults in the callba… (paulbkoch)
- 59f7388 — feat: add tuple support and exam_cb for callbacks (#665) (ugbotueferhire)
- e6d20f9 — resolve mypy warnings in _ebm.py (paulbkoch)
- 7ad1c39 — clear some mypy warnings (paulbkoch)
- d6f7260 — add mypy (paulbkoch)
- 67563da — update type hints to python 3.10+ (paulbkoch)
- af6bce0 — update to python 3.10+ consistently throughout project (paulbkoch)
🔒Security observations
The InterpretML repository demonstrates a reasonable security posture for an open-source ML library. No critical vulnerabilities were identified in the visible configuration and structure. Key concerns include: (1) Several outdated dependencies (IPython, debugpy) requiring updates, (2) absence of a security policy documentation, (3) potential information disclosure risk in Jupyter notebooks containing execution outputs. The codebase structure itself (R bindings, Python ML package) doesn't reveal obvious injection vectors, hardcoded secrets, or infrastructure misconfigurations. Regular dependency updates and the addition of security documentation would improve the overall posture.
- Medium · Outdated IPython Dependency — requirements.txt pins ipython==8.38.0. A fixed pin falls behind as patches ship, and older IPython releases have had security issues in Jupyter kernel execution and code introspection. Fix: confirm the pin matches the latest patched IPython release and bump it if any advisories apply.
- Medium · Outdated Debugpy Dependency — requirements.txt pins debugpy==1.8.20. The debugger may have unpatched vulnerabilities in remote debugging functionality that could be exploited if debug ports are exposed. Fix: update debugpy to the latest patched release (1.8.21 or later) so debug-protocol security fixes are applied.
- Low · Missing Security Policy — repository root. No SECURITY.md file was found, which makes it difficult for security researchers to report vulnerabilities responsibly. Fix: create a SECURITY.md documenting the vulnerability disclosure process and contact information for security reports.
- Low · Potential Information Disclosure in Documentation — docs/interpret/ contains multiple .ipynb files. Notebooks can inadvertently retain sensitive information such as API keys, credentials, or internal paths in their outputs if not sanitized. Fix: add a pre-commit hook (nbstripout or similar) to strip execution results and metadata from notebooks.
- Low · Missing Dependency Pinning in Primary Package — requirements.txt and setup.py (not shown). The provided requirements.txt appears to cover documentation/development builds; primary package dependencies should be explicitly pinned to known-safe versions. Fix: maintain a separate requirements-lock.txt with exact versions for production deployments, and audit and update dependencies regularly.
LLM-derived; treat as a starting point, not a security audit.
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.