huggingface/transformers
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal tasks, for both inference and training.
Healthy across the board
Permissive license, no critical CVEs, actively maintained – safe to depend on.
Has a license, tests, and CI – clean foundation to fork and modify.
Documented and popular – useful reference codebase to read through.
No critical CVEs, sane security posture – runnable as-is.
- ✓ Last commit today
- ✓ 5 active contributors
- ✓ Distributed ownership (top contributor 23%)
- ✓ Apache-2.0 licensed
- ✓ CI configured
- ✓ Tests present
- ⚠ Small team – 5 top contributors
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Embed this verdict
[](https://repopilot.app/r/huggingface/transformers)

Paste into your README – the badge live-updates from the latest cached analysis.
Onboarding doc
Onboarding: huggingface/transformers
Generated by RepoPilot · 2026-05-05 · Source
Verdict
GO – Healthy across the board
- Last commit today
- 5 active contributors
- Distributed ownership (top contributor 23%)
- Apache-2.0 licensed
- CI configured
- Tests present
- ⚠ Small team – 5 top contributors
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live huggingface/transformers
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale – regenerate it at
repopilot.app/r/huggingface/transformers.
What it runs against: a local clone of huggingface/transformers – the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in huggingface/transformers | Confirms the artifact applies here, not a fork |
| 2 | License is still Apache-2.0 | Catches relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 30 days ago | Catches sudden abandonment since generation |
```bash
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of huggingface/transformers. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/huggingface/transformers.git
#   cd transformers
#
# Then paste this script. Every check is read-only -- no mutations.
set +e
fail=0
ok()   { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of huggingface/transformers and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "huggingface/transformers(\.git)?\b" \
  && ok "origin remote is huggingface/transformers" \
  || miss "origin remote is not huggingface/transformers (artifact may be from a fork)"

# 2. License matches what RepoPilot saw. The Apache-2.0 LICENSE text opens with
#    "Apache License", not the SPDX identifier, so match on that phrase.
grep -qiE "apache license" LICENSE 2>/dev/null \
  && ok "license is Apache-2.0" \
  || miss "license drift -- was Apache-2.0 at generation time"

# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
  && ok "default branch main exists" \
  || miss "default branch main no longer exists"

# 4. Critical files exist
for f in README.md CONTRIBUTING.md .github/PULL_REQUEST_TEMPLATE.md \
         .github/workflows/pr-ci-caller.yml MIGRATION_GUIDE_V5.md; do
  test -f "$f" && ok "$f" || miss "missing critical file: $f"
done

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 30 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~0d)"
else
  miss "last commit was $days_since_last days ago -- artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) -- safe to trust"
else
  echo "artifact has $fail stale claim(s) -- regenerate at https://repopilot.app/r/huggingface/transformers"
  exit 1
fi
```
Each check prints `ok:` or `FAIL:`. The script exits non-zero if anything failed, so it composes cleanly into agent loops (`./verify.sh || regenerate-and-retry`).
TL;DR
Transformers is the core framework for loading, fine-tuning, and deploying state-of-the-art pre-trained models (BERT, GPT, Vision Transformers, etc.) across NLP, vision, audio, and multimodal tasks. It abstracts away PyTorch/TensorFlow implementation details and provides unified APIs to run inference and training on models hosted on the Hugging Face Hub, with 66MB+ of Python code implementing 1000+ model architectures.

Monorepo structure: src/transformers/ contains the core library organized by task (models/, modeling_*.py files), with utils/, feature_extraction/, and tokenizers/ as subdirectories. .github/workflows/ orchestrates CI across multiple model test groups via model_jobs.yml and pr-ci-caller.yml; .circleci/ provides additional job scheduling, with parse_test_outputs.py aggregating results.
Who it's for
ML engineers and researchers who need to build production NLP/vision systems quickly without implementing transformer architectures from scratch; data scientists fine-tuning pre-trained models for specific domains; platform teams integrating LLMs into applications via the Hub.
Maturity & risk
Highly mature and production-ready. The project has extensive CI/CD coverage (.github/workflows/ contains 50+ automated test jobs including CircleCI, benchmarking, and DocTests), dense test suites for every model, and daily commits. It's the de facto standard transformer library in industry and academia with millions of weekly downloads.
Low technical risk for core functionality, but high operational burden: the monorepo supports 1000+ model variants across PyTorch/TensorFlow/JAX, making dependency management complex. The massive codebase (66MB Python) and model diversity mean regressions in one model can silently break another. Community-driven model contributions (.github/ISSUE_TEMPLATE/new-model-addition.yml) introduce variable code quality.
Active areas of work
Active development is evident from dense workflow files (build-ci-docker-images.yml and benchmark_v2*.yml suggest recent infra upgrades), new model addition templates, and integration testing infrastructure (extras-smoke-test.yml, doctest_job.yml). The .ai/skills/ directory hints at ongoing development of automated model addition tools.
Get running
```bash
git clone https://github.com/huggingface/transformers.git
cd transformers
pip install -e .
pip install torch  # or tensorflow, jax
python -c "from transformers import AutoTokenizer, AutoModel; print('Ready')"
```
Daily commands:
There is no single "server" to run – the library is imported. For documentation building: `make build-doc` (a Makefile exists). For testing: `pytest tests/` or specific test files. For benchmarking: the workflows in .github/workflows/benchmark_v2*.yml define the process.
Map of the codebase
- README.md – Entry point documentation describing the 🤗 Transformers framework, its purpose, and how to use it for state-of-the-art ML models.
- CONTRIBUTING.md – Defines contribution guidelines, code standards, and the workflow that all contributors must follow when submitting PRs to this project.
- .github/PULL_REQUEST_TEMPLATE.md – Standardizes PR submissions with required checks, testing, and documentation expectations for every merged change.
- .github/workflows/pr-ci-caller.yml – Orchestrates the primary CI/CD pipeline that validates every PR against model tests, type checking, and benchmark regressions.
- MIGRATION_GUIDE_V5.md – Documents breaking changes and migration paths for version 5, essential context for understanding recent architectural shifts.
- .ai/skills/add-or-fix-type-checking/SKILL.md – Codifies the type-checking conventions and tools used across the codebase, critical for maintaining code quality standards.
- Makefile – Centralizes build, test, and development tasks; documents common dev workflows and command patterns.
Components & responsibilities
- PR CI Pipeline (GitHub Actions, pytest, Mypy, CodeQL) – Validates code quality, runs model tests across the hardware/framework matrix, detects regressions, gates merge
  - Failure mode: flaky benchmark results on shared runners; timeouts on slow hardware tests; false-positive type errors requiring skill overrides
- Benchmarking Suite (benchmark/benchmark.py, psutil, gpustat, pandas, optimum-benchmark wrapper) – Measures model inference speed, memory usage, and training throughput; identifies performance regressions before deployment
How to make changes
Add a New Model Implementation
- Check the new-model-addition template to understand requirements (.github/ISSUE_TEMPLATE/new-model-addition.yml)
- Follow the coding standards and type-checking guidelines (.ai/skills/add-or-fix-type-checking/SKILL.md)
- Create model files in the appropriate architecture folder and register them in the modeling module (CONTRIBUTING.md)
- Add benchmarks for your model to the regression detection suite (benchmark/config/generation.yaml)
- Submit a PR using the standard template with test coverage (.github/PULL_REQUEST_TEMPLATE.md)
Run Local Tests & Validation
- Review available Makefile targets for testing and formatting (Makefile)
- Run type checking to ensure compliance with project standards (.ai/skills/add-or-fix-type-checking/SKILL.md)
- Execute the benchmark suite to detect regressions (benchmark/benchmark.py)
- Validate against the CI/CD pipeline expectations defined in the PR workflow (.github/workflows/pr-ci-caller.yml)
Understand Migration & Breaking Changes
- Review the v5 migration guide for context on recent architectural shifts (MIGRATION_GUIDE_V5.md)
- Check whether your changes align with the current API design in the README examples (README.md)
- Document any breaking changes in your PR following the template (.github/PULL_REQUEST_TEMPLATE.md)
Set Up Development Environment
- Install dependencies and configure the environment using Makefile targets (Makefile)
- Review the conda build configuration if packaging for conda distribution (.github/conda/meta.yaml)
- Consult CONTRIBUTING.md for development setup and workflow (CONTRIBUTING.md)
Why these technologies
- GitHub Actions Workflows – Integrated CI/CD native to GitHub; enables matrix testing across GPUs, frameworks (PyTorch/TensorFlow/JAX), and Python versions without external infrastructure.
- CircleCI (supplementary) – Legacy CI system; maintained for backward compatibility and specialized workloads (self-hosted GPU runners) not fully migrated to Actions.
- Makefile – Standardizes dev commands (test, lint, format, install) across platforms; reduces onboarding friction for contributors.
- Benchmarking Framework (v1 & v2) – Detects performance regressions early; v2 (continuous batching) targets inference optimization for production workloads.
Trade-offs already made
- Large monorepo (~600 files) housing multiple model architectures
  - Why: Single source of truth for model implementations; unified testing and documentation; easier adoption for users.
  - Consequence: PR review complexity; CI runtime can exceed 30m for the full matrix; requires disciplined code organization to prevent circular dependencies.
- Type checking and linting as hard requirements in CI
  - Why: Catch bugs early; improve the IDE experience; maintain consistent code style across 100+ contributors.
  - Consequence: Slower development iteration; occasional false positives requiring skill-based exceptions (defined in .ai/skills).
- Benchmarking integrated into CI/CD rather than post-deployment
  - Why: Prevent performance regressions from merging; reduce customer impact in production.
  - Consequence: Extended PR cycles; benchmark variability on shared runners can cause flaky results.
Non-goals (don't propose these)
- Real-time inference serving (library is for model definitions and training; deployment to production is out-of-scope)
- Distributed training orchestration (relies on PyTorch/TensorFlow ecosystems; does not manage cluster management)
- Proprietary/closed-source model support (community-driven; prioritizes open models)
Traps & gotchas
- Dependency fragmentation: code must support both PyTorch and TensorFlow; TF compatibility breaks silently if not tested.
- Hub integration required: many examples assume huggingface_hub is configured with valid credentials for private model downloads.
- Version pinning: transformers depends on specific tokenizers library versions; pip install without pinning can break tokenization.
- Model download caching: downloads default to ~/.cache/huggingface/hub/; disk space can fill unexpectedly with large models (175B+ parameters).
- Distributed training complexity: Trainer abstracts away distributed details, but device_map / torch_distributed_launch setup is error-prone.
- No model validation on commit: community-contributed models (.github/ISSUE_TEMPLATE/new-model-addition.yml) don't block on correctness verification, so some models may have numerical accuracy issues.
Architecture
Concepts to learn
- Attention Mechanism (Scaled Dot-Product) – Core computation in all transformer models implemented in this library; understanding attention is necessary to interpret model behavior and optimize forward-pass performance
- Tokenization (BPE, WordPiece, SentencePiece) – transformers abstracts tokenizer selection (via AutoTokenizer), but different models require different tokenization schemes; a mismatched tokenizer/model causes silent accuracy drops
- Config-as-Code Pattern (PretrainedConfig) – Models are decoupled into Config (hyperparameters) and Model (weights) classes, enabling reproducible model loading and hyperparameter sweeps without code changes
- Mixed Precision Training (FP16 / BFloat16) – The Trainer class implements automatic mixed precision (Torch AMP, TF mixed_float16 policy) for memory efficiency; understanding numeric stability is critical for fine-tuning large models
- Distributed Data Parallelism (DistributedDataParallel, DeepSpeed) – Trainer orchestrates multi-GPU/multi-node training via torch.nn.parallel.DistributedDataParallel and DeepSpeed integration; understanding device placement avoids silent OOM errors
- Model Quantization (Dynamic/Static, QAT) – Post-training quantization reduces model size for inference; transformers supports int8 quantization via bitsandbytes, critical for deploying billion-parameter models
- Gradient Checkpointing (Activation Recomputation) – Trainer can enable gradient_checkpointing=True to trade compute for memory during fine-tuning; essential for fitting large models on limited VRAM
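As a refresher on the first concept: scaled dot-product attention is softmax(QKᵀ/√d)·V. A dependency-free sketch over plain lists (the library's real implementations are batched, masked tensor ops, so this is illustrative only):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]  # subtract max for numerical stability
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention over lists of vectors (one row per token)."""
    d = len(K[0])
    out = []
    for q in Q:
        # similarity of this query against every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        # each output row is a weight-averaged blend of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))])
    return out
```

Each output row is a convex combination of the value vectors, which is why attention weights are directly interpretable as "how much each position contributes".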
Related repos
- huggingface/datasets – Companion library for loading and preprocessing datasets (ImageNet, GLUE, COCO) used with transformers models for training/evaluation
- huggingface/huggingface_hub – Client library handling authentication, model/dataset download, and Hub API integration that transformers depends on for model discovery
- pytorch/pytorch – Primary backend framework; transformers abstracts PyTorch's nn.Module API, but users must understand torch.cuda and autograd for debugging
- tensorflow/tensorflow – Alternative backend for transformers models; the code maintains TensorFlow parity, but not all features are available in both backends
- openai/gpt-2 – Inspiration/predecessor: an early transformer implementation that influenced transformers' design for simplifying model loading and fine-tuning
PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add comprehensive type checking validation workflow for model implementations
The repo has a .ai/skills/add-or-fix-type-checking/SKILL.md file indicating type checking is a priority, but there's no dedicated GitHub Action workflow to enforce type hints across model files. With 100+ model implementations in src/transformers/models/, adding a workflow that validates type annotations on PRs would catch issues early and maintain code quality standards.
- [ ] Review .ai/skills/add-or-fix-type-checking/SKILL.md to understand type-checking standards
- [ ] Create .github/workflows/type-check-models.yml that runs mypy or pyright on the src/transformers/models/ directory
- [ ] Configure the workflow to fail on missing type hints in model forward methods and key public APIs
- [ ] Add documentation in .github/workflows/TROUBLESHOOT.md explaining how to fix type-check failures locally
Implement integration tests for cross-framework model loading (PyTorch ↔ TensorFlow ↔ JAX)
The transformers library supports multiple frameworks but there's no visible comprehensive integration test suite validating that a model trained/saved in one framework loads and runs correctly in another. This is critical for users migrating between frameworks. Current structure shows model_jobs.yml and model_jobs_intel_gaudi.yml but no framework-interop tests.
- [ ] Create a .github/workflows/cross-framework-integration.yml workflow
- [ ] Add a test suite in tests/cross_framework_tests/ with tests for loading PyTorch models in TensorFlow and JAX, and vice versa
- [ ] Include tests for weights conversion, output shape validation, and numerical stability across frameworks
- [ ] Document the framework compatibility matrix in README.md based on test results
Add automated documentation generation and validation for new model additions
With .github/workflows/add-model-like.yml existing and .github/ISSUE_TEMPLATE/new-model-addition.yml in place, there's infrastructure for model PRs but no automated validation that new models include required documentation. This causes incomplete model cards and missing examples in the Hub.
- [ ] Create .github/workflows/validate-new-model-docs.yml that triggers on new-model-addition PRs
- [ ] Add checks for: model card completeness, example usage in docstrings, paper link in config, and task compatibility tags
- [ ] Implement a script in .github/scripts/validate_model_docs.py to parse model files and verify documentation requirements
- [ ] Add failure feedback that links to the documentation template in .ai/, or create one if missing
Good first issues
- Add type hints to src/transformers/pipelines/ – currently missing type annotations for the pipeline factory functions and return types, matching the .ai/skills/add-or-fix-type-checking effort
- Implement doctest coverage for src/transformers/configuration_utils.py – .github/workflows/doctests.yml exists, but many Config classes lack runnable examples showing how to instantiate and modify hyperparameters
- Extend tests/models/tiny_model_test.py to cover BFloat16 precision – .github/workflows/check_tiny_models.yml validates model loading but doesn't test the mixed-precision variants used in production
Top contributors
- @Cyrilvallez – 7 commits
- @stevhliu – 7 commits
- @ydshieh – 6 commits
- @vasqu – 6 commits
- @tarekziade – 5 commits
Recent commits
- `a6ccf93` – Fix CI: Allow more artifacts to be download in CI (#45785) (ydshieh)
- `2c432d7` – Add `concurrency` to PR CI workflow file (pr-ci-caller.yml) (#45786) (ydshieh)
- `3db570f` – Reorder decorators for autodoc and dataclass (#45702) (zucchini-nlp)
- `136befe` – Unwrap `text_config` in `AutoModelFor*.from_config` (#45770) (jamesbraza)
- `ffd36ed` – deepseek r1 distilled tokenizer fix for qwen2 mapping (#45741) (itazap)
- `d379ac1` – fix: Added Mps support in float fallback backends list (#45687) (rigen1048)
- `d63bb4a` – Github Actions PR CI (caller) (#45476) (ydshieh)
- `a5b83a7` – Add EXAONE 4.5 implementations (#45471) (nuxlear)
- `8c004ec` – make sure we call check_auto in CI (#45775) (tarekziade)
- `6f90cbb` – Better Grouped GEMM + EP (#45621) (IlyasMoutawwakil)
Security observations
The transformers codebase has several security concerns primarily around dependency management and outdated packages. The most critical issues are outdated versions of psutil and psycopg2 with known vulnerabilities, and inconsistent version pinning strategies. Additionally, the framework's support for loading arbitrary model formats (particularly pickle) from remote sources presents a significant remote code execution risk, though the documentation acknowledges this and recommends safetensors. The loose pandas constraint could introduce unexpected behavior. Overall security posture requires immediate attention to dependency updates and stricter default security boundaries for model loading.
- High · Outdated psutil dependency with known vulnerabilities – Dependencies/Package file, psutil==6.0.0. This is an older version with multiple known CVEs, including CVE-2021-41056 and others related to process handling and privilege escalation; current releases carry patches beyond 6.0.0. Fix: update to psutil>=6.1.0 or the latest stable version, and run `pip audit` to identify specific CVEs.
- High · Outdated psycopg2 dependency – Dependencies/Package file, psycopg2==2.9.9. This PostgreSQL adapter has had security updates since this version; outdated database drivers expose the application to SQL injection and connection security risks. Fix: update to psycopg2>=2.9.10 or the latest 3.x version (psycopg3), and review the changelog for security patches.
- Medium · Loose pandas version constraint – Dependencies/Package file, pandas>=1.5.0. A loose lower bound with no upper bound could allow installation of future versions with breaking changes or security issues, leading to unexpected behavior in production. Fix: define a tighter constraint such as pandas>=1.5.0,<3.0 or pandas>=1.5.0,<2.1.0, depending on compatibility requirements.
- Medium · Remote code execution risk via model loading – SECURITY.md, Remote artefacts section. SECURITY.md notes the framework's tight coupling with the Hugging Face Hub and its ability to download remote artifacts. While the safetensors format is recommended, the framework still supports loading from pickle and other unsafe formats that can execute arbitrary code. Fix: implement strict validation of model sources, enable a safetensors-only mode by default, require explicit opt-in for unsafe formats, and warn on pickle and other dangerous formats.
- Medium · gpustat pinned to older version – Dependencies/Package file, gpustat==1.1.1. The exact pin leaves no flexibility for security patches: if vulnerabilities are found in gpustat's dependencies, they cannot be easily picked up. Fix: relax to gpustat>=1.1.1 with a reasonable upper bound, or investigate whether a newer major version is available and compatible.
- Low · Missing dependency pinning strategy – Dependencies/Package file. The dependency file mixes pinning strategies: some packages are pinned exactly (==) while others use loose constraints (>=). This inconsistency could lead to reproducibility issues and unexpected updates in production. Fix: adopt a consistent strategy, either exact pinning with a separate constraints file, or lock files (requirements.lock, poetry.lock) for reproducible installs.
- Low · Broad GitHub Actions workflow permissions not explicitly scoped – .github/workflows/. Multiple workflows are present, and without reviewing their content there's potential for overly permissive GITHUB_TOKEN grants allowing unintended access. Fix: review all workflow files and add explicit `permissions` sections with minimal required scopes; use `contents: read` by default and escalate only when necessary.
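The pinning-consistency observation is mechanical to check. A stdlib sketch that classifies requirement lines by specifier style (a deliberately simplified parse, not a full PEP 508 grammar; the function name is mine):

```python
import re

def classify_pins(requirement_lines):
    """Split requirement specifiers into exact pins (==), loose ranges
    (>=, <, ~=, !=), and fully unversioned entries."""
    exact, loose, unversioned = [], [], []
    for line in requirement_lines:
        line = line.split("#")[0].strip()  # drop comments and blank lines
        if not line:
            continue
        # package name is everything before the first specifier/extras character
        name = re.split(r"[<>=!~\[;]", line, maxsplit=1)[0].strip()
        if "==" in line:
            exact.append(name)
        elif re.search(r"[<>~]|!=", line):
            loose.append(name)
        else:
            unversioned.append(name)
    return exact, loose, unversioned
```

Feeding a requirements file through this and flagging any mix of non-empty `exact` and `loose` buckets would turn the "inconsistent pinning" finding into a repeatable CI check.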
LLM-derived; treat as a starting point, not a security audit.
Where to read next
- Open issues – current backlog
- Recent PRs – what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals – see the live page for receipts. Re-run on a new commit to refresh.