HumanSignal/labelImg
LabelImg is now part of the Label Studio community. The popular image annotation tool created by Tzutalin is no longer actively being developed, but you can check out Label Studio, the open source data labeling tool for images, text, hypertext, audio, video and time-series data.
Healthy across all four use cases
Permissive license, no critical CVEs, actively maintained — safe to depend on.
Has a license, tests, and CI — clean foundation to fork and modify.
Documented and popular — useful reference codebase to read through.
No critical CVEs, sane security posture — runnable as-is.
- ⚠Stale — last commit 2y ago
- ✓44+ active contributors
- ✓Distributed ownership (top contributor 37% of recent commits)
- ✓MIT licensed
- ✓CI configured
- ✓Tests present
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Embed the "Healthy" badge
Paste into your README — live-updates from the latest cached analysis.
[](https://repopilot.app/r/humansignal/labelimg)Paste at the top of your README.md — renders inline like a shields.io badge.
▸Preview social card
This card auto-renders when someone shares https://repopilot.app/r/humansignal/labelimg on X, Slack, or LinkedIn.
Ask AI about humansignal/labelimg
Grounded in the actual source code. Pick a starter question or write your own.
Onboarding doc
Onboarding: HumanSignal/labelImg
Generated by RepoPilot · 2026-06-21 · Source
🎯Verdict
GO — Healthy across all four use cases
- 44+ active contributors
- Distributed ownership (top contributor 37% of recent commits)
- MIT licensed
- CI configured
- Tests present
- ⚠ Stale — last commit 2y ago
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
⚡TL;DR
LabelImg is a PyQt5-based graphical image annotation tool that lets users draw bounding boxes, polygons, and other shapes on images and export annotations in PASCAL VOC, YOLO, or CreateML XML formats. It's now archived and redirects users to Label Studio, but remains a lightweight desktop application for single-image labeling workflows without requiring a backend server. Single-module PyQt5 application: labelImg.py is the entry point, libs/ contains reusable modules (canvas.py for drawing, labelFile.py for I/O, shape.py for geometry, pascal_voc_io.py / yolo_io.py for format exporters). Build scripts in build-tools/ handle platform-specific packaging. No test directory visible in the file structure.
👥Who it's for
Computer vision engineers and dataset annotators who need to manually label object detection or image classification datasets. Users who prefer a standalone desktop tool over web-based annotation platforms, or who work with legacy PASCAL VOC / YOLO format outputs.
🌱Maturity & risk
This project is in maintenance mode / archived. The README explicitly states it is 'no longer actively being developed' and directs users to Label Studio instead. GitHub Actions CI is present (package.yml workflow), but the codebase appears dormant with no recent active commits visible. It is stable and production-grade for its original use case, but new features are not planned.
High risk due to project abandonment: no active maintainer, no planned updates, and dependencies (PyQt5, Python 2/3 compatibility code) may become stale. The Makefile-based build system and shell scripts (build-tools/) suggest tight OS-level dependencies that may break on new OS versions. Python 2 support code is still present despite Python 3 being required, indicating incomplete cleanup.
Active areas of work
Project is in maintenance mode only. The workflow package.yml handles CI/packaging, but active development has stopped. README directs all users to Label Studio. No ongoing feature work or major refactoring is visible.
🚀Get running
git clone https://github.com/HumanSignal/labelImg.git
cd labelImg
sudo apt-get install pyqt5-dev-tools # Ubuntu/Linux
pip3 install -r requirements/requirements-linux-python3.txt
make qt5py3
python3 labelImg.py [IMAGE_PATH] [PREDEFINED_CLASS_FILE]
Daily commands:
make qt5py3 # Compiles Qt resource files
python3 labelImg.py # Launch GUI
# Or with image and class file:
python3 labelImg.py path/to/image.jpg data/predefined_classes.txt
🗺️Map of the codebase
labelImg.py— Main entry point and application controller for the image annotation UI; orchestrates all canvas, file I/O, and toolbar interactions.libs/canvas.py— Core drawing and shape rendering engine; handles all mouse events, shape creation/editing, and visual display of annotations.libs/shape.py— Shape data model abstraction for bounding boxes and polygons; serialized to and from file formats.libs/labelFile.py— Unified file I/O abstraction layer that delegates to format-specific readers (pascal_voc_io, yolo_io, create_ml_io).libs/pascal_voc_io.py— Pascal VOC XML format serialization; primary output format for bounding box annotations.libs/settings.py— Persistent application state and configuration management; stores user preferences and geometry.
🛠️How to make changes
Add a new annotation format (file I/O)
- Create a new format module in libs/ (e.g., libs/my_format_io.py) with load() and save() functions that accept a filename and return/write Shape objects (
libs/my_format_io.py) - Register the format in libs/labelFile.py by adding an import and a condition in the load/save dispatcher methods to detect file extensions (
libs/labelFile.py) - Add corresponding toolbar icon in resources/icons/ and menu action in labelImg.py to expose the export option (
labelImg.py)
Add a new shape type (e.g., point, circle)
- Extend the Shape class in libs/shape.py with new geometry properties and serialization logic for the new shape type (
libs/shape.py) - Add a new shape mode constant in libs/constants.py and update the canvas.py mouse event handlers to draw and manipulate the new shape (
libs/canvas.py) - Update the format I/O modules (pascal_voc_io.py, yolo_io.py, etc.) to read/write the new shape type (
libs/pascal_voc_io.py) - Add a toolbar button in libs/toolBar.py or labelImg.py to select the new shape mode (
libs/toolBar.py)
Add a new UI language/locale
- Create a new strings file resources/strings/strings-<locale>.properties with translated keys from the template (e.g., resources/strings/strings.properties) (
resources/strings/strings-de-DE.properties) - Update libs/stringBundle.py to add the new locale to its language detection logic and load the new properties file on application startup (
libs/stringBundle.py) - Add a menu item in labelImg.py to allow users to switch to the new language at runtime (
labelImg.py)
Add a new canvas editing feature (e.g., shape rotation, resizing constraint)
- Add state tracking in libs/shape.py for the new property (e.g., rotation angle, aspect ratio lock) (
libs/shape.py) - Implement mouse event handlers and drawing logic in libs/canvas.py to compute and apply the transformation during interactive editing (
libs/canvas.py) - Update all format I/O modules to persist and load the new property from their respective output files (
libs/pascal_voc_io.py) - Add a toolbar button, keyboard shortcut, or dialog option in labelImg.py to toggle or configure the new feature (
labelImg.py)
🔧Why these technologies
- PyQt5 — Cross-platform desktop UI framework with strong graphics rendering (QGraphicsView/Scene) and native file dialogs; enables single codebase for Windows, macOS, and Linux.
- XML (Pascal VOC format) — Industry-standard annotation format for object detection benchmarks (PASCAL VOC, COCO); widely supported by ML frameworks and ensures interoperability.
- Python 3 — Rapid development, large ecosystem of ML libraries, easy deployment via PyPI; maintains backward compatibility layer for Python 2 code.
⚖️Trade-offs already made
-
Single monolithic canvas.py for drawing instead of modular shape renderers
- Why: Simplifies event handling and immediate visual feedback; reduces state synchronization complexity.
- Consequence: High cognitive load when adding new shape types; coupling between rendering and event logic makes testing harder.
-
Settings stored in Qt QSettings registry instead of JSON/YAML config files
- Why: Native integration with platform-specific settings storage (Windows registry, macOS plist); no config file parsing needed.
- Consequence: Settings are opaque to users; harder to version control or share across machines; requires platform-specific setup on CI.
-
Format detection by file extension rather than magic bytes or format sniffing
- Why: Fast, deterministic, and explicit user intent via filename.
- Consequence: Brittle if file is misnamed; does not handle mixed or ambiguous formats; requires exact extension match.
-
No built-in project/dataset abstraction; processes one image at a time
- Why: Keeps UI simple and startup time fast; users can batch-process via external scripts.
- Consequence: No cross-image consistency checking; tedious for large datasets; users must manage image batches externally.
🚫Non-goals (don't propose these)
- Multi-user collaborative annotation (no server backend, no real-time sync)
- Automatic annotation or AI-assisted labeling (manual only)
- Video frame annotation (images only)
- 3D object annotation (2D bounding boxes and polygons only)
- User authentication or role-based access control
- Real-time annotation validation or constraint enforcement
🪤Traps & gotchas
- Qt resource compilation required: Must run
make qt5py3before launching labelImg.py, or GUI assets (icons, resource strings) will be missing. 2. OS-specific build scripts: The build-tools/ scripts are tightly coupled to specific platforms (Windows batch, macOS shell, Linux distro package managers); rebuilding from source on an unsupported OS may fail silently. 3. Python 2 legacy code: Code still contains Python 2 compatibility shims (ustr.py, six-like patterns) even though only Python 3 is officially supported; may confuse readers. 4. No automated tests: No tests/ directory present; changes risk breaking existing annotation workflows without CI validation. 5. hardcoded data/predefined_classes.txt path: The app may expect a classes file in a specific location; if working with custom datasets, users must provide the full path explicitly.
🏗️Architecture
💡Concepts to learn
- PASCAL VOC XML format — The primary annotation export format in this repo; understanding the XML schema (object → bndbox → xmin/ymin/xmax/ymax) is essential to reading pascal_voc_io.py and debugging annotation correctness
- Qt resource system (QRC files) — resources.qrc and the Makefile's qt5py3 target compile UI assets into binary; understanding this is required to add icons, strings, or UI elements without breaking the build
- Bounding box representation (min/max corners vs. center+size) — LabelImg uses (xmin, ymin, xmax, ymax) for boxes, while YOLO uses (center_x, center_y, width, height) normalized to [0,1]; yolo_io.py demonstrates the conversion logic critical for format interoperability
- Qt signals and slots (event-driven architecture) — PyQt5 uses signals/slots for UI reactivity; canvas.py and labelImg.py extensively use this pattern (e.g., mouse clicks → drawing) to decouple UI from business logic
- Shape state machine (annotation lifecycle: creating → editing → saved) — shape.py models shape state (selectable, selected, completed) to handle user interaction correctly (e.g., preventing edits to finalized annotations); understanding this is key to modifying drawing behavior
- CreateML format (Apple's annotation schema) — create_ml_io.py exports to Apple CreateML format; shows how LabelImg supports heterogeneous output formats and teaches the pattern for adding new exporters
- Coordinate transformation (canvas zoom and pan) — zoomWidget.py and canvas.py handle viewport-to-image coordinate mapping; critical for pixel-accurate annotation when users zoom in, and for understanding how mouse clicks map to image coordinates
🔗Related repos
heartexlabs/label-studio— The official successor and recommended replacement; web-based multi-modal annotation platform that subsumes LabelImg's functionalityopencv/opencv— Complementary computer vision library; many LabelImg users feed annotations into OpenCV-based training pipelinesroboflow/supervision— Modern Python package for handling bounding box annotations (conversion, filtering, visualization) across YOLO/COCO/VOC formatstzutalin/labelImg— Original upstream repository by Tzutalin before archival; useful for historical reference and understanding design intentultralytics/yolov5— Popular YOLO training framework that natively consumes YOLO-format annotations produced by LabelImg
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add unit tests for labelFile.py and pascal_voc_io.py XML parsing
The repo lacks test coverage for core annotation file I/O operations. labelFile.py and pascal_voc_io.py handle XML parsing and serialization for PASCAL VOC format annotations - critical paths that could have bugs. Adding tests would prevent regressions when these parsers are modified and help validate edge cases like malformed XML, special characters in labels, and boundary conditions for bounding box coordinates.
- [ ] Create tests/test_labelFile.py with test cases for XML loading/saving operations
- [ ] Create tests/test_pascal_voc_io.py with test cases for VOC format parsing and generation
- [ ] Add test fixtures (sample XML files) in tests/fixtures/
- [ ] Test edge cases: empty annotations, special characters in class names, invalid bbox coordinates
- [ ] Add pytest to requirements/requirements-linux-python3.txt
- [ ] Update Makefile with test target (e.g.,
make test)
Create comprehensive GitHub Actions workflow for multi-platform binary builds with artifact retention
The repo has build scripts in build-tools/ (for macOS, Ubuntu, Windows) but .github/workflows/package.yml appears minimal. A complete CI/CD workflow should automatically build and test on Linux, macOS, and Windows on each release, test the PyPI package, and retain artifacts. This reduces manual build overhead and ensures contributors can easily verify builds work across platforms.
- [ ] Expand .github/workflows/package.yml to include matrix builds for ubuntu-latest, macos-latest, windows-latest
- [ ] Add steps to run build-tools/build-ubuntu-binary.sh, build-tools/build-for-macos.sh, build-tools/build-windows-binary.sh
- [ ] Add Python 3.7+ compatibility testing via pytest (once tests are added)
- [ ] Upload built artifacts (.exe, .dmg, .AppImage) as GitHub release assets
- [ ] Add workflow trigger on git tags matching v*.. pattern for releases
- [ ] Document the workflow in CONTRIBUTING.rst with instructions for maintainers
Add integration tests for yolo_io.py and create_ml_io.py export/import formats
The repo supports multiple annotation formats (YOLO, Create ML) via yolo_io.py and create_ml_io.py, but there are no visible tests validating round-trip conversions. Without tests, format changes could silently break exports. Adding integration tests ensures annotations can be loaded, modified, and re-exported correctly across supported formats.
- [ ] Create tests/test_yolo_io.py with round-trip tests (create annotation → export YOLO → import YOLO → verify matches original)
- [ ] Create tests/test_create_ml_io.py with similar round-trip validation
- [ ] Create tests/fixtures/ with sample annotations in each format
- [ ] Test edge cases: multiple objects, class name encoding, coordinate precision
- [ ] Verify exported files match expected YOLO txt format and Create ML XML structure
- [ ] Document expected format specifications in CONTRIBUTING.rst if not already present
🌿Good first issues
- Add unit tests for pascal_voc_io.py: Create tests/test_pascal_voc_io.py with fixtures (sample annotations, edge cases like special characters in labels, malformed XML) to ensure export/import round-trips work. This catches regressions without touching the GUI. 2. Add a Dockerfile and docker-compose.yml: Package the app for headless environments (CI, cloud) to enable screenshot-based testing and distribution without build-tools/ complexity. 3. Document keyboard shortcuts and UI workflows: Create a docs/KEYBOARD_SHORTCUTS.md file and a docs/TUTORIAL.md with screenshots showing annotation workflow (open image → draw box → export → verify XML). The project has no user documentation currently.
⭐Top contributors
Click to expand
Top contributors
- @tzutalin — 37 commits
- @breunigs — 7 commits
- @dependabot[bot] — 4 commits
- @Ledorub — 4 commits
- @chrisrapson — 3 commits
📝Recent commits
Click to expand
Recent commits
b33f965— Readme updates (#950) (lsell)2d5537b— docs: update virtualenv section of readme (#911) (yunatseng)dcc5e23— Bump lxml from 4.6.5 to 4.9.1 in /requirements (#909) (dependabot[bot])5bc7fb9— CreateML fixes (#906) (saitejamalyala)eb603c2— Fix unexpected type 'float' labelImg in labelDialog (tzutalin)1c94399— Fix wrong link for different readme in diffeerent locales (tzutalin)dd6e530— Update build badge (tzutalin)5a3899c— Update README (tzutalin)3a360ad— Revert "Create pylint.yml" (tzutalin)62e0da6— Replace Travis CI with GitHub Actions, and make Windows/Linux builds. (#896) (RyanHir)
🔒Security observations
LabelImg presents moderate security concerns primarily due to its unmaintained status. The most critical issue is that the project is no longer actively developed, meaning security vulnerabilities in dependencies will not be addressed. Additional risks include missing dependency version locking, unreviewed build scripts, and potential input validation issues in file I/O operations. Users should strongly consider migrating to the actively maintained Label Studio fork. If continued use is necessary, a comprehensive security audit and strict dependency management are essential.
- Medium · Archived Project with Unmaintained Dependencies —
Repository root, README.rst, project status. LabelImg is no longer actively maintained and has been superseded by Label Studio. This means security vulnerabilities in dependencies will not be patched. The project readme explicitly states it is 'no longer actively being developed,' creating a significant long-term security risk. Fix: Migrate to Label Studio (https://github.com/heartexlabs/label-studio) which is actively maintained and receives security updates. If continued use of LabelImg is necessary, conduct a thorough security audit of all dependencies and consider forking with security maintenance. - Medium · Missing Dependency Version Lock File —
requirements/ directory, package management. No package lock file (requirements.lock, Pipfile.lock, or similar) is visible in the repository. Only loose requirements files are present in requirements/ directory. This allows for unpredictable dependency versions and potential supply chain vulnerabilities where newer versions of dependencies may contain security issues. Fix: Implement strict version pinning using requirements.lock or Pipfile.lock. Pin all direct and transitive dependencies to specific versions. Use tools like pip-tools or Poetry for reproducible builds. - Medium · Build Scripts Without Security Validation —
build-tools/ directory scripts. Build scripts in build-tools/ directory (build-for-macos.sh, build-windows-binary.sh, etc.) are visible but cannot be analyzed for content. These scripts often execute arbitrary commands and could pose security risks if they contain unvalidated inputs or unsafe operations. Fix: Review all build scripts for: secure shell practices, input validation, use of shellcheck validation, code signing verification, and absence of hardcoded credentials. Implement script signing and integrity verification. - Low · No Security Headers or Configuration Visible —
Repository root. No evidence of security configuration files (.env.example, security.txt, SECURITY.md) or documented security practices in the repository. This makes it difficult for users to understand security considerations when using the tool. Fix: Create SECURITY.md documenting security practices, known vulnerabilities, and responsible disclosure process. Add .env.example for configuration with security best practices. - Low · Potential Path Traversal in File Operations —
libs/pascal_voc_io.py, libs/yolo_io.py, libs/create_ml_io.py, libs/labelFile.py. File I/O operations are performed by several modules (pascal_voc_io.py, yolo_io.py, create_ml_io.py, labelFile.py) but content cannot be analyzed. These modules handle user-provided file paths and could be vulnerable to path traversal attacks if input validation is insufficient. Fix: Implement strict input validation for all file paths. Use os.path.abspath() and verify the canonical path is within expected directories. Reject paths containing '..' or absolute paths outside permitted directories. - Low · XML Processing Risk in Label File Handling —
libs/labelFile.py. The labelFile.py module likely processes XML files (common format for annotation storage). XML parsing without proper configuration could be vulnerable to XXE (XML External Entity) attacks. Fix: Disable XML external entity processing in all XML parsers. Use defusedxml library instead of standard xml library. Configure parsers with: XMLParser(resolve_entities=False, remove_blank_text=True)
LLM-derived; treat as a starting point, not a security audit.
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in Verify before trusting
below. If any check returns
FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding. - Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/HumanSignal/labelImg shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live HumanSignal/labelImg
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/HumanSignal/labelImg.
What it runs against: a local clone of HumanSignal/labelImg — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in HumanSignal/labelImg | Confirms the artifact applies here, not a fork |
| 2 | License is still MIT | Catches relicense before you depend on it |
| 3 | Default branch master exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 730 days ago | Catches sudden abandonment since generation |
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of HumanSignal/labelImg. If you don't
# have one yet, run these first:
#
# git clone https://github.com/HumanSignal/labelImg.git
# cd labelImg
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }
# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
echo "FAIL: not inside a git repository. cd into your clone of HumanSignal/labelImg and re-run."
exit 2
fi
# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "HumanSignal/labelImg(\\.git)?\\b" \\
&& ok "origin remote is HumanSignal/labelImg" \\
|| miss "origin remote is not HumanSignal/labelImg (artifact may be from a fork)"
# 2. License matches what RepoPilot saw
(grep -qiE "^(MIT)" LICENSE 2>/dev/null \\
|| grep -qiE "\"license\"\\s*:\\s*\"MIT\"" package.json 2>/dev/null) \\
&& ok "license is MIT" \\
|| miss "license drift — was MIT at generation time"
# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \\
&& ok "default branch master exists" \\
|| miss "default branch master no longer exists"
# 4. Critical files exist
test -f "labelImg.py" \\
&& ok "labelImg.py" \\
|| miss "missing critical file: labelImg.py"
test -f "libs/canvas.py" \\
&& ok "libs/canvas.py" \\
|| miss "missing critical file: libs/canvas.py"
test -f "libs/shape.py" \\
&& ok "libs/shape.py" \\
|| miss "missing critical file: libs/shape.py"
test -f "libs/labelFile.py" \\
&& ok "libs/labelFile.py" \\
|| miss "missing critical file: libs/labelFile.py"
test -f "libs/pascal_voc_io.py" \\
&& ok "libs/pascal_voc_io.py" \\
|| miss "missing critical file: libs/pascal_voc_io.py"
# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 730 ]; then
ok "last commit was $days_since_last days ago (artifact saw ~700d)"
else
miss "last commit was $days_since_last days ago — artifact may be stale"
fi
echo
if [ "$fail" -eq 0 ]; then
echo "artifact verified (0 failures) — safe to trust"
else
echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/HumanSignal/labelImg"
exit 1
fi
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.
Embed this chat in your README →
Drop this iframe anywhere — the widget runs against the same live analysis cache as the main app.
<iframe src="https://repopilot.app/embed/humansignal/labelimg" width="100%" height="500" style="border:1px solid #d0d7de; border-radius:8px;" allow="microphone" loading="lazy" ></iframe>