RepoPilot

wang-xinyu/tensorrtx

Implementation of popular deep learning networks with TensorRT network definition API

Healthy

Healthy across the board

Use as dependency: Healthy

Permissive license, no critical CVEs, actively maintained — safe to depend on.

Fork & modify: Healthy

Has a license, tests, and CI — clean foundation to fork and modify.

Learn from: Healthy

Documented and popular — useful reference codebase to read through.

Deploy as-is: Healthy

No critical CVEs, sane security posture — runnable as-is.

  • Last commit 2mo ago
  • 19 active contributors
  • Distributed ownership (top contributor 37% of recent commits)
  • MIT licensed
  • CI configured
  • Tests present

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Healthy" badge

Paste into your README — live-updates from the latest cached analysis.

[![RepoPilot: Healthy](https://repopilot.app/api/badge/wang-xinyu/tensorrtx)](https://repopilot.app/r/wang-xinyu/tensorrtx)

Paste at the top of your README.md — renders inline like a shields.io badge.

Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/wang-xinyu/tensorrtx on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: wang-xinyu/tensorrtx

Generated by RepoPilot · 2026-05-09 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

  1. Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
  2. Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
  3. Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/wang-xinyu/tensorrtx shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

🎯Verdict

GO — Healthy across the board

  • Last commit 2mo ago
  • 19 active contributors
  • Distributed ownership (top contributor 37% of recent commits)
  • MIT licensed
  • CI configured
  • Tests present

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live wang-xinyu/tensorrtx repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/wang-xinyu/tensorrtx.

What it runs against: a local clone of wang-xinyu/tensorrtx — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in wang-xinyu/tensorrtx | Confirms the artifact applies here, not a fork |
| 2 | License is still MIT | Catches relicense before you depend on it |
| 3 | Default branch master exists | Catches branch renames |
| 4 | Last commit ≤ 93 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>wang-xinyu/tensorrtx</code></summary>
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of wang-xinyu/tensorrtx. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/wang-xinyu/tensorrtx.git
#   cd tensorrtx
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of wang-xinyu/tensorrtx and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "wang-xinyu/tensorrtx(\.git)?\b" \
  && ok "origin remote is wang-xinyu/tensorrtx" \
  || miss "origin remote is not wang-xinyu/tensorrtx (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "^(MIT)" LICENSE 2>/dev/null \
   || grep -qiE "\"license\"\s*:\s*\"MIT\"" package.json 2>/dev/null) \
  && ok "license is MIT" \
  || miss "license drift — was MIT at generation time"

# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \
  && ok "default branch master exists" \
  || miss "default branch master no longer exists"

# 4. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 93 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~63d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/wang-xinyu/tensorrtx"
  exit 1
fi

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>

TL;DR

TensorRTx implements popular deep learning network architectures (ResNet, YOLO, EfficientNet, Vision Transformer, etc.) using NVIDIA TensorRT's native C++ network definition API instead of ONNX/UFF parsers. It exports PyTorch/TensorFlow weights to .wts plaintext files, then builds optimized TensorRT inference engines for production deployment with full network introspection and custom layer control. Flat monorepo: each deep learning model gets its own top-level folder (alexnet/, yolo5/, centernet/, convnextv2/, etc.), each containing {CMakeLists.txt, gen_wts.py, inference .cpp/.py, logging.h, utils.h}. Shared patterns: Python script exports weights → .wts file; C++ code loads .wts, builds TensorRT INetworkDefinition, serializes engine. Custom CUDA kernels live in dedicated plugin subdirectories (dcnv2Plugin/*, convnextv2/src/).
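
The export half of that pattern can be sketched in a few lines. This is an illustrative reconstruction of the gen_wts.py convention described above (a tensor-count line, then one `name length hex…` line per flattened tensor, with big-endian float32 hex words), using plain Python lists in place of torch tensors; exact details vary per model folder.

```python
import struct

def export_wts(tensors, path):
    # Illustrative sketch of the .wts text layout used by the repo's
    # gen_wts.py scripts. The real scripts iterate model.state_dict()
    # and flatten each tensor with v.reshape(-1).cpu().numpy().
    with open(path, "w") as f:
        f.write(f"{len(tensors)}\n")
        for name, values in tensors.items():
            words = " ".join(struct.pack(">f", float(v)).hex() for v in values)
            f.write(f"{name} {len(values)} {words}\n")

# Hypothetical two-tensor "model":
export_wts({"conv1.weight": [0.5, -1.0], "conv1.bias": [0.0]}, "demo.wts")
```

The C++ side then reads the same names back in a fixed order, which is why the name set and ordering must agree between the two halves.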

👥Who it's for

Machine learning engineers and inference optimization specialists who need production-ready TensorRT engines with architectural transparency, custom layer modifications, and integrated pre/post-processing—particularly those deploying computer vision models (object detection, face recognition, image classification) on NVIDIA GPUs at scale.

🌱Maturity & risk

Actively maintained as of Jan 2026 (recent YOLO13/Vision Transformer additions, TensorRT 7-10 SDK support). 4.3M lines of C++, 745K Python, and established CI/pre-commit pipelines indicate production-grade code. However, it's a community reference implementation repo, not an official NVIDIA library, with variable quality across 100+ network implementations.

High fragmentation risk: 100+ independent network folders (yolo*, efficientnet, arcface, etc.) with inconsistent patterns and potential API skew across TensorRT SDK versions 7-10. Single-maintainer core (wang-xinyu) with community PRs of variable quality. No comprehensive test suite visible; each network folder has ad-hoc validation. CUDA custom plugins (dcnv2Plugin, LayerNormPlugin) add compilation complexity.

Active areas of work

Recent activity (Jan-Mar 2026): YOLOv13/YOLO12/YOLO11 variants added; Vision Transformer implementation; refactor of legacy CV models to support TensorRT SDK 7-10 uniformly; first Tripy (TensorRT Python) examples for LeNet. Heavy focus on YOLO variants and object detection; convnextv2 and arcface actively refined.

🚀Get running

`git clone https://github.com/wang-xinyu/tensorrtx.git && cd tensorrtx && pip install opencv-python-headless numpy torch nvtripy`. Then `cd` into a specific model folder (e.g., `yolo5/`) and follow its README.md for PyTorch weight export (`python gen_wts.py`) and TensorRT engine build (`mkdir build && cd build && cmake .. && make && ./yolo`).

Daily commands are model-specific; pick a folder like `yolo5/`: (1) `python gen_wts.py` to export PyTorch weights to `yolo5.wts`, (2) `mkdir build && cd build && cmake .. && make` to compile the C++ engine builder, (3) `./yolo -s ../yolo5.wts yolo5.engine m` to serialize the TensorRT engine, (4) `./yolo -d yolo5.engine ../sample/input.jpg` to run inference. See individual README.md files for exact parameters.

🗺️Map of the codebase

  • alexnet/alexnet.cc: Simplest complete example: demonstrates full workflow from loading .wts weights to building INetworkDefinition to serializing TensorRT engine—ideal onboarding reference.
  • alexnet/gen_wts.py: Template for exporting PyTorch model weights to plaintext .wts format; shows OrderedDict iteration and binary serialization pattern used across all models.
  • yolo5/yolo.cc: Production-scale example with post-processing (NMS), batched inference, and multi-GPU support; demonstrates IExecutionContext usage and tensor marshaling.
  • convnextv2/src/LayerNormPlugin.h: Reference custom CUDA plugin interface; shows how to wrap unsupported layers (LayerNorm) via IPlugin API for ops not in TensorRT native set.
  • centernet/dcnv2Plugin/dcnv2Plugin.cpp: Complex CUDA plugin (Deformable Convolution v2); demonstrates kernel binding, serialization, and multi-GPU plugin lifecycle.
  • CMakeLists.txt: Boilerplate CMake configuration for TensorRT, CUDA, OpenCV discovery and linking; copy-paste template for new models.

🛠️How to make changes

Start with a reference model folder (alexnet/ or yolo5/ are simplest). (1) Modify gen_wts.py to export different PyTorch model variants. (2) Edit the .cc/.cpp file's network building code (around ILayer* layer = ...) to add/remove/modify layers. (3) Update CMakeLists.txt if new .cu files added. (4) For custom CUDA ops, create a new plugin directory (see convnextv2/src/LayerNormPlugin.cu as template) and register in the main .cpp via IPluginCreator.

🪤Traps & gotchas

(1) The .wts weight file is plaintext (a tensor-count line, then per-tensor name, length, and hex-encoded float words); gen_wts.py must match the .cc file's expected tensor names and order exactly, or inference silently produces garbage. (2) TensorRT engine files (.engine) are GPU-architecture-specific; an engine built on an RTX 3090 won't run on an A100 without a rebuild. (3) Some model folders have stale code; use the trt10 branch for TensorRT 10.x compatibility. (4) Custom CUDA plugins require a matching CUDA Compute Capability (SM_75, SM_80, etc.) set in CMakeLists.txt; a wrong value fails silently at runtime. (5) ONNX/UFF parsers are intentionally not used: this repo assumes you already have .wts weights; if you only have .onnx, you need external conversion tooling.
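
Trap (4) can at least be screened for mechanically. A hedged sketch that scans CMakeLists.txt text for compute-capability settings; the flag spellings vary across model folders, so these regexes are heuristics rather than a complete parser:

```python
import re

def cuda_archs(cmake_text):
    # Heuristic patterns for the common ways an SM target shows up in
    # this repo's CMake files: -gencode arch=compute_NN,code=sm_NN or
    # set(CMAKE_CUDA_ARCHITECTURES ...). Not exhaustive.
    pats = [r"sm_(\d+)", r"compute_(\d+)", r"CUDA_ARCHITECTURES\s+([\d;]+)"]
    found = []
    for p in pats:
        found += re.findall(p, cmake_text)
    return found

# Hypothetical one-line CMake fragment:
sample = 'set(CMAKE_CUDA_FLAGS "-gencode arch=compute_75,code=sm_75")'
archs = cuda_archs(sample)
```

Comparing the result against your GPU's actual compute capability before building turns a silent runtime failure into a loud pre-build check.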

💡Concepts to learn

  • NVIDIA/TensorRT — Official NVIDIA TensorRT repository; TensorRTx builds directly on its C++ API and serves as a reference implementation library for that API.
  • onnx/onnx-tensorrt — ONNX parser for TensorRT—the alternative approach that TensorRTx explicitly avoids; understand this to appreciate TensorRTx's flexibility tradeoff.
  • wang-xinyu/pytorchx — Sister repo by same author; provides PyTorch implementations of networks exported by gen_wts.py scripts in TensorRTx; tight coupling for weight extraction pipeline.
  • Ultralytics/yolov5 — Official YOLOv5 PyTorch repo; TensorRTx's yolo5/ folder depends on this for model definitions and weight export baseline.
  • NVIDIA/TensorRT-Incubator — Newer TensorRT experimentation (Tripy, nvtripy package); TensorRTx is adopting Tripy for future Python-native model definitions as seen in recent LeNet example.

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add CMake integration tests for model build workflows

The repo has CMakeLists.txt files across multiple model directories (alexnet, arcface, convnextv2, crnn, csrnet, dbnet, etc.) but lacks automated CI validation that these builds actually succeed. Currently, .github/workflows/pre-commit.yml only checks formatting. Adding a workflow to test CMake builds for at least 2-3 representative models (small ones like alexnet and crnn to keep CI fast) would catch build breakages early. This is high-value because contributors regularly modify C++ code and CMake configs, and build failures can go unnoticed until deployment.

  • [ ] Create .github/workflows/cmake-build-test.yml to test CMake builds
  • [ ] Test at minimum: alexnet/CMakeLists.txt, crnn/CMakeLists.txt, convnextv2/CMakeLists.txt builds
  • [ ] Verify the workflow runs on pull requests and reports failures clearly
  • [ ] Document in contributing.md which models are tested in CI and why

Consolidate and document custom CUDA plugins (LayerNormPlugin, dcnv2Plugin, prelu)

The repo has custom CUDA kernels scattered across different model directories: convnextv2/src/LayerNormPlugin.cu, centernet/dcnv2Plugin/dcn_v2_im2col_cuda.cu, and arcface/prelu.cu. These are duplicated or specialized implementations that could benefit from centralized documentation and a shared plugin base class pattern. Create a new plugins/ directory with documented plugin templates and refactor existing plugins to follow consistent structure. This reduces maintenance burden and makes it easier for contributors to add new custom layers.

  • [ ] Create plugins/ directory with base plugin class and CMakeLists.txt
  • [ ] Document plugin interface requirements in plugins/README.md with examples from LayerNormPlugin and dcnv2Plugin
  • [ ] Refactor convnextv2/src/LayerNormPlugin.* and centernet/dcnv2Plugin/* to inherit from base class
  • [ ] Update arcface/prelu.cu to follow the same pattern for consistency

Add weight file format documentation and validation utility

Multiple model directories have gen_wts.py scripts (alexnet/gen_wts.py, arcface/gen_wts.py, csrnet/gen_wts.py, etc.) that export weights to .wts files, but the binary format is undocumented. This creates friction for contributors who need to understand or modify weight export/import. Create a utils/wts_format.md documenting the .wts binary format, and add a python validation script (utils/validate_wts.py) that can verify .wts file integrity and dump metadata. This reduces debugging time when weight loading fails.

  • [ ] Create utils/wts_format.md documenting the .wts file binary structure (header, data types, offsets, checksums if any)
  • [ ] Create utils/validate_wts.py script that can read/validate .wts files and report statistics
  • [ ] Test validate_wts.py against existing generated .wts files from at least 2 models (alexnet, arcface)
  • [ ] Update relevant model READMEs to reference the format documentation
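
A starting point for the proposed validation script might look like this. It assumes the text-based layout the repo's gen_wts.py scripts emit (count line, then `name length hex…` per tensor); the function name, error messages, and demo file are all hypothetical.

```python
import struct

def validate_wts(path):
    # Parse a .wts file and return {name: [float, ...]}, raising on any
    # mismatch between declared and actual counts. Assumes the text
    # layout written by the repo's gen_wts.py scripts.
    with open(path) as f:
        declared = int(f.readline())
        tensors = {}
        for line in f:
            if not line.strip():
                continue
            parts = line.split()
            name, count, words = parts[0], int(parts[1]), parts[2:]
            if len(words) != count:
                raise ValueError(f"{name}: declared {count} values, found {len(words)}")
            tensors[name] = [struct.unpack(">f", bytes.fromhex(w))[0] for w in words]
    if len(tensors) != declared:
        raise ValueError(f"header declared {declared} tensors, parsed {len(tensors)}")
    return tensors

# Demo against a tiny hand-written file:
with open("sample.wts", "w") as f:
    f.write("2\nconv1.weight 2 3f000000 bf800000\nconv1.bias 1 00000000\n")
tensors = validate_wts("sample.wts")
```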

🌿Good first issues

  • Add unit tests for weight export in gen_wts.py across all model variants (alexnet, yolo5, efficientnet)—currently each folder validates manually; could write Python test cases that verify exported .wts integrity.
  • Document CUDA Compute Capability requirements and verification steps in top-level README.md and each model README—currently unclear which SM versions are supported; add a troubleshooting table.
  • Refactor shared utilities (logging.h, macros.h, utils.h) duplicated in 30+ model folders into a single tensorrtx/common/ directory and update CMakeLists.txt includes—reduces maintenance burden.
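
The third idea is easy to scope before writing any PR. A sketch that inventories identical copies of the shared headers by content hash, so the consolidation can start with the files that are byte-for-byte duplicates (the demo tree is illustrative):

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path

def find_duplicates(root, filenames=("logging.h", "macros.h", "utils.h")):
    # Group every copy of the named utility headers under root by
    # SHA-256 of their contents; identical copies share one bucket.
    groups = defaultdict(list)
    for name in filenames:
        for path in Path(root).rglob(name):
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[(name, digest)].append(str(path))
    return groups

# Tiny throwaway demo tree: folders a/ and b/ carry identical copies.
demo = Path(tempfile.mkdtemp())
for sub in ("a", "b"):
    (demo / sub).mkdir()
    (demo / sub / "logging.h").write_text("// shared logger\n")
groups = find_duplicates(demo, filenames=("logging.h",))
```

Run from a clone root, buckets with more than one path are safe to collapse into a single `common/` copy; buckets of one need a diff first.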


📝Recent commits

  • 2990f34 — add Vision Transformer (#1709) (zgjja)
  • 076a8af — fix C++ lint with clang-tidy, improve windows user experience (#1705) (zgjja)
  • e29066e — Yolo26-Cls Added (#1704) (fazligorkembal)
  • 664f222 — Update README.md (#1703) (fazligorkembal)
  • 5390258 — Yolo26 detection and obb task support (#1701) (fazligorkembal)
  • 3276610 — Refactor multiple old CV models (#1688) (zgjja)
  • 00d67c0 — add the support of ConvNextV2 in TensorRT8 (#1689) (daydreamer0521)
  • 7886fb3 — style: fix clang-format and end-of-file issues in yolov13 (#1687) (ydk61)
  • b87f66a — Update config.h (lindsayshuo)
  • 8bf7a8f — Update README.md (lindsayshuo)

🔒Security observations

  • High · Unversioned Dependencies in requirements — Dependencies/Package file content. The dependencies file uses unpinned version specifiers (e.g., 'nvtripy>=0.1.1', 'opencv-python-headless', 'numpy', 'torch'). This allows installation of any newer version, including those with security vulnerabilities or breaking changes. The use of '>=' without upper bounds increases supply chain risk. Fix: Pin all dependencies to specific versions using '==' operator. Example: 'nvtripy==0.1.1', 'opencv-python-headless==4.8.0.74', 'numpy==1.24.3', 'torch==2.0.1'. Implement dependency scanning tools like pip-audit or safety to detect known vulnerabilities.
  • High · Custom Package Index Without Verification — Dependencies/Package file content. The requirements file specifies a custom package index (-f https://nvidia.github.io/TensorRT-Incubator/packages.html) without HTTPS certificate pinning or package signature verification. This increases vulnerability to man-in-the-middle attacks or compromised package index attacks. Fix: Use PEP 440 compatible package URLs with hash verification. Implement pip's '--require-hashes' flag. Example: 'nvtripy==0.1.1 --hash=sha256:...' Verify NVIDIA's certificate pinning and consider using a private package mirror with access controls.
  • Medium · Missing Security Policy Documentation — Repository root. No SECURITY.md or security policy document found in the repository. This makes it difficult for security researchers to responsibly disclose vulnerabilities. Fix: Create a SECURITY.md file following the GitHub security policy template. Include responsible disclosure guidelines, contact information, and security update procedures.
  • Medium · CUDA Plugin Code Lacks Input Validation — centernet/dcnv2Plugin/, convnextv2/src/LayerNormPlugin.cu, arcface/prelu.cu. Multiple CUDA plugin files (.cu files) are present but file structure suggests potential for unvalidated user input processing in compute kernels. Without visible bounds checking in dcnv2Plugin and LayerNormPlugin, buffer overflow or memory corruption vulnerabilities may exist. Fix: Implement strict input validation for all plugin parameters. Add bounds checking for tensor dimensions and sizes. Conduct memory safety analysis using tools like AddressSanitizer or MemorySanitizer on CUDA code. Document plugin API contracts and validate assumptions at runtime.
  • Medium · Pre-commit Configuration Exists But Enforcement Unclear — .pre-commit-config.yaml. A .pre-commit-config.yaml file exists, suggesting pre-commit hooks are used. However, without seeing the actual configuration content, enforcement level and completeness of security checks (linting, vulnerability scanning) cannot be verified. Fix: Ensure the pre-commit configuration includes security scanning hooks such as: bandit (Python security), truffleHog (secret detection), semgrep (SAST), clang-analyzer (C++ analysis). Document enforcement in CONTRIBUTING.md and make hooks mandatory through branch protection rules.
  • Medium · Model Weights and Data Files Not Secured — alexnet/, arcface/, centernet/, etc. (gen_wts.py and implicit .wts files). Repository structure suggests storage of model weights (.wts files implied by gen_wts.py scripts across multiple directories). Model poisoning and integrity verification mechanisms are not evident from the file structure. Fix: Implement cryptographic integrity verification for all model weight files using SHA-256 or SHA-512 hashes. Document hash values in a signed CHECKSUMS file. Use digital signatures with GPG for model releases. Add validation in loading code to verify model integrity at runtime.
  • Low · Missing .gitignore Validation — .gitignore. While a .gitignore file exists, without viewing its contents, there's risk that sensitive files (credentials, API keys, temporary files) could be accidentally committed. Fix: Review .gitignore to ensure it includes common sensitive patterns: *.env, *.key, *.pem, *.pth (model files), .onnx, .vscode/settings.json, IDE configuration files. Add rule: '.wts' if weights files are generated locally. Consider using git-secrets or detect-secrets hooks.

LLM-derived; treat as a starting point, not a security audit.
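
For the model-weight integrity observation above, a minimal streaming checksum helper is enough to get started; the CHECKSUMS layout shown in the trailing comment is hypothetical.

```python
import hashlib

def sha256_file(path, chunk=1 << 20):
    # Stream the file in 1 MiB chunks so large .wts/.engine files
    # don't need to fit in memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

# Demo on a tiny throwaway file:
with open("demo.bin", "wb") as f:
    f.write(b"hello")
digest = sha256_file("demo.bin")

# Hypothetical CHECKSUMS generation:
# for p in ["yolo5.wts", "yolo5.engine"]:
#     print(f"{sha256_file(p)}  {p}")
```

Loading code can then recompute and compare the digest before deserializing, turning silent weight tampering or corruption into an explicit failure.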


Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.
