huggingface/text-embeddings-inference
A blazing fast inference solution for text embeddings models
Healthy across the board
- Permissive license, no critical CVEs, actively maintained — safe to depend on.
- Has a license, tests, and CI — clean foundation to fork and modify.
- Documented and popular — useful reference codebase to read through.
- No critical CVEs, sane security posture — runnable as-is.
- ✓ Last commit 1w ago
- ✓ 23+ active contributors
- ✓ Distributed ownership (top contributor 45% of recent commits)
- ✓ Apache-2.0 licensed
- ✓ CI configured
- ✓ Tests present
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Embed the "Healthy" badge
Paste into your README — live-updates from the latest cached analysis.
[](https://repopilot.app/r/huggingface/text-embeddings-inference)
Paste at the top of your README.md — renders inline like a shields.io badge.
Preview social card (1200×630)
This card auto-renders when someone shares https://repopilot.app/r/huggingface/text-embeddings-inference on X, Slack, or LinkedIn.
Onboarding doc
Onboarding: huggingface/text-embeddings-inference
Generated by RepoPilot · 2026-05-09 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in "Verify before trusting" below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
- Treat the "AI · unverified" sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/huggingface/text-embeddings-inference shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across the board
- Last commit 1w ago
- 23+ active contributors
- Distributed ownership (top contributor 45% of recent commits)
- Apache-2.0 licensed
- CI configured
- Tests present
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live huggingface/text-embeddings-inference
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/huggingface/text-embeddings-inference.
What it runs against: a local clone of huggingface/text-embeddings-inference — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in huggingface/text-embeddings-inference | Confirms the artifact applies here, not a fork |
| 2 | License is still Apache-2.0 | Catches relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 37 days ago | Catches sudden abandonment since generation |
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of huggingface/text-embeddings-inference. If you don't
# have one yet, run these first:
#
# git clone https://github.com/huggingface/text-embeddings-inference.git
# cd text-embeddings-inference
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }
# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
echo "FAIL: not inside a git repository. cd into your clone of huggingface/text-embeddings-inference and re-run."
exit 2
fi
# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "huggingface/text-embeddings-inference(\.git)?\b" \
  && ok "origin remote is huggingface/text-embeddings-inference" \
  || miss "origin remote is not huggingface/text-embeddings-inference (artifact may be from a fork)"
# 2. License matches what RepoPilot saw
(grep -qiE "^(Apache-2\\.0)" LICENSE 2>/dev/null \\
|| grep -qiE "\"license\"\\s*:\\s*\"Apache-2\\.0\"" package.json 2>/dev/null) \\
&& ok "license is Apache-2.0" \\
|| miss "license drift — was Apache-2.0 at generation time"
# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
  && ok "default branch main exists" \
  || miss "default branch main no longer exists"
# 4. Critical files exist
test -f "Cargo.toml" \\
&& ok "Cargo.toml" \\
|| miss "missing critical file: Cargo.toml"
test -f "backends/candle/src/lib.rs" \\
&& ok "backends/candle/src/lib.rs" \\
|| miss "missing critical file: backends/candle/src/lib.rs"
test -f "backends/candle/src/models/mod.rs" \\
&& ok "backends/candle/src/models/mod.rs" \\
|| miss "missing critical file: backends/candle/src/models/mod.rs"
test -f "router/src/main.rs" \\
&& ok "router/src/main.rs" \\
|| miss "missing critical file: router/src/main.rs"
test -f "backends/core/src/lib.rs" \\
&& ok "backends/core/src/lib.rs" \\
|| miss "missing critical file: backends/core/src/lib.rs"
# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 37 ]; then
ok "last commit was $days_since_last days ago (artifact saw ~7d)"
else
miss "last commit was $days_since_last days ago — artifact may be stale"
fi
echo
if [ "$fail" -eq 0 ]; then
echo "artifact verified (0 failures) — safe to trust"
else
echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/huggingface/text-embeddings-inference"
exit 1
fi
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
⚡TL;DR
Text Embeddings Inference (TEI) is a production-grade Rust runtime for deploying and serving open-source text embedding models (BERT, E5, GTE, FlagEmbedding, etc.) with extreme performance optimizations. It combines Candle (HuggingFace's ML framework), Flash Attention, cuBLASLt, and token-based dynamic batching to achieve 10-100x throughput gains over standard inference frameworks, with support for CPU, GPU (CUDA/ROCm), and Apple Silicon via Metal. Monorepo (Cargo workspace) with clear layering: backends/ houses pluggable inference engines (Candle in backends/candle/src/models/, ONNX in backends/ort/), core/ contains the core embedding trait and model abstraction, router/ implements the HTTP server and token batching logic, and backends/grpc-client/ provides async gRPC bindings. Each backend has dedicated layer implementations (Flash Attention, RoPE rotary embeddings, cuBLASLt linear layers) under backends/candle/src/layers/.
👥Who it's for
ML infrastructure engineers and SREs deploying embedding models at scale (RAG systems, semantic search, reranking pipelines) who need sub-100ms latencies, serverless-friendly cold starts, and minimal resource footprint. Also HuggingFace ecosystem contributors extending model support.
🌱Maturity & risk
Production-ready and actively maintained. The repo has 2.7k+ GitHub stars, a clean CI/CD pipeline (linting, integration tests, matrix builds across CUDA/ROCm/CPU), comprehensive Docker multi-architecture support (x86-64, ARM64, CUDA variants), and is versioned at 1.9.3 in a workspace with 8 coordinated crates. Last activity visible in workflows indicates ongoing work.
Moderate risk: large Rust codebase (1M+ LoC) with deep CUDA/numerical dependencies (cudarc, Flash Attention, cublaslt) that require exact version alignment—breaking changes in CUDA driver or Candle upstream could cascade. Single primary org (HuggingFace) for maintenance, though community contributions are welcome. Heavy dependency on cutting-edge optimization libraries means experimental features (ROCm, Intel MKL) may lag behind CUDA stability.
Active areas of work
Active development on model support expansion (recent issues note new-model-addition process), optimization of inference pipelines (likely ALiBi and RoPE position encoding variants), and multi-backend parity (ONNX and Candle backends). CI workflows (build.yaml, integration-test.yaml) suggest continuous testing across hardware targets. Docker image variants (CUDA, ARM64, Intel) are being maintained.
🚀Get running
Clone and build with Rust cargo: git clone https://github.com/huggingface/text-embeddings-inference && cd text-embeddings-inference && cargo build --release. Or use Docker: docker run ghcr.io/huggingface/text-embeddings-inference:cpu (see Dockerfile for runtime args). For development: make targets are available (check Makefile for lint/test/build recipes).
Daily commands:
Production: docker run -p 8080:8080 ghcr.io/huggingface/text-embeddings-inference:latest-gpu --model-id BAAI/bge-base-en-v1.5. Local dev: cargo run -p router --release -- --model-id BAAI/bge-base-en-v1.5. Swagger API docs available at http://localhost:8080/docs (check .github/workflows/build_documentation.yml for doc generation).
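For a quick smoke test from code rather than curl, the sketch below posts to the server's embedding route. It assumes the default port 8080 and the POST /embed request shape documented in the README ({"inputs": ...}); verify both against the TEI version you are running.

```rust
// Minimal client sketch for a running TEI server.
// Assumptions: server on localhost:8080, POST /embed accepting {"inputs": ...}.
// Cargo.toml: reqwest = { version = "0.12", features = ["blocking", "json"] }, serde_json = "1"
use serde_json::{json, Value};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::blocking::Client::new();
    let resp: Value = client
        .post("http://localhost:8080/embed")
        .json(&json!({ "inputs": "What is Deep Learning?" }))
        .send()?
        .error_for_status()?
        .json()?;
    // The response is a batch of embedding vectors; print the dimension of the first one.
    if let Some(first) = resp.as_array().and_then(|v| v.first()).and_then(|e| e.as_array()) {
        println!("embedding dimension: {}", first.len());
    }
    Ok(())
}
```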
🗺️Map of the codebase
- Cargo.toml — Workspace root configuration defining all members (backends, router, core) and shared dependencies; critical for understanding project structure and build configuration
- backends/candle/src/lib.rs — Entry point for the Candle backend implementation; exports the model loading and inference logic that powers the core embedding pipeline
- backends/candle/src/models/mod.rs — Model registry and factory pattern implementation; orchestrates which model type (BERT, Jina, Mistral, etc.) to instantiate based on config
- router/src/main.rs — HTTP server entry point; defines the REST API routes, request/response handling, and the integration between router and inference backends
- backends/core/src/lib.rs — Shared abstractions and traits across all backends (embeddings, pooling strategies, reranking); defines the contract between backends and router
- backends/candle/src/layers/mod.rs — Optimized layer implementations (Flash Attention, layer normalization, rotary embeddings); performance-critical CUDA/CPU optimizations
- README.md — Project overview, supported models list, deployment guides, and performance benchmarks; essential context for new contributors
🛠️How to make changes
Add a New Transformer Model
- Create a new model implementation file in backends/candle/src/models/ (e.g., flash_llama.rs) with a struct implementing the model architecture and forward pass (backends/candle/src/models/flash_llama.rs)
- Add the model variant to the Model enum in mod.rs and implement a load() factory method to instantiate it from safetensors weights (backends/candle/src/models/mod.rs); a hedged sketch follows this list
- Implement a forward() method calling the model's encode() to produce embeddings, then apply a pooling strategy (mean, max, CLS token, etc.) (backends/candle/src/models/flash_llama.rs)
- Add an integration test in backends/candle/tests/ that loads the model, runs batch inference, and validates a snapshot against expected outputs (backends/candle/tests/common.rs)
- Update the README.md Supported Models section and optionally add model-specific Docker build flags if custom CUDA compute capabilities are required (README.md)
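A rough Rust sketch of the enum-plus-factory wiring described above. Every name here (FlashLlama, Config, load, embed, Model) is an illustrative placeholder, not TEI's actual signatures; match them to the real enum and trait in backends/candle/src/models/mod.rs before copying anything.

```rust
// Hypothetical wiring of a new model into the Candle backend registry.
use candle_core::{Result, Tensor};
use candle_nn::VarBuilder;

pub struct Config {
    pub hidden_size: usize,
}

pub struct FlashLlama {
    // attention blocks, embeddings, config, ...
}

impl FlashLlama {
    // Factory: build the model from safetensors weights exposed through a VarBuilder.
    pub fn load(_vb: VarBuilder, _config: &Config) -> Result<Self> {
        todo!("construct layers from the weight tree")
    }

    // Forward pass: token ids -> hidden states -> pooled embedding
    // (mean / CLS / last-token reduces [seq_len, hidden_size] to [hidden_size]).
    pub fn embed(&self, _input_ids: &Tensor) -> Result<Tensor> {
        todo!("encode then pool")
    }
}

// In mod.rs, add a variant so the loader can dispatch on model_type from config.json.
pub enum Model {
    // ...existing variants...
    FlashLlama(FlashLlama),
}
```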
Add a New Pooling Strategy
- Define a new PoolingStrategy variant in backends/core/src/lib.rs (e.g., WeightedSum, attention-weighted pooling) (backends/core/src/lib.rs); a hedged sketch follows this list
- Implement a match arm in each model's forward() or in a shared pooling module that applies the strategy to hidden states (shape: [seq_len, hidden_size]) (backends/candle/src/models/mod.rs)
- Add a request parameter in the router so clients can specify the pooling strategy via an API query param or request body (router/src/main.rs)
- Add test cases verifying pooling produces the correct output shape [batch_size, hidden_size] across different sequence lengths (backends/candle/tests/common.rs)
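A minimal sketch of a pooling variant and its match arm, assuming per-sequence hidden states shaped [seq_len, hidden_size]. The enum name, variants, and function are illustrative; the real strategy type lives in backends/core/src/lib.rs.

```rust
// Hypothetical pooling sketch; not TEI's actual trait or enum.
use candle_core::{Result, Tensor};

pub enum Pool {
    Cls,
    Mean,
    LastToken, // example of a new variant being added
}

/// Reduce per-token hidden states [seq_len, hidden_size] to one embedding [hidden_size].
pub fn pool(hidden: &Tensor, strategy: &Pool) -> Result<Tensor> {
    match strategy {
        Pool::Cls => hidden.get(0),
        Pool::Mean => hidden.mean(0),
        Pool::LastToken => hidden.get(hidden.dim(0)? - 1),
    }
}
```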
Optimize a Layer with Flash Attention or CUDA Kernel
- Create a new CUDA kernel or integrate an existing optimization in backends/candle/src/layers/ (e.g., layers/fused_attn.rs) (backends/candle/src/layers/fused_attn.rs)
- Wrap the kernel in a Rust function taking Tensor inputs and returning Tensor outputs; add #[cfg(feature = "cuda")] guards if it is CUDA-specific (backends/candle/src/layers/fused_attn.rs); a feature-gating sketch follows this list
- Integrate it into the model forward pass (e.g., flash_bert.rs) by calling the optimized layer instead of standard Candle operations (backends/candle/src/models/flash_bert.rs)
- Update build.rs to compile the kernel with nvcc if needed; add a feature flag in Cargo.toml for conditional compilation (backends/candle/build.rs)
- Benchmark and validate correctness: run integration tests and compare latency/throughput before/after (backends/candle/tests/common.rs)
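The feature-gating pattern from step 2, sketched with a hypothetical fused_attention function: the CUDA path is a stub and the fallback uses standard Candle ops. Function names and module layout are assumptions, not TEI's actual code.

```rust
// Illustrative pattern for gating an optimized kernel behind a Cargo feature,
// with a portable fallback implementation.
use candle_core::{Result, Tensor};

#[cfg(feature = "cuda")]
fn fused_attention(q: &Tensor, k: &Tensor, v: &Tensor) -> Result<Tensor> {
    // Dispatch to the custom CUDA kernel / flash-attention binding here.
    todo!("call the CUDA implementation")
}

#[cfg(not(feature = "cuda"))]
fn fused_attention(q: &Tensor, k: &Tensor, v: &Tensor) -> Result<Tensor> {
    // Reference implementation with standard Candle ops: softmax(QK^T / sqrt(d)) V.
    let d = q.dim(candle_core::D::Minus1)? as f64;
    let scores = (q.matmul(&k.t()?)? / d.sqrt())?;
    let probs = candle_nn::ops::softmax_last_dim(&scores)?;
    probs.matmul(v)
}
```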
🔧Why these technologies
- Candle (Hugging Face Rust ML framework) — Native Rust inference with zero Python overhead; tight CUDA integration and custom kernel support for Flash Attention and optimized linear layers; enables compiled binary deployment
- NVIDIA CUDA + cuBLASLt + Flash Attention v2 — Achieves 10-100× speedup over CPU for transformer inference; Flash Attention reduces memory I/O bottleneck; cuBLASLt provides vendor-optimized matrix multiplication
- Tokio async runtime — Handles concurrent HTTP requests without blocking; enables efficient batching of embeddings across multiple client requests
- Safetensors format — Safe, fast deserialization of model weights; zero-copy memory mapping; standard in Hugging Face ecosystem
- Multiple backends (Candle, ORT, Python) — Flexibility for users: Candle for native performance, ORT for ONNX models, Python for PyTorch compatibility; trade-off between speed and model coverage
⚖️Trade-offs already made
- Native Rust (Candle) as primary backend instead of Python/PyTorch
  - Why: Eliminates GIL contention, enables true async concurrency, reduces memory footprint, ships as a single binary
  - Consequence: Requires reimplementation of models; does not support arbitrary Python model code; steeper learning curve for PyTorch users
- Flash Attention as opt-in optimization (feature flag)
  - Why: Maximizes inference speed on NVIDIA GPUs with compute capability >= 8.0
  - Consequence: Falls back to standard attention on older GPUs or non-NVIDIA hardware; requires separate CUDA kernel compilation
🪤Traps & gotchas
- CUDA/cudarc version pinning: Cargo.toml patches cudarc to a specific Git commit; upgrading without a matching CUDA driver version will cause runtime segfaults.
- Model weight format: models must be in Safetensors or ONNX format; HuggingFace transformers pickles will fail silently.
- Token pool saturation: the max total tokens across the batch pool (--max-total-tokens) is separate from max-batch-size; exceeding it causes request queuing/timeouts.
- Environment vars: HUGGING_FACE_HUB_TOKEN is required for gated models; CUDA_HOME and CUDA_PATH must be set for GPU builds.
- ALiBi vs RoPE vs absolute positional embeddings: different model families use different position encodings; mismatching model code to backend position logic causes NaN embeddings.
- Flash Attention compute capability: requires a GPU with sm_70+ (Volta+); older hardware falls back to slower implementations without warning.
💡Concepts to learn
- Flash Attention v2 — Core optimization that cuts attention memory traffic from O(n²) to O(n) via IO-aware tiling, sharply reducing embedding inference latency; TEI's flash_attn.rs implements this kernel, critical for sequence lengths >256 tokens
- Token-Based Dynamic Batching — TEI batches variable-length sequences by consuming a fixed token budget (e.g., 512 tokens max per batch) rather than padded examples; allows 10-100x throughput gains on realistic workloads with heterogeneous lengths
- RoPE (Rotary Position Embeddings) & ALiBi (Attention with Linear Biases) — TEI supports multiple position encoding schemes (absolute, ALiBi in alibi.rs, RoPE in rotary.rs); choosing wrong encoding for a model produces garbage embeddings; essential to understand for model porting
- cuBLASLt (Batch GEMM via cuBLAS) — TEI's layers/linear.rs uses cuBLASLt for fused matrix multiplication on NVIDIA GPUs; provides a 2-3x speedup over naive CUDA kernels via tensor core usage and layout optimization
- Safetensors Format — TEI prefers Safetensors over PyTorch pickles for model weights (safer deserialization, memory-mapped I/O support); loading an incorrect format silently fails or panics in Candle
- Mean Pooling & Normalization Strategies — Different embedding models use different output pooling (mean of token embeddings, CLS token, max pooling) and L2 normalization; TEI's pooling trait in core/src/ abstracts these to compute final embeddings correctly
- Workspace Dependencies & Patch Resolution — Cargo.toml uses [patch.crates-io] to pin cudarc and Candle to specific Git revisions; breaking these pins causes silent incompatibilities; critical for maintainability of the multi-crate monorepo
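To make the token-budget idea above concrete, here is a toy batching loop (not TEI's actual queue code, which lives in the router/core crates): requests are admitted until the next one would exceed the total-token budget or the batch-size cap.

```rust
// Toy illustration of token-budget dynamic batching.
struct Request {
    id: u64,
    n_tokens: usize,
}

/// Drain requests from the front of the queue into one batch, bounded by
/// `max_batch_tokens` (sum of sequence lengths) and `max_batch_size` (count).
fn next_batch(queue: &mut Vec<Request>, max_batch_tokens: usize, max_batch_size: usize) -> Vec<Request> {
    let mut batch = Vec::new();
    let mut tokens = 0;
    while let Some(req) = queue.first() {
        if batch.len() >= max_batch_size || tokens + req.n_tokens > max_batch_tokens {
            break;
        }
        tokens += req.n_tokens;
        batch.push(queue.remove(0));
    }
    batch
}

fn main() {
    let mut queue = vec![
        Request { id: 1, n_tokens: 300 },
        Request { id: 2, n_tokens: 150 },
        Request { id: 3, n_tokens: 200 },
    ];
    let batch = next_batch(&mut queue, 512, 32);
    // With a 512-token budget, requests 1 and 2 fit; request 3 waits for the next batch.
    assert_eq!(batch.iter().map(|r| r.id).collect::<Vec<_>>(), vec![1, 2]);
}
```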
🔗Related repos
- huggingface/candle — Underlying ML framework used by TEI backends for model inference; directly integrated as a workspace dependency
- huggingface/transformers — Source of the model definitions (BERT, RoBERTa, GTE) that TEI reimplements in Rust for inference; referenced for weight format and tokenization compatibility
- vllm-project/vllm — Complementary inference engine for LLM/generative models with a similar dynamic-batching philosophy; TEI focuses on embeddings, vLLM on autoregressive generation
- microsoft/onnxruntime — Alternative inference backend supported by TEI via backends/ort/; ONNX Runtime provides portable model optimization across hardware
- open-telemetry/opentelemetry-rust — Used by TEI for distributed tracing and observability; enables production monitoring of embedding inference pipelines
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add integration tests for all supported embedding models in backends/candle/src/models/
The repo has 15+ embedding model implementations (bert.rs, debertav2.rs, flash_gte.rs, flash_jina.rs, etc.) but there are no visible model-specific integration tests. This is critical for a production inference service where model correctness directly impacts user embeddings quality. Each model should have tests verifying: (1) correct tensor shapes for different batch sizes, (2) numerical stability across different input sequences, (3) consistency between standard and flash variants (e.g., bert.rs vs flash_bert.rs).
- [ ] Create backends/candle/tests/integration_tests.rs with test harness
- [ ] Add parametrized tests for each model in backends/candle/src/models/*.rs covering bs=1, bs=32 batches
- [ ] Verify flash variants produce numerically similar results to standard variants within tolerance
- [ ] Test edge cases: empty sequences, max length sequences, unicode handling
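A sketch of the tolerance check such parity tests would hinge on. Only the comparison helper is shown; the model-loading side is repo-specific (see backends/candle/tests/common.rs) and omitted here.

```rust
// Tolerance comparison for flash-vs-standard parity tests.
fn assert_all_close(a: &[f32], b: &[f32], tol: f32) {
    assert_eq!(a.len(), b.len(), "embedding dimensions differ");
    for (i, (x, y)) in a.iter().zip(b.iter()).enumerate() {
        assert!(
            (x - y).abs() <= tol,
            "index {i}: {x} vs {y} differ by more than {tol}"
        );
    }
}

#[test]
fn close_embeddings_pass_within_tolerance() {
    // Stand-ins for outputs of the standard and flash code paths.
    let standard = vec![0.1012_f32, -0.2301, 0.5554];
    let flash = vec![0.1011_f32, -0.2302, 0.5555];
    assert_all_close(&standard, &flash, 1e-3);
}
```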
Add GitHub Actions workflow for multi-architecture Docker image builds and registry push
The repo has multiple Dockerfiles for different architectures (Dockerfile, Dockerfile-arm64, Dockerfile-cuda, Dockerfile-cuda-all, Dockerfile-intel) but no visible CI workflow in .github/workflows/ that builds, tests, and publishes these images. This is a major gap for a containerized inference service. A workflow should build all variants on each release, run basic smoke tests in containers, and push to Docker Hub/GHCR.
- [ ] Create .github/workflows/docker-build.yaml with matrix strategy for [linux/amd64, linux/arm64]
- [ ] Use docker/build-push-action with conditional registry push on version tags
- [ ] Add smoke test step: pull image, run docker run with sample embedding request via HTTP
- [ ] Document pushed image tags in CONTRIBUTING.md
Add comprehensive backend benchmark suite in new backends/benches/ directory
The README shows impressive benchmark assets (bs1-lat.png, bs32-tp.png) suggesting existing benchmarks exist, but there's no visible benchmarks/ directory in the repo structure for reproducible performance testing. Contributors and users need a standardized way to compare backends (candle vs ort vs python), models, batch sizes, and hardware. This should be a criterion benchmark suite.
- [ ] Create backends/benches/Cargo.toml with criterion dependency
- [ ] Add backends/benches/embedding_throughput.rs benchmarking inference latency across batch_size=[1,8,32]
- [ ] Add backends/benches/memory_usage.rs measuring peak memory for different models
- [ ] Create script to generate benchmark comparison tables (latency/throughput) and update README with reproducibility instructions
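A hedged criterion skeleton matching the checklist above. Criterion's macros and BenchmarkId are real API; the embed_batch function is a placeholder for whichever backend entry point the benchmark ends up exercising.

```rust
// Sketch of a criterion throughput benchmark across batch sizes.
use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion};

fn embed_batch(batch_size: usize) -> Vec<Vec<f32>> {
    // Placeholder: call into the backend under test here.
    vec![vec![0.0; 768]; batch_size]
}

fn embedding_throughput(c: &mut Criterion) {
    let mut group = c.benchmark_group("embedding_throughput");
    for batch_size in [1usize, 8, 32] {
        group.bench_with_input(BenchmarkId::from_parameter(batch_size), &batch_size, |b, &bs| {
            b.iter(|| embed_batch(bs));
        });
    }
    group.finish();
}

criterion_group!(benches, embedding_throughput);
criterion_main!(benches);
```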
🌿Good first issues
- Add unit tests for all layer implementations in backends/candle/src/layers/ (layer_norm.rs, rotary.rs, rms_norm.rs currently lack comprehensive test coverage for edge cases like bfloat16 dtypes and variable sequence lengths)
- Extend model support documentation: create a runnable example notebook in a new examples/ directory showing how to add a custom embedding model (e.g., a new Qwen variant) following the trait API in core/src/ and backends/candle/src/models/
- Add an e2e performance benchmarking harness: create a benches/ directory with criterion benchmarks for common operations (token batching, embedding inference, pooling strategies) on CPU, matching the asset images in assets/ to track performance regressions across releases
⭐Top contributors
- @alvarobartt — 45 commits
- @kozistr — 9 commits
- @michaelfeil — 9 commits
- @vrdn-23 — 6 commits
- @kaixuanliu — 6 commits
📝Recent commits
- 5bc4d88 — chore: bump doc-builder SHA for PR upload workflow (#862) (rtrompier)
- 1588129 — [AMD] Add Instinct GPU setup guide (#856) (Abdennacer-Badaoui)
- f4483f3 — Add dedicated linux/arm64 runner (#858) (alvarobartt)
- 908424d — 🔒 Pin GitHub Actions to commit SHAs (#855) (paulinebm)
- ba265a3 — Add minimal AMD ROCm GPU support (#853) (Abdennacer-Badaoui)
- cc7ca31 — Fix to only build Dockerfile-cuda w/ Blackwell 12.1 on linux/arm64 (#852) (alvarobartt)
- 9dd0c15 — Support harrier-oss-v1 model (#854) (kozistr)
- c14ee47 — feat: update metrics package to support modern rust (#850) (daeho-ro)
- 2e690c2 — feat: multi-arch CUDA Dockerfile and sm_121 (DGX Spark GB10) (#840) (nazq)
- a2c07dd — Add Dockerfile for ARM64 architecture support and update README instructions (#827) (z4y4ts)
🔒Security observations
- High · Incomplete Cargo.toml Dependency Lock — Cargo.toml, [patch.crates-io] section. The Cargo.toml file contains patched dependencies with git references that are truncated in the provided content. The candle dependency patch shows an incomplete git revision hash, which could lead to pulling unverified or unintended code during builds. Fix: ensure all git-based dependencies use complete, verified commit hashes. Review the full Cargo.toml to confirm all patches reference specific, reviewed commits. Consider using signed git tags instead of raw commit hashes.
- High · Insecure Binary Download in Dockerfile — Dockerfile, sccache download step. The Dockerfile downloads the sccache binary from GitHub releases without verifying a checksum or signature, a supply-chain attack vector where a compromised release could execute arbitrary code during the build. Fix: implement checksum verification for downloaded binaries, download from official sources with GPG signature verification, and consider pinning to a specific release digest or using a pre-built image with verified binaries.
- High · Unverified APT Repository Key Installation — Dockerfile, Intel APT key installation. The Dockerfile downloads and installs the Intel APT repository key without verifying the key's fingerprint or authenticity, which could allow MITM attacks to inject malicious packages during dependency installation. Fix: verify the GPG key fingerprint against Intel's official documentation before importing, pin the fingerprint explicitly in the Dockerfile, or use official base images that include verified keys.
- Medium · Use of Generic Latest Tag for Base Image — Dockerfile, FROM statement. The Dockerfile uses 'lukemathwalker/cargo-chef:latest-rust-1.92-bookworm', which includes 'latest'. While the Rust version is pinned, the cargo-chef component may be updated unexpectedly, potentially introducing breaking changes or security issues. Fix: pin the cargo-chef image to a specific digest or dated tag rather than 'latest', and use image scanning tools to verify base image security.
- Medium · Multiple Patch Dependencies on External Git Repositories — Cargo.toml: candle-flash-attn, candle-cublaslt, candle-layer-norm, candle-index-select-cu, candle-rotary, candle-flash-attn-v1. The project relies on multiple patched candle-* packages from GitHub instead of stable crates.io versions, increasing maintenance burden and supply-chain risk if those repositories are compromised or become unavailable. Fix: monitor when these patches can be migrated to official crates.io releases, run regular dependency scans and audits of git-based dependencies, and contribute changes upstream to accelerate stabilization.
- Medium · Hardcoded Version in Dockerfile Environment Variable — Dockerfile, ENV SCCACHE=0.10.0. The sccache version is hardcoded as '0.10.0'; if that version contains vulnerabilities, builds will keep using it unless manually updated. Fix: regularly check for and update sccache versions, allow version override via build arguments, and integrate dependency scanning into the CI/CD pipeline.
- Low · Missing Security Headers and Container Hardening — Dockerfile. The Dockerfile does not include common hardening practices such as running as a non-root user, dropping capabilities, or setting a read-only root filesystem. Fix: add 'RUN useradd -m -u 1000 appuser' and 'USER appuser' before the final stages, use seccomp and AppArmor profiles, and document recommended Docker runtime flags (--security-opt, --cap-drop).
- Low · No SBOM or Supply Chain Metadata — Repository root. The repository does not appear to include a Software Bill of Materials (SBOM) or documented supply-chain verification processes in the provided structure. Fix: generate and maintain an SBOM with tools like cargo-sbom or syft, adopt the SLSA provenance framework for build artifacts, and document the dependency verification process.
LLM-derived; treat as a starting point, not a security audit.
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.