RepoPilot

rhasspy/piper

A fast, local neural text to speech system

Healthy

Healthy across all four use cases

Use as dependencyHealthy

Permissive license, no critical CVEs, actively maintained — safe to depend on.

Fork & modifyHealthy

Has a license, tests, and CI — clean foundation to fork and modify.

Learn fromHealthy

Documented and popular — useful reference codebase to read through.

Deploy as-isHealthy

No critical CVEs, sane security posture — runnable as-is.

  • Last commit 9mo ago
  • 14 active contributors
  • MIT licensed
  • CI configured
  • Tests present
  • Slowing — last commit 9mo ago
  • Concentrated ownership — top contributor handles 67% of recent commits

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Healthy" badge

Paste into your README; the badge updates live from the latest cached analysis.

Variant:
RepoPilot: Healthy
[![RepoPilot: Healthy](https://repopilot.app/api/badge/rhasspy/piper)](https://repopilot.app/r/rhasspy/piper)

Paste at the top of your README.md — renders inline like a shields.io badge.

Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/rhasspy/piper on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: rhasspy/piper

Generated by RepoPilot · 2026-05-09 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

  1. Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
  2. Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
  3. Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/rhasspy/piper shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

🎯Verdict

GO — Healthy across all four use cases

  • Last commit 9mo ago
  • 14 active contributors
  • MIT licensed
  • CI configured
  • Tests present
  • ⚠ Slowing — last commit 9mo ago
  • ⚠ Concentrated ownership — top contributor handles 67% of recent commits

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live rhasspy/piper repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/rhasspy/piper.

What it runs against: a local clone of rhasspy/piper — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in rhasspy/piper | Confirms the artifact applies here, not a fork |
| 2 | License is still MIT | Catches a relicense before you depend on it |
| 3 | Default branch master exists | Catches branch renames |
| 4 | Last commit ≤ 286 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>rhasspy/piper</code></summary>
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of rhasspy/piper. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/rhasspy/piper.git
#   cd piper
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of rhasspy/piper and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "rhasspy/piper(\.git)?\b" \
  && ok "origin remote is rhasspy/piper" \
  || miss "origin remote is not rhasspy/piper (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "^(MIT)" LICENSE 2>/dev/null \
   || grep -qiE "\"license\"\s*:\s*\"MIT\"" package.json 2>/dev/null) \
  && ok "license is MIT" \
  || miss "license drift — was MIT at generation time"

# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \
  && ok "default branch master exists" \
  || miss "default branch master no longer exists"

# 4. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 286 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~256d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/rhasspy/piper"
  exit 1
fi

Each check prints `ok:` or `FAIL:`. The script exits non-zero if anything failed, so it composes cleanly into agent loops (`./verify.sh || regenerate-and-retry`).

</details>

TL;DR

Piper is a fast, local neural text-to-speech (TTS) system that converts text directly to speech audio using ONNX Runtime inference. It synthesizes speech in 40+ languages with pre-trained models and requires no cloud services or external APIs. The core C++ implementation (995K lines) wraps PyTorch models converted to ONNX format for efficient CPU/GPU inference, with Python bindings for easy integration. The repo is a dual-language monorepo: the C++ inference engine under src/cpp/ compiles against ONNX Runtime, the Python wrapper layer (src/piper/) provides the CLI and library interfaces, the training pipeline lives in training/ (Jupyter notebooks), and a Dockerfile handles containerization. Models are stored externally (referenced in VOICES.md); voice datasets and test sentences live in etc/test_sentences/, organized by language code (ar.txt, en.txt, etc.).

👥Who it's for

Developers and Home Assistant contributors building local voice assistants, IoT systems, or embedded TTS applications who need neural-quality speech synthesis without cloud dependency or privacy concerns. Voice automation enthusiasts and integrators using Home Assistant's voice integration layer.

🌱Maturity & risk

Actively maintained but recently transitioned: development has moved to https://github.com/OHF-Voice/piper1-gpl as indicated in the README. This repository shows solid CI/CD setup (.github/workflows/main.yml) and comprehensive multi-language test data (40+ language test sentences in etc/test_sentences/), but the migration suggests this version is stable rather than the active development branch. Production-ready for deployment, but new features go upstream.

Moderate risk due to upstream migration: new issues and PRs should target the OHF-Voice fork, not this repo. Dependency on specific onnxruntime~=1.11.0 and torch~=1.11.0 versions may cause compatibility issues with modern systems. Heavy C++ codebase (995K lines) means compiled binary dependencies; native build tools (CMake, C++ compiler) required. Single-organization stewardship (Rhasspy project) with development fragmentation across two repos.

Active areas of work

Active maintenance in reduced mode—this repository is stable-state with a note that active development moved to OHF-Voice/piper1-gpl. The GitHub Actions workflow (main.yml) likely handles automated builds/releases. No visible open development PRs in snapshot, but TRAINING.md and VOICES.md suggest ongoing model releases and voice additions managed externally.

🚀Get running

git clone https://github.com/rhasspy/piper.git && cd piper && make && pip install -e . (assuming Make and CMake are installed). Docker alternative: docker build -t piper . && docker run -it piper. Check Makefile for actual build targets.

Daily commands: Command-line: piper --model en_US-hfc_female-medium.onnx --output-raw < text.txt > audio.wav (after installation). Programmatic (Python): import piper; synthesizer = piper.PiperTTS(model_path); audio = synthesizer.synthesize(text). Docker: docker run piper --text 'hello' --output audio.wav. See Makefile for dev build targets (likely make build, make install, make test).
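The command-line invocation above can be wrapped for scripting. This is a sketch, not part of Piper's API: the flag names (`--model`, `--output-raw`) are taken from this artifact and should be confirmed against `piper --help` in your install.

```python
import subprocess
from pathlib import Path

def piper_args(model: str) -> list[str]:
    """Build the CLI argument list described above (flags assumed from this artifact)."""
    return ["piper", "--model", model, "--output-raw"]

def synthesize(model: str, text: str, out_path: str) -> Path:
    """Pipe text into the piper CLI and capture the raw audio stream to a file."""
    out = Path(out_path)
    with out.open("wb") as fh:
        subprocess.run(
            piper_args(model),
            input=text.encode("utf-8"),
            stdout=fh,
            check=True,  # raise if piper exits non-zero (e.g. missing model)
        )
    return out
```

Keeping the argument construction in a separate pure function makes the wrapper testable without the binary installed.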

🗺️Map of the codebase

  • CMakeLists.txt: Defines C++ compilation for ONNX Runtime bindings and phoneme synthesis engine
  • src/piper/main.py: Entry point for CLI argument parsing and main text-to-speech synthesis pipeline
  • src/cpp/piper.cpp: Core C++ inference wrapper calling ONNX Runtime models and audio synthesis
  • VOICES.md: Registry of all available pre-trained models, their languages, and download URLs
  • TRAINING.md: Documentation for training custom voices and converting PyTorch models to ONNX
  • .github/workflows/main.yml: CI/CD pipeline for automated builds, tests, and releases across platforms
  • Dockerfile: Containerized deployment configuration for reproducible multi-platform inference
  • etc/test_sentences/: Multi-language test data (40+ languages) for validating synthesis quality and phoneme handling

🛠️How to make changes

Voice model changes: add test sentences to etc/test_sentences/{lang_code}.txt and reference in VOICES.md. C++ inference changes: edit src/cpp/ and rebuild with CMake. Python CLI/API changes: modify src/piper/ (likely main.py for CLI, core TTS logic in synthesizer modules). New language support: add phoneme sets, training data, and register in language config (likely in src/piper/). Training: see TRAINING.md for the full pipeline using Jupyter notebooks in training/.

🪤Traps & gotchas

onnxruntime~=1.11.0 and torch~=1.11.0 are pinned to specific versions; newer system Python/pip may fail constraint resolution. The C++ build requires CMake 3.10+, a native C++ compiler, and ONNX Runtime development headers (which may need a separate install). Voice models (.onnx files) are downloaded separately and not stored in the repo; a missing model path causes runtime errors. Development has moved to OHF-Voice/piper1-gpl, so issues filed on this repo may not get attention. Phoneme processing is language-specific: adding a language requires defining a phoneme set, not just adding text data.
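The missing-model-path trap above can be caught before inference rather than surfacing as an opaque runtime error. A hypothetical guard, not part of Piper's API (models are listed in VOICES.md and downloaded separately):

```python
from pathlib import Path

def resolve_model(path: str) -> Path:
    """Fail fast with a clear message if the voice model is absent.

    Hypothetical helper: checks that the path exists and looks like an
    ONNX model before handing it to the inference engine.
    """
    p = Path(path)
    if not p.is_file() or p.suffix != ".onnx":
        raise FileNotFoundError(
            f"voice model not found: {p}; download the .onnx file per VOICES.md"
        )
    return p
```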

💡Concepts to learn

  • ONNX Runtime Inference — Piper's entire inference engine is built on ONNX Runtime for cross-platform, hardware-optimized model execution; understanding ONNX format and onnxruntime API is essential to contribute to the C++ layer
  • Text-to-Phoneme Conversion — Piper converts language-specific text to phoneme sequences before neural vocoder synthesis; phoneme handling varies by language and is a core customization point for new languages
  • Neural Vocoder — Piper uses neural vocoder models (trained with PyTorch, exported to ONNX) to convert phoneme-derived features into raw audio waveforms; understanding vocoder architecture is key to model training
  • Model Quantization / ONNX Model Conversion — Training uses PyTorch but production uses ONNX-quantized models for inference speed and size reduction; TRAINING.md covers conversion process critical for custom voice development
  • Multi-Language Phoneme Sets — Piper supports 40+ languages each with distinct phoneme inventories; language support requires defining phoneme sets and trained models, visible in test sentence organization (ar.txt, en.txt, etc.)
  • JSONL Test Format — Test sentences stored as JSONL (test_*.jsonl files) with metadata for validating synthesis quality; understanding this format is needed for adding test cases or validation scripts
  • CMake Cross-Compilation — Piper's build uses CMake for portable C++ compilation across Linux, macOS, Windows; contributing to inference engine requires familiarity with CMake build configuration
  • coqui-ai/TTS — Drop-in alternative neural TTS system with broader model architecture support and active development community; Coqui is more feature-rich but less lightweight than Piper
  • mozilla/TTS — Predecessor to Coqui-AI/TTS; foundational work on PyTorch-based speech synthesis that influenced Piper's model training pipeline
  • home-assistant/core — Primary consumer of Piper; integrated as the native local TTS engine for Home Assistant voice automations and voice assistants
  • OHF-Voice/piper1-gpl — Active development fork of this repository; all new features, bug fixes, and community contributions go here; this repo is stable-state mirror
  • espnet/espnet — End-to-end speech processing toolkit that shares training methodology and phoneme processing concepts with Piper's training pipeline

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add language-specific test validation workflow for test_sentences/*.jsonl files

The repo contains 20+ language test sentence files (test_en-us.jsonl, test_fr.jsonl, etc.) in etc/test_sentences/ but no visible CI workflow to validate them. A new GitHub Action workflow could run Piper against each .jsonl test file to catch regressions in voice quality or inference across languages. This is high-value because: (1) multilingual TTS is core to Piper, (2) test sentences exist but aren't automated, (3) the .github/workflows/main.yml likely doesn't cover this, and (4) it prevents silent failures in specific language models.

  • [ ] Create .github/workflows/test-sentences.yml that iterates over etc/test_sentences/test_*.jsonl files
  • [ ] Parse each .jsonl file and call Piper's inference API for each test sentence
  • [ ] Validate output audio files are generated and have non-zero byte size
  • [ ] Run workflow on PR events to catch language model regressions early
  • [ ] Document test sentence format and how to add new language tests in TRAINING.md
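The core of the validation step in this checklist can be sketched in a few lines. Both the `{"text": ...}` schema and the pass criterion (non-empty output file) are assumptions from this artifact, since the repo does not document the test_*.jsonl format:

```python
import json
from pathlib import Path

def load_cases(jsonl_path: str) -> list[dict]:
    """Parse one JSON object per line; skips blank lines.

    The {"text": ...} field is an assumed schema, not a documented one.
    """
    cases = []
    for line in Path(jsonl_path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line:
            cases.append(json.loads(line))
    return cases

def audio_ok(path: str) -> bool:
    """The checklist's pass criterion: the output file exists and is non-empty."""
    p = Path(path)
    return p.is_file() and p.stat().st_size > 0
```

A CI job would call `load_cases` per language file, run synthesis on each case, and assert `audio_ok` on every output.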

Add dependency update validation for onnxruntime and torch versions

The repo pins onnxruntime~=1.11.0 and torch~=1.11.0 (both from 2022). These are old versions with known security patches and performance improvements in newer releases. A high-value PR would: (1) Create a test matrix in CI testing against multiple onnxruntime/torch minor versions (1.11.x, 1.12.x, 1.13.x+), (2) Identify which versions break inference, (3) Document compatibility in a COMPATIBILITY.md file. This prevents users from being stuck on old, vulnerable dependencies and helps the maintainers plan version bumps safely.

  • [ ] Create .github/workflows/dependency-matrix.yml testing torch and onnxruntime version combinations
  • [ ] Run inference tests across at least 3 minor versions of each dependency
  • [ ] Document minimum and maximum supported versions in a new COMPATIBILITY.md file
  • [ ] Update pyproject.toml or requirements.txt with tested version ranges (if less restrictive than ~=1.11.0)
  • [ ] Add a note to README.md about supported dependency versions

Add missing test sentence files for existing languages in Makefile targets

The etc/test_sentences/ directory has .txt files for ~35 languages but only ~20 corresponding .jsonl test case files (e.g., missing test_fa.jsonl, test_hi.jsonl, test_id.jsonl, test_is.jsonl, test_ka.jsonl, test_kk.jsonl, test_ml.jsonl, test_sw.jsonl, test_te.jsonl). This inconsistency suggests test data generation is manual or incomplete. A contributor could: (1) Create a Makefile target or Python script to convert all etc/test_sentences/*.txt into standardized .jsonl format, (2) Document the .jsonl schema, (3) ensure all 35 languages have matching test cases. This is valuable because it enables testing all supported languages and ensures consistency.

  • [ ] Document the expected .jsonl schema for test sentences (likely: {"text": "...", "language": "..."})
  • [ ] Create a Python script (e.g., scripts/generate_test_jsonl.py) that converts .txt files to .jsonl format
  • [ ] Add a Makefile target like 'make generate-test-jsonl' to regenerate all missing test files
  • [ ] Verify generated .jsonl files match the format of existing ones (test_en-us.jsonl, test_fr.jsonl, etc.)
  • [ ] Update TRAINING.md with instructions on adding new test sentences for a language
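The conversion step this checklist describes is small enough to sketch. The `{"text": ..., "language": ...}` schema is the guess from the checklist above, not a documented format, and `scripts/generate_test_jsonl.py` is a proposed path, not an existing file:

```python
import json
from pathlib import Path

def txt_to_jsonl(txt_path: str, language: str) -> str:
    """Convert one etc/test_sentences/{lang}.txt file (one sentence per line)
    into JSONL text, skipping blank lines. Schema is assumed, not documented."""
    src = Path(txt_path)
    records = [
        json.dumps({"text": line.strip(), "language": language}, ensure_ascii=False)
        for line in src.read_text(encoding="utf-8").splitlines()
        if line.strip()
    ]
    return "\n".join(records) + "\n"
```

Generated output should be diffed against an existing file such as test_en-us.jsonl to confirm the real schema before regenerating all languages.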

🌿Good first issues

  • Add automated test script in etc/test_sentences/ to validate synthesis output for all 40+ languages against their respective .jsonl test files; currently only text reference files exist with no validation harness
  • Document C++ API in src/cpp/ with doxygen-compatible comments and a cpp-api.md guide; training.md covers training but native C++ extension usage is undocumented for downstream projects
  • Create GitHub issue template and CONTRIBUTING.md with redirect to https://github.com/OHF-Voice/piper1-gpl to properly route bug reports and feature requests to the active development fork


📝Recent commits

  • 73c04d8 — Add test sentences (synesthesiam)
  • 6583b9c — Merge pull request #250 from csukuangfj/fix-onnx-export (synesthesiam)
  • e8b2dee — Update README.md (synesthesiam)
  • 4147f96 — Update README and voices pages (synesthesiam)
  • 90c0692 — Merge pull request #813 from PravalX/master (synesthesiam)
  • f2cde2f — Update README.md (PravalX)
  • c488ebd — Update VOICES.md (PravalX)
  • f499a3f — Update VOICES.md (synesthesiam)
  • 9b1c639 — Merge pull request #619 from mrdarrengriffin/ohf-logo (synesthesiam)
  • c0670df — Add Latvian test sentences (synesthesiam)

🔒Security observations

The Piper project has moderate security concerns primarily centered on outdated dependencies (onnxruntime and torch from 2022) that likely contain known vulnerabilities. Docker configuration lacks hardening best practices and security context enforcement. The codebase itself does not show obvious injection risks, SQL vulnerabilities, or hardcoded secrets in the visible structure. The main risks are supply chain vulnerabilities from outdated dependencies and container security configuration. Immediate action should focus on updating dependencies and implementing Docker security best practices. No critical code-level vulnerabilities were detected from the static analysis of available files.

  • High · Outdated ONNX Runtime Dependency — Dependencies/Package file (onnxruntime~=1.11.0). The project specifies onnxruntime~=1.11.0, which is significantly outdated (released in 2022). This version may contain known security vulnerabilities that have been patched in newer releases. ONNX Runtime handles model loading and execution, making it a critical component. Fix: Update to the latest stable version of onnxruntime. Run 'pip install --upgrade onnxruntime' and test compatibility. Consider using a more flexible version constraint like onnxruntime>=1.16.0 to allow security patches.
  • High · Outdated PyTorch Dependency — Dependencies/Package file (torch~=1.11.0). The project specifies torch~=1.11.0, which is outdated (released in 2022). This version may contain known vulnerabilities and lacks critical security patches available in newer releases. PyTorch is a core dependency for the TTS system. Fix: Update to torch>=2.0.0 or latest stable version compatible with your ONNX Runtime version. Test thoroughly as this is a major dependency upgrade.
  • Medium · Unrestricted File Operations in Docker Build — Dockerfile (line: COPY ./ ./). The Dockerfile uses 'COPY ./ ./', which copies the entire repository into the build container without filtering. This could inadvertently include sensitive files, build artifacts, or development secrets if present in the source tree. Fix: Use a .dockerignore file to explicitly exclude sensitive files, build artifacts, and unnecessary directories. Ensure .dockerignore contains: .git, .github, *.log, node_modules, __pycache__, .env, .env.*, secrets, etc.
  • Medium · Missing Runtime Security Hardening in Docker — Dockerfile. The Dockerfile does not include security best practices such as running as non-root user, setting security options, or using minimal base images. The build stage uses debian:bullseye which includes unnecessary packages. Fix: Add USER directive to run as non-root, use multi-stage builds with minimal base images (e.g., debian:bullseye-slim), and remove unnecessary packages. Consider using security scanning in CI/CD pipeline.
  • Medium · Development Dependency in Production — Dockerfile (apt-get install line). The Dockerfile installs build-essential, cmake, curl, and git in the build stage without removing them in final artifacts. If these are carried forward, they increase attack surface. Fix: Ensure build dependencies are not included in final runtime image. Use multi-stage Docker builds with separate final stage that only includes runtime requirements.
  • Low · Lack of Dependency Pinning — Dependencies/Package file. Dependencies use version ranges (~=) rather than exact pins. While this allows patch updates, it makes builds non-deterministic and harder to reproduce exactly. Fix: Generate and maintain a requirements.lock or poetry.lock file with exact pinned versions for all transitive dependencies. Use tools like pip-compile or Poetry to manage this.
  • Low · Missing Security Policy Documentation — Repository root. No SECURITY.md or security policy is evident in the repository structure. This makes it unclear how to report security vulnerabilities responsibly. Fix: Create a SECURITY.md file documenting how to report security vulnerabilities responsibly, security update process, and supported versions.

LLM-derived; treat as a starting point, not a security audit.


Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.
