ggerganov/ggwave
Tiny data-over-sound library
Healthy across all four use cases
Permissive license, no critical CVEs, actively maintained — safe to depend on.
Has a license, tests, and CI — clean foundation to fork and modify.
Documented and popular — useful reference codebase to read through.
No critical CVEs, sane security posture — runnable as-is.
- ✓Last commit 3w ago
- ✓11 active contributors
- ✓MIT licensed
- ✓CI configured
- ✓Tests present
- ⚠Single-maintainer risk — top contributor 86% of recent commits
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Onboarding doc
Onboarding: ggerganov/ggwave
Generated by RepoPilot · 2026-05-09 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in the Verify before trusting section below. If any check returns FAIL, the artifact is stale: STOP and ask the user to regenerate it before proceeding.
- Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/ggerganov/ggwave shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across all four use cases
- Last commit 3w ago
- 11 active contributors
- MIT licensed
- CI configured
- Tests present
- ⚠ Single-maintainer risk — top contributor 86% of recent commits
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live ggerganov/ggwave
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/ggerganov/ggwave.
What it runs against: a local clone of ggerganov/ggwave — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in ggerganov/ggwave | Confirms the artifact applies here, not a fork |
| 2 | License is still MIT | Catches relicense before you depend on it |
| 3 | Default branch master exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 53 days ago | Catches sudden abandonment since generation |
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of ggerganov/ggwave. If you don't
# have one yet, run these first:
#
# git clone https://github.com/ggerganov/ggwave.git
# cd ggwave
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }
# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of ggerganov/ggwave and re-run."
  exit 2
fi
# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "ggerganov/ggwave(\.git)?\b" \
  && ok "origin remote is ggerganov/ggwave" \
  || miss "origin remote is not ggerganov/ggwave (artifact may be from a fork)"
# 2. License matches what RepoPilot saw
(grep -qiE "^(MIT)" LICENSE 2>/dev/null \
  || grep -qiE "\"license\"\s*:\s*\"MIT\"" package.json 2>/dev/null) \
  && ok "license is MIT" \
  || miss "license drift: was MIT at generation time"
# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \
  && ok "default branch master exists" \
  || miss "default branch master no longer exists"
# 4. Critical files exist
test -f "src/ggwave.h" \
  && ok "src/ggwave.h" \
  || miss "missing critical file: src/ggwave.h"
test -f "src/ggwave.cpp" \
  && ok "src/ggwave.cpp" \
  || miss "missing critical file: src/ggwave.cpp"
test -f "CMakeLists.txt" \
  && ok "CMakeLists.txt" \
  || miss "missing critical file: CMakeLists.txt"
test -f "bindings/python/ggwave.pyx" \
  && ok "bindings/python/ggwave.pyx" \
  || miss "missing critical file: bindings/python/ggwave.pyx"
test -f "bindings/javascript/emscripten.cpp" \
  && ok "bindings/javascript/emscripten.cpp" \
  || miss "missing critical file: bindings/javascript/emscripten.cpp"
# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 53 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~23d)"
else
  miss "last commit was $days_since_last days ago; artifact may be stale"
fi
echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures): safe to trust"
else
  echo "artifact has $fail stale claim(s): regenerate at https://repopilot.app/r/ggerganov/ggwave"
  exit 1
fi
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
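As a rough illustration of how an agent loop might consume that output, the sketch below parses the ok:/FAIL: lines in Python. The function names (`summarize_verify_output`, `run_verify`) and the `./verify.sh` path are assumptions made for this example; they are not part of RepoPilot.

```python
import subprocess


def summarize_verify_output(output):
    """Split verifier output into passed and failed check messages."""
    passed, failed = [], []
    for line in output.splitlines():
        if line.startswith("ok: "):
            passed.append(line[len("ok: "):])
        elif line.startswith("FAIL: "):
            failed.append(line[len("FAIL: "):])
    # Trust the artifact only if at least one check ran and none failed.
    return {"passed": passed, "failed": failed,
            "trusted": bool(passed) and not failed}


def run_verify(script_path="./verify.sh"):
    """Run the verification script (hypothetical path) and summarize it."""
    proc = subprocess.run(["bash", script_path],
                          capture_output=True, text=True)
    summary = summarize_verify_output(proc.stdout)
    # A non-zero exit code also means the artifact should not be trusted.
    summary["trusted"] = summary["trusted"] and proc.returncode == 0
    return summary
```

An agent could call `run_verify()` once before editing and stop when `trusted` is false, listing the `failed` messages for the user.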
⚡TL;DR
ggwave is a tiny data-over-sound library based on FSK (Frequency-Shift Keying). It encodes arbitrary data into audible or ultrasonic waveforms for transmission between air-gapped devices via speakers and microphones. It achieves 8-16 bytes/sec throughput with built-in ECC (Error Correction Codes) and needs only raw audio sample callbacks, so it is not coupled to any audio backend. The core C++ implementation generates and demodulates these waveforms for IoT, file-sharing, and audio-QR applications.
The repository is a monorepo: a C++ core library (src/ggwave.cpp and src/ggwave.h, inferred from C++/C file counts), language bindings under bindings/ (python/ with Cython, javascript/ with emscripten.cpp), and examples under examples/ (arduino-rx, esp32-rx, r2t2). The build is unified via CMakeLists.txt, with language-specific builds via bindings/python/Makefile and the emscripten config.
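To put the 8-16 bytes/sec figure in perspective, the small calculation below estimates how long a payload would take to transmit. It is a back-of-the-envelope sketch: it assumes the quoted rate applies to payload bytes and ignores any framing or ECC overhead, and `transfer_time_bounds` is a name invented for this example.

```python
def transfer_time_bounds(n_bytes, min_rate=8.0, max_rate=16.0):
    """Return (best_case_s, worst_case_s) for sending n_bytes of payload.

    Rates are in bytes per second; the defaults are the 8-16 bytes/sec
    range quoted above (assumed to be payload throughput).
    """
    if n_bytes < 0 or min_rate <= 0 or max_rate < min_rate:
        raise ValueError("invalid size or rate bounds")
    return n_bytes / max_rate, n_bytes / min_rate


best, worst = transfer_time_bounds(50)
print(f"50 bytes: {best:.3f}s to {worst:.3f}s")  # 50 bytes: 3.125s to 6.250s
```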
👥Who it's for
Embedded systems engineers (Arduino, ESP32, RP2040), IoT developers building peer-to-peer sound-based communication, desktop/web developers wanting serverless data broadcast via audio, and security researchers exploring acoustic channels. Also used by authors of wave-share and similar sound-transmission tools who need a lightweight modulation engine.
🌱Maturity & risk
Actively developed and production-ready: versioned at 0.4.3 on npm/PyPI, GitHub Actions CI pipeline present (build.yml), Debian packaging configured, and multiple real-world examples (esp32-rx, arduino-tx, r2t2). Maintained by Georgi Gerganov with an established ecosystem of companion projects. The verdict above records the last commit at about 3 weeks before analysis, and the automated builds and multi-platform bindings (Python, JavaScript, C/C++) are consistent with active maintenance.
Single-maintainer project (Georgi Gerganov), which increases long-term continuity risk. The dependency footprint appears minimal (mostly native code, with no lock file visible in the file list), but the JavaScript and Python bindings depend on emscripten and Cython respectively. Test coverage is unclear: bindings/python/test.py exists, but no comprehensive CI test output is visible. Watch for breaking changes in protocol versions when upgrading across minor releases.
Active areas of work
Not determinable from the file list alone, but CHANGELOG.md, active npm/PyPI publishing, and multiple example projects suggest ongoing development. The recent commits listed below are mostly packaging, CI, and bindings fixes around the v0.4.3 release. The existence of bindings/javascript/ and bindings/python/ with version-templated files (setup-tmpl.py, package-tmpl.json) indicates version management workflows are in place.
🚀Get running
git clone https://github.com/ggerganov/ggwave.git
cd ggwave
mkdir build && cd build
cmake ..
make
For Python: cd bindings/python && python setup.py install. For JavaScript: npm install ggwave or build from source via emscripten.
Daily commands:
For library: build with CMake as above. For examples: they are built as part of the root CMake build; run the resulting binaries from the build directory (e.g., ggwave-cli, built from examples/ggwave-cli/main.cpp). For Python: python bindings/python/test.py. For JavaScript: load bindings/javascript/ggwave.js in browser or Node.js after emscripten build.
🗺️Map of the codebase
- src/ggwave.h: Core public API header defining the main ggwave interface, encoding/decoding functions, and protocol parameters that all bindings depend on
- src/ggwave.cpp: Primary implementation of data-over-sound encoding/decoding using FSK modulation and error correction; the computational heart of the library
- CMakeLists.txt: Root build configuration defining library targets, optional dependencies (SDL2, FFT), and binding generation for Python/JavaScript
- bindings/python/ggwave.pyx: Cython bridge exposing C++ core to Python, enabling pip-installable package and scientific Python integration
- bindings/javascript/emscripten.cpp: WebAssembly entry point compiling C++ core to JavaScript/browser runtime via Emscripten
- examples/ggwave-common.h: Shared example utilities (audio I/O abstractions, CLI parsing) used by multiple example applications
- README.md: Project overview documenting the data-over-sound concept, supported protocols, API usage, and example gallery
🛠️How to make changes
Add Support for a New Modulation Protocol
- Define new protocol constants and parameters in src/ggwave.h (frequencies, payload size, ECC scheme) (src/ggwave.h)
- Implement encoding logic (modulation, framing) in src/ggwave.cpp, adding protocol case to ggwave_encode() (src/ggwave.cpp)
- Implement decoding logic (demodulation, frame detection) in src/ggwave.cpp, adding protocol case to ggwave_decode() (src/ggwave.cpp)
- Add test sketch using existing examples/ggwave-cli/main.cpp or examples/ggwave-rx/main.cpp to validate encode/decode round-trip (examples/ggwave-cli/main.cpp)
- Update README.md with protocol documentation and code snippets (README.md)
Create a New Hardware Example (Microcontroller)
- Create directory examples/PLATFORM-DEVICE/ and initialize CMakeLists.txt and README.md (examples/CMakeLists.txt)
- Write firmware sketch (e.g., .ino for Arduino) using ggwave.h; reference existing examples/arduino-tx/arduino-tx.ino for audio I/O patterns (examples/arduino-tx/arduino-tx.ino)
- Include fritzing circuit diagram and pin configuration documentation (examples/arduino-tx/fritzing-sketch.fzz)
- Update examples/CMakeLists.txt to add_subdirectory() for new platform (examples/CMakeLists.txt)
Add a Python/NumPy Integration Module
- Create new .pyx file in bindings/python/ (e.g., ggwave_advanced.pyx) and declare C++ bindings in corresponding .pxd (bindings/python/ggwave.pyx)
- Implement Cython functions exposing raw FFT output or intermediate buffers as NumPy arrays (bindings/python/ggwave.pyx)
- Update bindings/python/setup.py to include new extension in ext_modules list (bindings/python/setup.py)
- Add example script in examples/ggwave-py/ demonstrating NumPy array manipulation (examples/ggwave-py/send.py)
Deploy Web Demo with Custom Protocol Parameters
- Create new HTML template in examples/ggwave-js/ referencing bindings/javascript/ggwave.js (examples/ggwave-js/index-tmpl.html)
- Initialize emscripten build in bindings/javascript/CMakeLists.txt, configuring compilation flags for target protocol (bindings/javascript/CMakeLists.txt)
- Invoke CMake with -DEMSCRIPTEN=1 flag to generate WASM artifacts from src/ggwave.cpp (bindings/javascript/emscripten.cpp)
- Host generated .wasm and .js files on web server; update CMakeLists.txt post-build copy rules if needed (CMakeLists.txt)
🔧Why these technologies
- C++ core (src/ggwave.cpp) — Performance-critical DSP (FFT, modulation) requires speed; C++ is efficient for real-time audio and enables portable compilation to embedded and WebAssembly targets
- Cython (bindings/python/) — Bridges C++ and Python with minimal overhead, allowing easy scientific computing integration (NumPy, SciPy) while reusing the optimized C++ implementation
- Emscripten (bindings/javascript/) — Compiles C++ directly to WebAssembly, enabling zero-refactor deployment in browsers without JavaScript rewrite of DSP logic
- CMake — Unified cross-platform build (Linux, macOS, Windows, Arduino, ESP32, WASM) with conditional dependency detection (FFT library, SDL2)
- FSK modulation + Reed-Solomon ECC — FSK is robust to frequency-shift channel impairments (background noise, Doppler); Reed-Solomon corrects burst errors from quantization and fading
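To make the FSK idea concrete, here is a toy round trip in pure Python: each 4-bit nibble is mapped to one of 16 tones, and the receiver picks the tone with the most energy in each symbol window. This is only a sketch of the general technique, not ggwave's actual protocol; the sample rate, symbol length, base tone, one-tone-at-a-time framing, and the function names are all assumptions chosen for illustration, and no error correction is applied.

```python
import cmath
import math

SAMPLE_RATE = 48000  # Hz (assumed for this sketch)
SYMBOL_LEN = 1024    # samples per tone (assumed)
BASE_BIN = 40        # first tone on DFT bin 40 = 40 * 48000 / 1024 = 1875 Hz


def modulate(data):
    """Encode bytes as a sequence of tones, one tone per 4-bit nibble."""
    samples = []
    for byte in data:
        for nibble in (byte >> 4, byte & 0x0F):
            k = BASE_BIN + nibble  # nibble value selects the tone
            samples.extend(math.sin(2 * math.pi * k * n / SYMBOL_LEN)
                           for n in range(SYMBOL_LEN))
    return samples


def bin_magnitude(frame, k):
    """Magnitude of the single DFT bin k over one symbol window."""
    acc = sum(x * cmath.exp(-2j * math.pi * k * n / SYMBOL_LEN)
              for n, x in enumerate(frame))
    return abs(acc)


def demodulate(samples):
    """Recover bytes by choosing the strongest of the 16 candidate tones."""
    nibbles = []
    for start in range(0, len(samples), SYMBOL_LEN):
        frame = samples[start:start + SYMBOL_LEN]
        nibbles.append(max(range(16),
                           key=lambda v: bin_magnitude(frame, BASE_BIN + v)))
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))
```

Because every tone completes a whole number of cycles per window, the candidate tones are orthogonal and `demodulate(modulate(b"hi"))` returns `b"hi"` on a clean signal. A real channel adds noise, echo, and timing drift, which is where an ECC layer earns its keep.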
⚖️Trade-offs already made
- Single monolithic C++ core shared across all language bindings
  - Why: Avoids maintaining duplicate DSP implementations and ensures algorithm consistency across platforms
  - Consequence: Changes to core algorithm require rebuilding all bindings; adds build complexity but ensures code stays in sync
- FSK with a relatively low, audible carrier frequency (~2.5 kHz) for the audible protocols; the TL;DR notes that ultrasonic waveforms are also supported
  - Why: Makes data-over-sound audible for demos and testing; improves compatibility with consumer-grade microphones and speakers
  - Consequence: Audible transmissions are more susceptible to ambient noise and speech interference; not suitable for high-density applications (e.g., many devices in one room)
- Payload sizes tailored to specific protocols rather than unified variable-length framing
  - Why: Optimizes frame efficiency and error correction overhead for each use case (e.g., QR codes, button presses, file transfer)
  - Consequence: Developers must choose protocol upfront; cannot dynamically adjust payload if application needs change
- No built-in network retransmission or ACK protocol
  - Why: Keeps library minimal and leaves transport-layer concerns to application code
  - Consequence: Applications must implement their own timeout/retry logic; complicates multi-packet file transfers
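Since timeout/retry logic is left to the application, a minimal stop-and-wait loop might look like the sketch below. Everything here is hypothetical: `send` and `wait_for_ack` stand in for whatever code plays the encoded waveform and listens for a reply, and the attempt count and timeout are arbitrary.

```python
import time


def send_with_retry(send, wait_for_ack, payload,
                    max_attempts=3, timeout_s=2.0, sleep=time.sleep):
    """Transmit payload until it is acknowledged or attempts run out.

    send(payload) is assumed to play the encoded sound; wait_for_ack(timeout_s)
    is assumed to return True if an acknowledgement was heard in time.
    Returns the number of attempts used.
    """
    for attempt in range(1, max_attempts + 1):
        send(payload)
        if wait_for_ack(timeout_s):
            return attempt
        sleep(0.1 * attempt)  # brief, growing pause before retransmitting
    raise TimeoutError(f"no acknowledgement after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the function easy to test without real delays. For multi-packet transfers, the payload would also need a sequence number so the receiver can discard duplicates.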
🚫Non-goals (don't propose these)
- Does not provide built-in network-level acknowledgment or retransmission; applications must implement their own reliability protocol
- Not designed for high-capacity data transfer (limited to ~50 bytes per few seconds) or real-time bidirectional communication
- Does not handle user authentication or encrypted payloads; security is delegated to application layer
- Linux-only support not guaranteed; platform coverage is desktop (Windows/macOS/Linux), embedded (Arduino/ESP32), and browser (WebAssembly)
🪤Traps & gotchas
- Emscripten dependency: the JavaScript bindings require the emscripten toolchain to be installed; it is not bundled.
- Python Cython: the Python bindings need the Cython preprocessor; raw .pyx files won't work without a compilation step.
- Audio format assumptions: the core library expects raw float or int16 samples at a specific sample rate (likely 48 kHz based on IoT examples); a mismatch silently degrades decode success.
- Arduino memory: the esp32-rx and arduino-rx examples may fail on boards with <32KB RAM; no explicit constraints are documented.
- CMake versions: uses modern CMake (3.x+); older systems may have compatibility issues.
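Because a sample-format mismatch fails silently, it can help to convert explicitly at the boundary between the audio backend and the library. The helpers below are a standard-library sketch of converting between little-endian int16 PCM and float32 buffers. Which format a given ggwave configuration expects is not confirmed by this analysis, and the function names are invented for the example.

```python
import struct


def int16_to_float32(pcm):
    """Convert little-endian int16 PCM bytes to float32 bytes in [-1.0, 1.0)."""
    if len(pcm) % 2:
        raise ValueError("int16 PCM must have an even number of bytes")
    count = len(pcm) // 2
    ints = struct.unpack(f"<{count}h", pcm)
    return struct.pack(f"<{count}f", *(i / 32768.0 for i in ints))


def float32_to_int16(pcm):
    """Convert little-endian float32 bytes to int16 PCM bytes, clipping."""
    if len(pcm) % 4:
        raise ValueError("float32 PCM length must be a multiple of 4 bytes")
    count = len(pcm) // 4
    floats = struct.unpack(f"<{count}f", pcm)
    # Scale to the int16 range and clip values that would overflow.
    clipped = (max(-32768, min(32767, round(x * 32768.0))) for x in floats)
    return struct.pack(f"<{count}h", *clipped)
```

The sample rate needs the same care: resampling is a separate step that these helpers do not perform.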
💡Concepts to learn
- Frequency-Shift Keying (FSK) — Core modulation technique used by ggwave to encode binary data as audio frequencies; understanding FSK is essential to grasping how payload becomes sound and vice versa
- Error Correction Codes (ECC) — Built into ggwave's protocol to recover data from corrupted/noisy audio; critical for reliable decoding in real-world environments
- Goertzel Algorithm — Likely used in ggwave's demodulator to efficiently detect specific frequencies in incoming audio samples; common for FSK tone detection
- Callback-based Audio I/O — ggwave decouples itself from audio backend by requiring user-supplied callbacks for sample queuing; enables portability across SDL2, ALSA, PulseAudio, browser APIs
- Emscripten/WebAssembly — ggwave compiles C++ core to WASM via emscripten (bindings/javascript/emscripten.cpp); understanding this toolchain is required for browser-based development
- Cython Extension Modules — Python bindings (bindings/python/ggwave.pyx) bridge C++ and Python via Cython; knowledge of .pyx files and memory management is needed for Python binding maintenance
- Air-gapped Communication / Side-channel Audio — ggwave's fundamental use case—transmitting data across isolated systems via acoustic channel; relevant to security research and IoT deployments without network
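For readers new to the Goertzel algorithm listed above, here is a compact reference implementation that measures the power of one target frequency in a block of samples. It illustrates the textbook algorithm only. Whether ggwave uses Goertzel is not confirmed here (the analysis says "likely", and other sections mention FFT), and `goertzel_power` is a name chosen for this example.

```python
import math


def goertzel_power(samples, sample_rate, target_hz):
    """Return the squared DFT magnitude at the bin nearest target_hz."""
    n = len(samples)
    k = round(n * target_hz / sample_rate)  # nearest DFT bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        # Second-order recurrence: one multiply and two adds per sample.
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


# A 1 kHz sine sampled at 48 kHz: strong at 1 kHz, near zero at 2 kHz.
block = [math.sin(2 * math.pi * 1000 * i / 48000) for i in range(480)]
on_target = goertzel_power(block, 48000, 1000)
off_target = goertzel_power(block, 48000, 2000)
```

Goertzel costs O(N) per frequency, so it tends to beat a full FFT only when a handful of tones are of interest.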
🔗Related repos
- ggerganov/wave-share: Direct consumer of ggwave; uses this library to implement serverless file-sharing over sound, the primary real-world use case
- antirez/freakwan: LoRa-based messaging project; a comparable problem space (device-to-device communication without network infrastructure) that uses radio rather than sound
- openframeworks/openframeworks: Commonly paired with ggwave bindings in creative audio projects; ecosystem integration point
- ggerganov/ggml: Sister library from same author; shared build infrastructure and CMake patterns may inform ggwave contributions
- raspberrypi/pico-sdk: Target platform for RP2040 examples (rp2040-rx); understanding Pico firmware integration helps embedded contributions
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add CI workflow for Python bindings package distribution
The repo has Python bindings in bindings/python/ with setup.py and debian packaging, but .github/workflows/build.yml likely doesn't include automated Python package building and publishing to PyPI. This is critical since the README shows a PyPI badge indicating users expect packaged releases. A new contributor could add a workflow to build wheels, test with bindings/python/test.py, and publish to PyPI on releases.
- [ ] Examine current .github/workflows/build.yml to identify gaps in Python packaging
- [ ] Create new workflow file .github/workflows/python-publish.yml
- [ ] Add steps to: install build dependencies, run bindings/python/test.py, build wheels using setup.py, publish to PyPI on version tags
- [ ] Document Python versioning strategy in CONTRIBUTING.md or similar
Implement comprehensive test coverage for JavaScript bindings
The bindings/javascript/ directory exists with ggwave.js and emscripten.cpp, but package.json shows 'test': 'echo "todo: add tests"'. This is a blocker for CI validation. A contributor should create actual unit tests covering the JS API (encode/decode functions) and add them to the build workflow.
- [ ] Create bindings/javascript/test/ directory with test suite (using jest or similar)
- [ ] Write tests covering core ggwave.js exported functions (initialization, encoding, decoding)
- [ ] Update bindings/javascript/package.json test script to run the test suite
- [ ] Add test execution step to .github/workflows/build.yml for JavaScript bindings
Add CI workflow for Arduino examples validation
The repo contains multiple Arduino examples (examples/arduino-rx/, examples/arduino-tx/, examples/arduino-rx-web/) but no automated build validation. These examples should be compiled against Arduino toolchain to catch regressions early. A contributor could add Arduino CLI integration to the CI pipeline.
- [ ] Add Arduino CLI setup step in .github/workflows/build.yml
- [ ] Create Arduino sketch compilation tests for: examples/arduino-tx/arduino-tx.ino, examples/arduino-rx/arduino-rx.ino
- [ ] Configure board settings (e.g., Arduino Uno or similar) and validate compilation without errors
- [ ] Document tested Arduino board versions in examples/README or CI configuration
🌿Good first issues
- Add unit tests for modulation/demodulation in src/ (test.cpp) to exercise edge cases in FSK encoding with different payload sizes and noise conditions—currently no C++ test suite visible in file list.
- Document protocol parameters (carrier frequency, baud rate, ECC scheme details) in a PROTOCOL.md or add inline comments in ggwave.cpp explaining the FSK format for reverse-engineering or alternative implementations.
- Create a simple web demo at bindings/javascript/ showing real-time encode/decode with visual spectrum display (similar to wave-share) to reduce friction for browser-based users discovering the library.
⭐Top contributors
- @ggerganov — 86 commits
- @ioerror — 4 commits
- @guilt — 2 commits
- @strikereureka1 — 1 commit
- @denizsincar29 — 1 commit
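The single-maintainer warning is a commit-share calculation, sketched below using only the five contributors listed above. The helper name is invented for this example. Note that the five listed counts alone give a higher share than the 86% headline, which would be consistent with the headline being computed over a larger sample that includes contributors not shown here (an inference, not something the analysis states).

```python
def top_contributor_share(commit_counts):
    """Return (name, share) for the contributor with the most commits."""
    total = sum(commit_counts.values())
    if total == 0:
        raise ValueError("no commits to analyse")
    name, top = max(commit_counts.items(), key=lambda kv: kv[1])
    return name, top / total


# Counts copied from the list above; contributors not listed are omitted.
listed = {"ggerganov": 86, "ioerror": 4, "guilt": 2,
          "strikereureka1": 1, "denizsincar29": 1}
name, share = top_contributor_share(listed)
print(f"{name}: {share:.1%}")  # ggerganov: 91.5%
```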
📝Recent commits
- 060aec7 emscripten: Emscripten fixup (#177) (ioerror)
- fa47735 packages : Add 0.4.3 to changelog to fix .deb builds (#174) (ioerror)
- a38e38b ggwave v0.4.3 (ggerganov)
- 8f02c41 readme : add article link (ggerganov)
- 3b877d0 cmake : bump version (#167) (strikereureka1)
- bef9afb cython : fix warning (#148) (ioerror)
- cf1af4d ci : update + packages (#146) (ioerror)
- 2bf3d67 python : accept bytes as input to encode (#127) (denizsincar29)
- e2b650f python : add local file saving example (#123) (demianzenkov)
- 69f809b readme : add a Kotlin multiplatform example (#122) (wooram-yang)
🔒Security observations
ggwave demonstrates moderate security posture for a data-over-sound library. The main concerns are related to web bindings lacking CSP headers, minimal test coverage for security validation, and potential input validation issues at C/Python boundaries in bindings. The core library appears to be a low-level audio processing utility with limited attack surface. No hardcoded credentials, SQL injection risks, or critical dependency vulnerabilities were identified based on the available structure. Recommendations focus on securing the language bindings and web interfaces, implementing comprehensive testing, and establishing a vulnerability disclosure policy.
- Medium · Missing Security Headers in Web Examples
  - Location: bindings/javascript/ggwave.js, examples/buttons/index-tmpl.html, examples/arduino-rx-web/index-tmpl.html
  - Finding: The JavaScript/Web bindings (bindings/javascript/) and web examples lack explicit security headers and Content Security Policy (CSP) definitions. The HTML template files (index-tmpl.html) may be vulnerable to XSS attacks if user input is not properly sanitized.
  - Fix: Implement Content Security Policy headers, ensure all user input is properly escaped, and validate/sanitize any dynamic content before rendering in the DOM.
- Medium · Incomplete Test Coverage
  - Location: bindings/javascript/package.json
  - Finding: The npm package.json indicates a placeholder test script ('echo "todo: add tests" && exit 0'), suggesting minimal or no automated security testing in the JavaScript bindings.
  - Fix: Implement comprehensive unit and integration tests, including security-focused tests for input validation and XSS prevention.
- Low · Python Bindings May Lack Input Validation
  - Location: bindings/python/ggwave.pyx, bindings/python/cggwave.pxd
  - Finding: The Python bindings (bindings/python/ggwave.pyx) use Cython for C extension integration. Without reviewing the actual .pyx file, potential memory safety issues or input validation gaps could exist at the C/Python boundary.
  - Fix: Ensure all inputs crossing the C/Python boundary are validated, bounds-checked, and properly type-checked. Review Cython buffer handling for buffer overflow risks.
- Low · Arduino Examples with Embedded Code
  - Location: examples/arduino-tx/arduino-tx.ino, examples/arduino-rx/arduino-rx.ino, examples/esp32-rx/esp32-rx.ino
  - Finding: Arduino examples (.ino files) may contain hardcoded configurations, WiFi credentials, or serial communication parameters that could be exposed in version control.
  - Fix: Use configuration files or environment variables for sensitive settings. Ensure no credentials are committed to the repository. Add examples showing secure credential handling.
- Low · Third-party Header Dependencies
  - Location: examples/dr_wav.h
  - Finding: The examples directory includes third-party headers (dr_wav.h) which should be verified for security vulnerabilities and kept up-to-date.
  - Fix: Regularly audit third-party dependencies and headers. Consider using a package manager or build system that tracks dependency versions and security advisories.
- Low · No Visible Security Policy or Vulnerability Disclosure
  - Location: Repository root
  - Finding: The repository lacks a SECURITY.md or similar file for responsible vulnerability disclosure, making it difficult for security researchers to report issues privately.
  - Fix: Create a SECURITY.md file with instructions for reporting security vulnerabilities responsibly, including contact information or a bug bounty program if applicable.
LLM-derived; treat as a starting point, not a security audit.
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.