bytedance/sonic
A blazingly fast JSON serializing & deserializing library
Healthy across the board
Permissive license, no critical CVEs, actively maintained; safe to depend on.
Has a license, tests, and CI — clean foundation to fork and modify.
Documented and popular — useful reference codebase to read through.
No critical CVEs, sane security posture — runnable as-is.
- ✓Last commit today
- ✓17 active contributors
- ✓Apache-2.0 licensed
- ✓CI configured
- ✓Tests present
- ⚠Concentrated ownership — top contributor handles 50% of recent commits
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Onboarding: bytedance/sonic
Generated by RepoPilot · 2026-05-09 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in "Verify before trusting" below. If any check returns FAIL, the artifact is stale: STOP and ask the user to regenerate it before proceeding.
- Treat the "AI · unverified" sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/bytedance/sonic shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across the board
- Last commit today
- 17 active contributors
- Apache-2.0 licensed
- CI configured
- Tests present
- ⚠ Concentrated ownership — top contributor handles 50% of recent commits
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live bytedance/sonic
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/bytedance/sonic.
What it runs against: a local clone of bytedance/sonic — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in bytedance/sonic | Confirms the artifact applies here, not a fork |
| 2 | License is still Apache-2.0 | Catches relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | Last commit ≤ 30 days ago | Catches sudden abandonment since generation |
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of bytedance/sonic. If you don't
# have one yet, run these first:
#
# git clone https://github.com/bytedance/sonic.git
# cd sonic
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }
# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
echo "FAIL: not inside a git repository. cd into your clone of bytedance/sonic and re-run."
exit 2
fi
# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "bytedance/sonic(\.git)?\b" \
  && ok "origin remote is bytedance/sonic" \
  || miss "origin remote is not bytedance/sonic (artifact may be from a fork)"
# 2. License matches what RepoPilot saw
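# The stock Apache-2.0 LICENSE file opens with "Apache License" / "Version 2.0"
# rather than the SPDX id, so accept either form.
(grep -qiE "^(Apache-2\.0)" LICENSE 2>/dev/null \
  || { grep -qiE "^[[:space:]]*Apache License" LICENSE 2>/dev/null \
       && grep -qiE "^[[:space:]]*Version 2\.0" LICENSE 2>/dev/null; } \
  || grep -qiE "\"license\"[[:space:]]*:[[:space:]]*\"Apache-2\.0\"" package.json 2>/dev/null) \
  && ok "license is Apache-2.0" \
  || miss "license drift: was Apache-2.0 at generation time"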
# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
  && ok "default branch main exists" \
  || miss "default branch main no longer exists"
# 4. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 30 ]; then
ok "last commit was $days_since_last days ago (artifact saw ~0d)"
else
miss "last commit was $days_since_last days ago — artifact may be stale"
fi
echo
if [ "$fail" -eq 0 ]; then
echo "artifact verified (0 failures) — safe to trust"
else
echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/bytedance/sonic"
exit 1
fi
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
⚡TL;DR
Sonic is a high-performance JSON serialization/deserialization library for Go that achieves 2–8× throughput gains over the standard library by combining JIT compilation and SIMD vectorization. It provides both zero-copy AST parsing (in ast/) and binding-based marshaling (in decoder/, encoder/) without requiring code generation, making it a drop-in replacement for encoding/json. The structure is flat-ish, with three main areas: api.go (top-level public API), the ast/ package (AST-based parsing and node manipulation), and the decoder/ package (streaming deserialization with native/compat backends). Assembly kernels (*.s files) contain vectorized hot paths. Tests are co-located (*_test.go), and test fixtures live in per-package testdata_test.go files.
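To make the "drop-in replacement" idea concrete, here is a minimal sketch that routes all JSON work through a pair of function values with the encoding/json shapes. It runs against the standard library only. The Codec type, stdCodec, User, and RoundTrip are hypothetical names for illustration; the assumption is that, once github.com/bytedance/sonic is in go.mod, sonic.Marshal and sonic.Unmarshal can be assigned to the same fields without changing call sites.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Codec (hypothetical) captures the two function shapes that
// encoding/json exposes and that a drop-in replacement must match.
type Codec struct {
	Marshal   func(v any) ([]byte, error)
	Unmarshal func(data []byte, v any) error
}

// stdCodec uses the standard library. Assumption: with Sonic imported,
// Codec{Marshal: sonic.Marshal, Unmarshal: sonic.Unmarshal} would slot in here.
var stdCodec = Codec{Marshal: json.Marshal, Unmarshal: json.Unmarshal}

// User is an illustrative payload type.
type User struct {
	Name string `json:"name"`
	Age  int    `json:"age"`
}

// RoundTrip encodes u and decodes it back through the given codec.
func RoundTrip(c Codec, u User) (User, []byte, error) {
	raw, err := c.Marshal(u)
	if err != nil {
		return User{}, nil, err
	}
	var out User
	if err := c.Unmarshal(raw, &out); err != nil {
		return User{}, nil, err
	}
	return out, raw, nil
}

func main() {
	out, raw, err := RoundTrip(stdCodec, User{Name: "Ada", Age: 36})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(raw), out.Name, out.Age)
}
```

Keeping the codec behind one value like this makes it easy to benchmark both implementations on your own payloads before committing to either.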
👥Who it's for
Go backend engineers and systems developers who process large volumes of JSON (APIs, data pipelines, microservices) and need dramatically faster throughput without sacrificing API compatibility or adding build-time code generation steps.
🌱Maturity & risk
Production-ready and actively developed. The codebase shows mature engineering: comprehensive test suites across *_test.go files, multi-platform CI coverage (Windows, ARM64, x86 via .github/workflows/), support for Go 1.18–1.26, and clear documentation. Recent work includes fuzzing, compatibility testing, and performance benchmarking, indicating ongoing active maintenance.
Low risk for adoption but moderate complexity overhead. Dependencies on low-level libraries (bytedance/gopkg, cloudwego/base64x, klauspost/cpuid, golang.org/x/arch) for SIMD/CPU detection add some transitive dependency management burden. Known issue: Go 1.24.0 requires -ldflags="-checklinkname=0" workaround. Assembly code in ast/asm.s and decoder/ means platform-specific bugs could affect behavior, though CI mitigates this.
Active areas of work
Active CI/CD workflows indicate ongoing work on compatibility testing, benchmarking, fuzzing, and cross-platform support (test-arm64.yml, test-x86.yml, compatibility_test.yml). The presence of version-specific notes (Go 1.24.0 issue in README) suggests rapid iteration on Go compatibility.
🚀Get running
git clone https://github.com/bytedance/sonic.git
cd sonic
go mod download
go test ./...
Daily commands:
No server. Run tests with go test -v ./... or go test -bench . ./... for benchmarks (invoked in CI via benchmark.yml). For manual perf testing, see decoder/testdata_test.go which contains Medium/Large JSON fixtures.
🗺️Map of the codebase
- api.go: Public entry point. Defines Marshal(), Unmarshal(), NewDecoder(), NewEncoder() stubs that route to underlying implementations
- ast/parser.go: Core parsing logic that builds the AST; critical for understanding how JSON text → Node tree happens
- decoder/decoder_native.go: Hot path for deserialization with platform-specific optimizations; where most performance gains originate
- ast/asm.s: SIMD vectorized kernels (amd64/arm64); crucial for understanding speed claims—contains hand-tuned Assembly
- ast/node.go: AST node definition and manipulation—central data structure for zero-copy JSON tree access
- ast/encode.go: Serialization logic; shows how Sonic converts Go values/AST back to JSON bytes
- .github/workflows/test-x86.yml: CI pipeline definition—reveals supported Go versions, test matrix, and platform-specific build flags
🛠️How to make changes
- New decoder features: edit decoder/decoder_native.go (parsing logic) and add tests in decoder/decoder_native_test.go.
- New AST operations: add and implement visitor methods in ast/visitor.go, then test them in ast/visitor_test.go.
- SIMD optimizations: modify ast/asm.s or the Assembly in decoder/ (requires amd64/arm64 knowledge).
- API additions: start in api.go, add tests in api_test.go, and make sure the compat layer in compat.go mirrors the change.
🪤Traps & gotchas
- Assembly complexity: asm.s and the decoder Assembly require deep amd64/arm64 knowledge; changes here can silently break on specific CPU generations without proper testing.
- JIT caching: the loader package (indirect dep) caches compiled decoders; memory/safety implications are non-obvious.
- Go 1.24.0 breakage: requires explicit -ldflags="-checklinkname=0" or Go version ≥1.24.1; CI may not catch this without an explicit Go version matrix.
- CPU feature detection: fallback paths for older CPUs are less tested; verify that CPUID detection (via klauspost/cpuid) works on your target hardware.
- Testdata paths: tests reference testdata_test.go files that embed large JSON fixtures inline; modifying test JSON requires recompilation.
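The Go 1.24.0 gotcha is easy to miss in local builds, so a small guard can help. The sketch below is not part of Sonic; NeedsCheckLinknameFlag is a hypothetical helper that inspects a toolchain version string of the kind runtime.Version returns and reports whether it is exactly Go 1.24.0, the one release the text says needs the -ldflags="-checklinkname=0" workaround.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// NeedsCheckLinknameFlag (hypothetical helper) reports whether the given
// toolchain version is exactly Go 1.24.0. Per the note above, later patch
// releases (>= 1.24.1) do not need the linker flag.
func NeedsCheckLinknameFlag(version string) bool {
	// runtime.Version can carry extra space-separated detail after the
	// version token; only the first field is compared.
	fields := strings.Fields(version)
	if len(fields) == 0 {
		return false
	}
	return fields[0] == "go1.24.0"
}

func main() {
	v := runtime.Version()
	if NeedsCheckLinknameFlag(v) {
		fmt.Printf("%s: build with -ldflags=\"-checklinkname=0\" or upgrade the toolchain\n", v)
		return
	}
	fmt.Printf("%s: no checklinkname workaround needed\n", v)
}
```

A check like this could run in a pre-build step or a test; it only covers released version strings and deliberately ignores development or release-candidate builds.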
💡Concepts to learn
- JIT (Just-In-Time) Compilation — Sonic's speed comes from compiling custom parsers at runtime rather than interpreting a generic state machine; the bytedance/sonic/loader dependency manages this complexity
- SIMD (Single-Instruction-Multiple-Data) — ast/asm.s and decoder Assembly exploit CPU vector instructions to process multiple JSON bytes in parallel; critical to 2–8× speedup claims
- Zero-Copy Parsing — The AST in ast/node.go holds references into the original JSON buffer rather than copying strings/values, reducing allocations and GC pressure significantly
- Visitor Pattern — ast/visitor.go implements the visitor pattern for traversing/transforming AST nodes; fundamental to how Sonic provides tree manipulation without code generation
- CPU Feature Detection / CPUID — klauspost/cpuid is used to detect which SIMD instruction sets (AVX2, SSE4, etc.) are available at runtime; Sonic selects optimal codepaths without recompilation
- Escape Sequence Handling in JSON — Decoding \uXXXX unicode escapes and backslash sequences is a performance bottleneck; Sonic optimizes this with SIMD in decoder/asm paths
- Binding-Based Marshaling vs. Reflection — Sonic supports both reflection-free binding (compiled decoders for known types) and generic reflection; binding codepath is much faster and shown in benchmarks (Binding_Sonic vs. Generic_Sonic)
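As a generic illustration of the zero-copy idea from the list above, the sketch below records where each string lives in the input buffer instead of copying it out. This is not Sonic's node layout, which the text here does not show; Span, Text, and StringSpans are hypothetical names, and the scanner is deliberately simplified (it skips over escaped bytes without decoding them and does not validate JSON).

```go
package main

import "fmt"

// Span (hypothetical) records where a value lives inside the source
// buffer instead of holding its own copy of the bytes.
type Span struct{ Start, End int }

// Text returns the bytes the span refers to. Slicing src shares its
// backing array, so no bytes are copied.
func (s Span) Text(src []byte) []byte { return src[s.Start:s.End] }

// StringSpans returns the span of every double-quoted run in src,
// quotes excluded. Escaped bytes are skipped, not decoded.
func StringSpans(src []byte) []Span {
	var spans []Span
	for i := 0; i < len(src); i++ {
		if src[i] != '"' {
			continue
		}
		start := i + 1
		j := start
		for j < len(src) && src[j] != '"' {
			if src[j] == '\\' {
				j++ // skip the byte that follows a backslash
			}
			j++
		}
		if j >= len(src) {
			break // unterminated string: stop rather than guess
		}
		spans = append(spans, Span{Start: start, End: j})
		i = j // the loop increment moves past the closing quote
	}
	return spans
}

func main() {
	src := []byte(`["alpha","beta"]`)
	spans := StringSpans(src)
	for _, s := range spans {
		fmt.Println(string(s.Text(src)))
	}
	// Mutating the buffer is visible through the span, which shows that
	// the span aliases the input rather than owning a copy.
	src[spans[0].Start] = 'A'
	fmt.Println(string(spans[0].Text(src)))
}
```

The trade-off this exposes is the usual one for zero-copy designs: fewer allocations, but the source buffer must stay alive and unmodified for as long as the spans are in use.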
🔗Related repos
- json-iterator/go: Drop-in encoding/json replacement with reflection-based optimization; Sonic's direct performance competitor, often benchmarked against in tests
- goccy/go-json: Another high-performance, encoding/json-compatible library for Go; appears in benchmark.yml comparisons as a performance reference
- tidwall/gjson: JSON path/query library; complementary for users who want Sonic's parsing speed plus JSONPath-style navigation
- bytedance/gopkg: Bytedance's general utility library (imported as indirect dep); shared infrastructure for logging, reflection helpers, and CPU detection
- cloudwego/base64x: Vectorized base64 codec used by Sonic's decoder for fast string handling; tight integration for encoding/escaping performance
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add comprehensive ARM64 compatibility tests for SIMD operations
The repo supports ARM64 (mentioned in requirements for Go 1.20+) and has test-arm64.yml workflow, but there are no visible ARM64-specific test files in ast/ and decoder/ directories. Given that Sonic uses SIMD and JIT compilation, ARM64 NEON instructions need dedicated test coverage to ensure feature parity with AMD64. This is critical for users running on Apple Silicon or ARM servers.
- [ ] Create ast/asm_arm64_test.go with ARM64-specific SIMD edge cases (similar pattern to existing asm.s)
- [ ] Add decoder/testdata_arm64_test.go with ARM64-specific test cases for JIT compilation paths
- [ ] Create ast/parser_arm64_test.go testing ARM64-specific instruction sequences and alignment requirements
- [ ] Reference existing test patterns in ast/parser_test.go and decoder/decoder_native_test.go
Add streaming JSON parser tests for large file handling
The examples/example_stream_test.go file exists but streaming API test coverage appears minimal. Given Sonic's focus on performance with large JSON (mentioned in benchmarks), comprehensive streaming tests are needed to validate memory efficiency, buffer management, and correctness with real-world large datasets. This addresses a gap between example code and production readiness.
- [ ] Create examples/stream_large_file_test.go with test cases for >100MB JSON files from testdata
- [ ] Add ast/iterator_stream_test.go validating incremental parsing behavior and buffer reuse
- [ ] Implement memory profiling tests in encode_test.go and decode_test.go to catch streaming regressions
- [ ] Reference streaming patterns in examples/example_stream_test.go and ast/iterator.go
Add Windows-specific SIMD and linking tests with stricter validation
The repo has compatibility_test-windows.yml workflow, but Windows testing for SIMD edge cases and Go 1.24.0 linking issues (mentioned in README regarding -ldflags checklinkname=0) lacks dedicated test coverage. The current test structure doesn't validate Windows-specific loader behavior or SIMD instruction compatibility across Windows CPU variants.
- [ ] Create decoder/decoder_windows_test.go with Windows-specific loader and SIMD compatibility tests
- [ ] Add ast/asm_windows_test.go validating Windows calling conventions and SIMD register usage
- [ ] Create a windows_linkage_test.go at root level testing Go 1.24.0 compatibility and -ldflags workaround paths
- [ ] Reference sonic/loader package (v0.5.0) behavior in tests based on decoder/decoder_native.go patterns
🌿Good first issues
- Add fuzz tests for the ast/iterator.go iterator: ast/iterator_test.go currently exists, but fuzzing.yml suggests fuzz coverage is incomplete. Contribute corpus and regression tests for edge cases like deeply nested arrays and unicode escapes.
- Document the JIT compilation caching behavior in api.go and compat.go: there is no inline doc explaining when and how decoders are cached, which is critical for users debugging memory usage or goroutine behavior.
- Add cross-platform Assembly tests in ast/ for arm64 (CI targets test-arm64.yml, but no dedicated arm64 asm validation tests exist in the file list). Contribute unit tests that validate SIMD instructions under Go's test harness.
⭐Top contributors
- @liuq19: 50 commits
- @AsterDY: 32 commits
- @bimoadityar: 3 commits
- @HeRaNO: 2 commits
- @equationzhao: 1 commit
📝Recent commits
- 4ddcd08 docs: update README Go 1.26 support (#931) (equationzhao)
- 4c8f70b chore: update loader v0.5.1 (#933) (AsterDY)
- d64ddf9 opt: unify JIT funcs in single moduledata on Pretouch (#932) (AsterDY)
- 3835c03 feat:(encoder) not omit zero value for omitempty tag (#927) (AsterDY)
- c9e5b0f fix(rt): align map IndirectElem semantics across Go versions (#924) (liuq19)
- 28040bd revert: drop integer range mismatch and related tests (#922) (liuq19)
- f8ba977 fix(decoder): align jit string-tag mismatch with encoding/json (#917) (liuq19)
- 9a1c148 fix(decoder): memory corruption when decode prefilled interface (#914) (liuq19)
- 0724463 chore: use go fmt format (#913) (liuq19)
- f7c86b9 ci: add macOS ARM (Apple Silicon) runners to workflows (#911) (liuq19)
🔒Security observations
The Sonic JSON library codebase demonstrates reasonable security practices with no critical vulnerabilities identified. The primary concerns are around dependency management: some dependencies are outdated or no longer actively maintained (go-simplejson, json-iterator/go), and testing infrastructure uses an alpha version of the main library. The project includes proper CI/CD workflows (fuzzing, compatibility tests, linting) which is a positive security indicator. No hardcoded secrets, SQL injection risks, or exposed infrastructure were detected. Recommendations focus on keeping dependencies current and transitioning from alpha to stable releases.
- Medium · Outdated Dependency: json-iterator/go (external_jsonlib_test/go.mod). The dependency github.com/json-iterator/go v1.1.12 is used. This version is relatively old and may contain known vulnerabilities. The latest versions should be reviewed for security patches. Fix: Update github.com/json-iterator/go to the latest stable version and review the changelog for security fixes.
- Medium · Outdated Dependency: go-simplejson (external_jsonlib_test/go.mod). The dependency github.com/bitly/go-simplejson v0.5.1 is relatively old and may have unpatched vulnerabilities. This library is no longer actively maintained. Fix: Review the latest version of go-simplejson or consider migrating to actively maintained alternatives. Update to the latest available version.
- Low · Build Flag Recommendation in Documentation (README.md). The README mentions that Go 1.24.0 requires a specific build flag, -ldflags="-checklinkname=0", due to an upstream issue. This workaround could potentially bypass security checks. Fix: Monitor the Go issue #71672 and remove this workaround once it's resolved. Document the security implications of using checklinkname=0.
- Low · Alpha Version Dependency (external_jsonlib_test/go.mod). The external_jsonlib_test module depends on github.com/bytedance/sonic v1.11.5-alpha3, which is a pre-release/alpha version. Alpha versions may contain unstable code and unresolved security issues. Fix: Use stable releases for production testing. Only use alpha versions in isolated test environments. Switch to the latest stable release when available.
LLM-derived; treat as a starting point, not a security audit.
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.