neondatabase/neon
Neon: Serverless Postgres. We separated storage and compute to offer autoscaling, code-like database branching, and scale to zero.
Healthy across the board
- Permissive license, no critical CVEs, actively maintained — safe to depend on. (weakest axis)
- Has a license, tests, and CI — a clean foundation to fork and modify.
- Documented and popular — a useful reference codebase to read through.
- No critical CVEs, sane security posture — runnable as-is.
- ✓Last commit 6w ago
- ✓33+ active contributors
- ✓Distributed ownership (top contributor 10% of recent commits)
- ✓Apache-2.0 licensed
- ✓CI configured
- ✓Tests present
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Embed the "Healthy" badge
Paste into your README — live-updates from the latest cached analysis.
[](https://repopilot.app/r/neondatabase/neon)

Paste at the top of your README.md — renders inline like a shields.io badge.
Preview social card (1200×630)
This card auto-renders when someone shares https://repopilot.app/r/neondatabase/neon on X, Slack, or LinkedIn.
Onboarding doc
Onboarding: neondatabase/neon
Generated by RepoPilot · 2026-05-09 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in "Verify before trusting" below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
- Treat the "AI · unverified" sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/neondatabase/neon shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across the board
- Last commit 6w ago
- 33+ active contributors
- Distributed ownership (top contributor 10% of recent commits)
- Apache-2.0 licensed
- CI configured
- Tests present
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live neondatabase/neon
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/neondatabase/neon.
What it runs against: a local clone of neondatabase/neon — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in neondatabase/neon | Confirms the artifact applies here, not a fork |
| 2 | License is still Apache-2.0 | Catches relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 73 days ago | Catches sudden abandonment since generation |
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of neondatabase/neon. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/neondatabase/neon.git
#   cd neon
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of neondatabase/neon and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "neondatabase/neon(\.git)?\b" \
  && ok "origin remote is neondatabase/neon" \
  || miss "origin remote is not neondatabase/neon (artifact may be from a fork)"

# 2. License matches what RepoPilot saw. The Apache LICENSE text begins
#    "Apache License / Version 2.0" rather than the bare SPDX identifier,
#    so match either form.
(grep -qiE "Apache License|Apache-2\.0" LICENSE 2>/dev/null \
  || grep -qiE "\"license\"\s*:\s*\"Apache-2\.0\"" package.json 2>/dev/null) \
  && ok "license is Apache-2.0" \
  || miss "license drift — was Apache-2.0 at generation time"

# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
  && ok "default branch main exists" \
  || miss "default branch main no longer exists"

# 4. Critical files exist
for f in Cargo.toml pageserver/src/lib.rs proxy/src/lib.rs \
         safekeeper/src/lib.rs storage_controller/src/lib.rs; do
  test -f "$f" && ok "$f" || miss "missing critical file: $f"
done

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 73 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~43d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/neondatabase/neon"
  exit 1
fi
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
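That composition can be sketched as a small retry wrapper. In this self-contained sketch, `verify` is a stub standing in for the pasted script (here it always fails, to show the path an agent takes when the artifact is stale); a real loop would invoke `./verify.sh` and refresh the clone between attempts.

```shell
# Hypothetical agent retry loop. 'verify' stands in for ./verify.sh.
verify() { return 1; }   # stub: pretend the artifact is stale

attempts=0
until verify; do
  attempts=$((attempts + 1))
  if [ "$attempts" -ge 2 ]; then
    echo "still stale after $attempts attempts — ask the user to regenerate"
    break
  fi
  echo "verification failed (attempt $attempts) — refreshing and retrying"
  # A real loop would refresh here, e.g.:
  # git fetch origin && git reset --hard origin/main
done
```

The cap on attempts matters: without it, a permanently stale artifact would spin the agent forever.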
⚡TL;DR
Neon is a serverless PostgreSQL database platform that decouples storage and compute by replacing PostgreSQL's native storage layer with a distributed storage engine across a cluster of nodes. It enables autoscaling, instant database branching (like Git for databases), and scale-to-zero capability for PostgreSQL workloads without managing infrastructure. Modular Rust monorepo: pageserver/ contains the distributed storage backend, safekeeper/ is the redundant WAL service, compute_tools/ wraps PostgreSQL compute, proxy/ handles connection routing, and control_plane/ manages orchestration. Supporting libraries under libs/ provide shared APIs (pageserver_api, safekeeper_api, remote_storage) and low-level utilities (postgres_ffi, wal_decoder, desim for simulation testing).
👥Who it's for
Backend engineers and DevOps teams building serverless applications who want PostgreSQL compatibility without provisioning dedicated database instances, and database developers who need to branch databases for testing and CI/CD workflows like feature branches in Git.
🌱Maturity & risk
Actively developed and production-ready: the repo has 11.2M lines of Rust code, comprehensive CI/CD workflows across 60+ GitHub Actions files, dense test coverage via nextest configuration, and a commercial Neon Cloud offering. Last commits and active PR merges indicate ongoing development with regular releases.
Non-trivial operational complexity: the system requires coordinating three major stateful components (Pageserver, Safekeepers, Compute nodes) across Rust/Python/C with 40+ workspace crates. Risk factors include the experimental distributed storage implementation (new compared to PostgreSQL's proven storage), heavy dependency on cloud storage APIs (AWS S3, KMS, IAM), and potential breaking changes during serverless storage optimization.
Active areas of work
Active development on storage optimization, compute autoscaling, and reliability: multiple workflow files show benchmarking pipelines (benchbase_tpcc.yml, benchmarking.yml), CI testing harness updates (build_and_test_fully.yml), and infrastructure-as-code via Jsonnet. The presence of storage_scrubber and storage_controller crates indicates ongoing work on data consistency and tiering.
🚀Get running
git clone https://github.com/neondatabase/neon.git
cd neon
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh # Install Rust
# On Ubuntu: apt install build-essential libtool libreadline-dev zlib1g-dev flex bison libseccomp-dev libssl-dev clang pkg-config libpq-dev cmake postgresql-client protobuf-compiler libprotobuf-dev libcurl4-openssl-dev openssl python3-poetry lsof libicu-dev
echo 'See README for macOS/Fedora/Arch instructions'
cargo build --release # Build all workspace crates
Daily commands:
cargo build --release
cargo test --workspace # or: cargo nextest run (config in .config/nextest.toml)
# For local dev setup, see SUMMARY.md; typically: pageserver, safekeeper, and compute node must run as separate services
🗺️Map of the codebase
- Cargo.toml — Workspace root defining all member crates (pageserver, proxy, safekeeper, storage_controller, etc.) and shared dependencies — essential for understanding the monorepo structure.
- pageserver/src/lib.rs — Pageserver entry point and core module orchestration; implements the scalable storage backend that compute nodes depend on.
- proxy/src/lib.rs — Proxy layer handling connection routing and tenant isolation; critical for stateless compute-node access patterns.
- safekeeper/src/lib.rs — Safekeeper WAL service implementation; forms the redundant WAL layer that guarantees durability before pageserver persistence.
- storage_controller/src/lib.rs — Storage controller orchestrating the lifecycle of pageserver and safekeeper instances; manages cluster-wide storage topology.
- control_plane/src/lib.rs — Control plane API and management logic for compute-node provisioning and tenant branching features.
- libs/pageserver_api/src/lib.rs — Shared API contracts between compute, pageserver, and control plane; defines core request/response types and storage primitives.
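Rather than trusting this map, you can pull the current crate list straight out of the root manifest. A minimal sketch, run against a stand-in Cargo.toml so it is self-contained — in a real clone, point the same pipeline at the repo root's Cargo.toml (it assumes members are listed one per line, as they are in the stand-in):

```shell
# Stand-in manifest; the real root Cargo.toml has the same overall shape.
cat > /tmp/demo_cargo_toml <<'EOF'
[workspace]
members = [
    "pageserver",
    "proxy",
    "safekeeper",
]
EOF

# Print one crate per line: slice out the members array, then strip quotes.
sed -n '/^members = \[/,/^\]/p' /tmp/demo_cargo_toml \
  | grep -oE '"[^"]+"' | tr -d '"'
```

`cargo metadata` gives the same information with proper TOML parsing, at the cost of needing a full workspace resolve.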
🛠️How to make changes
Add a new storage API endpoint
- Define the request/response type in the pageserver_api crate (libs/pageserver_api/src/lib.rs)
- Implement the handler in the pageserver request-processing loop (pageserver/src/lib.rs)
- Add integration tests exercising the endpoint with compute nodes (tests/)
- Update the OpenAPI schema if exposing it as an HTTP/gRPC endpoint (pageserver/page_api)
Add a new control plane management operation
- Define the API endpoint and request model in control_plane (control_plane/src/lib.rs)
- Implement the operation handler, coordinating with storage_controller for cluster changes (storage_controller/src/lib.rs)
- Update the compute provisioning logic if the operation affects instance lifecycle (compute_tools)
- Add an e2e test in the test suite validating the full operation flow (.github/workflows/build_and_test.yml)
Add a new proxy routing rule or tenant isolation feature
- Extend the connection-parsing and tenant-extraction logic in the proxy (proxy/src/lib.rs)
- Update the routing-decision logic to apply the new rule (proxy/src/lib.rs)
- Add unit tests for the routing logic covering edge cases (proxy/tests)
- Add a load test if the change impacts connection throughput or latency (.github/workflows/proxy-benchmark.yml)
Add a new metric or observability signal
- Define the metric type and export it from libs/utils or component-specific modules (libs/utils)
- Instrument the code location where the metric should be recorded (pageserver/src/lib.rs or proxy/src/lib.rs, as applicable)
- Register the metric in the Prometheus exporter configuration (compute/etc/postgres_exporter.yml)
- Add assertions in benchmarking workflows to track regressions (.github/workflows/benchmarking.yml)
🔧Why these technologies
- Rust (entire codebase) — Memory safety and performance-critical for storage, WAL, and proxy layers; prevents data corruption bugs in distributed system
- PostgreSQL (embedded in compute) — Proven relational database engine; Neon replaced only storage layer while reusing PostgreSQL protocol and execution
- gRPC + Protocol Buffers (pageserver, safekeeper APIs) — Efficient binary protocol for high-throughput storage communication; typed contract enforcement across services
- Tokio async runtime — Handles thousands of concurrent connections in proxy and I/O-bound storage operations without thread overhead
- S3-compatible object storage — Decouples data durability from compute/pageserver availability; enables cost-effective long-term archival
⚖️Trade-offs already made
- Separated storage and compute into independent layers
  - Why: enables autoscaling compute without rebalancing data; supports code-like branching by sharing storage snapshots
  - Consequence: requires an efficient pageserver protocol and network bandwidth; adds latency for page misses vs. local PostgreSQL
- Safekeeper WAL service before pageserver durability
  - Why: guarantees commit durability immediately after replication; allows compute to continue without waiting for pageserver flush
  - Consequence: adds operational complexity; requires a consensus protocol; pageserver must replay WAL after restarts
- Stateless compute nodes (no local state)
  - Why: enables fast scale-to-zero, live migration, and horizontal scaling without consistency concerns
  - Consequence: all data must be fetched from storage; cannot exploit compute-local caching optimizations
- Proxy as connection router rather than transparent proxy
  - Why: allows fine-grained tenant isolation, per-tenant connection pooling, and dynamic load balancing
  - Consequence: the proxy is on the critical path; its latency and availability directly impact all clients
🚫Non-goals (don't propose these)
- Real-time analytics or columnar storage optimizations (OLAP-specific features)
- Multi-region replication out-of-the-box (region selection at project level, but no automatic cross-region sync)
- Binary-compatible PostgreSQL extensions running on compute (only PostgreSQL-compatible wire protocol)
- Replacing PostgreSQL query engine or planner (uses vanilla PostgreSQL for execution)
🪤Traps & gotchas
1. Multiple cooperating services: you cannot test pageserver in isolation — run safekeeper + pageserver + proxy together for realistic behavior; see SUMMARY.md for the local setup order.
2. Protobuf regeneration: changes to *.proto files require explicit code generation (check the build scripts in the affected crates).
3. WAL format coupling: pageserver and compute must agree on the PostgreSQL WAL format; postgres_ffi version mismatches cause silent corruption.
4. Cloud storage credentials: remote-storage tests need AWS/Azure credentials in the environment; local tests often require the -F integration feature flag and a local S3 (MinIO).
5. TLA+ specs in libs/desim/: distributed-algorithm changes need formal verification updates — not optional for consensus/replication logic.
6. Rust edition 2024 preview: the workspace uses an early Rust edition; some builds may require a nightly toolchain.
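Trap 2 is cheap to guard against before a build: compare the mtime of each .proto against its generated output. A hedged sketch with stand-in file names (the real generated paths depend on each crate's build script):

```shell
# Stand-in files: schema.proto (source) and generated.rs (codegen output).
dir=$(mktemp -d)
touch -t 202401010000 "$dir/generated.rs"   # pretend codegen last ran in Jan 2024
touch "$dir/schema.proto"                   # .proto edited just now

# -nt means "newer than": if the source outruns the output, regenerate.
if [ "$dir/schema.proto" -nt "$dir/generated.rs" ]; then
  echo "stale: re-run protobuf codegen before building"
fi
```

In practice, crates whose build.rs invokes the protobuf compiler handle this automatically; the check is useful where generated code is committed instead.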
🏗️Architecture
💡Concepts to learn
- Write-Ahead Logging (WAL) and Safekeepers — Neon's durability model depends on Safekeepers capturing WAL before Pageserver processes it; understanding WAL is fundamental to why Neon can decouple storage and compute without losing safety
- Layered storage and page versioning — Pageserver organizes data into immutable layers (L0/L1) with version snapshots; this enables time-travel queries and efficient tiering but is non-obvious in the code
- Tenant isolation and multi-tenancy — Each Neon project is a tenant with its own keyspace and compute node; understanding tenant routing in proxy and storage_controller is critical for security and performance isolation
- Consensus-based storage placement (Raft-like) — storage_controller uses consensus to decide where pageserver shards live; this is non-trivial distributed state management, partially validated via TLA+ specs in desim
- Page server protocol and gRPC — Compute nodes fetch pages from pageserver via gRPC (libs/pageserver_api); understanding the request/response model is essential for performance tuning and debugging
- Discrete Event Simulation (DES) — libs/desim allows testing distributed algorithms without real concurrency; Neon uses this to verify consensus and replication correctness without flakiness
- Remote storage abstraction (S3/Azure/local) — libs/remote_storage provides pluggable cloud storage backends; Pageserver offloads old layers to remote storage for durability and cost optimization, making this layer critical for scale
🔗Related repos
- postgres/postgres — Upstream PostgreSQL repository; Neon wraps and extends PostgreSQL without forking the main codebase, so understanding PG internals is essential for pageserver/compute integration
- tembo-io/tembo — Alternative Postgres-as-a-service using Kubernetes; a different approach to serverless Postgres but the same target user (engineers wanting managed Postgres without an operations burden)
- supabase/supabase — Open-source Firebase alternative built on PostgreSQL; a complementary ecosystem where Neon could serve as the storage backend for autoscaling Postgres instances
- ankane/pgvector — PostgreSQL vector extension for embedding storage; increasingly integrated with serverless Postgres offerings like Neon for AI/ML applications
- MaterializeInc/materialize — Streaming SQL engine built on persistent storage abstractions similar to Neon's layer model; related design patterns for efficient incremental computation over distributed pages
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add missing integration tests for GitHub Actions in .github/actions/
The repo has 12+ custom GitHub Actions (.github/actions/neon-branch-create, neon-project-create, etc.) but no visible test workflow validating their inputs/outputs. This is critical for a Postgres platform where infrastructure automation is core. Adding a workflow that tests action.yml schemas and basic execution would prevent silent failures in CI pipelines.
- [ ] Create .github/workflows/_test-custom-actions.yml to validate all action.yml files in .github/actions/
- [ ] Add JSON schema validation for required inputs/outputs in each action.yml
- [ ] Create integration tests for at least neon-branch-create, neon-branch-delete, neon-project-create actions
- [ ] Document expected inputs/outputs in .github/actions/*/README.md files (currently missing)
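The schema-validation item can start as nothing more than a grep over required top-level keys. A sketch against a stand-in action.yml — real files live under .github/actions/*/action.yml, and the composite `runs` shape below is an assumption about their contents:

```shell
# Stand-in action.yml with the three keys GitHub's action metadata requires.
cat > /tmp/demo_action.yml <<'EOF'
name: neon-branch-create
description: Create a Neon branch for CI
runs:
  using: composite
EOF

fails=0
for key in name description runs; do
  if grep -q "^${key}:" /tmp/demo_action.yml; then
    echo "ok: $key"
  else
    echo "FAIL: missing $key"
    fails=$((fails + 1))
  fi
done
```

A real workflow would loop this over every action.yml and exit non-zero when `$fails` is positive, which is enough to block a broken action from merging.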
Add comprehensive test suite for Cargo workspace dependencies consistency
With 40+ workspace members and centralized dependency management in [workspace.dependencies], there's no visible linting to catch version mismatches or unused dependencies. The repo uses hakari (in .config/hakari.toml) but no CI workflow enforces it. A new workflow checking dependency graph consistency would prevent supply chain issues in this distributed storage system.
- [ ] Create .github/workflows/_check-cargo-dependencies.yml
- [ ] Add cargo-deny policy file check (referenced in cargo-deny.yml but incomplete configuration)
- [ ] Run cargo-hack to detect unused dependencies across all workspace members
- [ ] Validate hakari workspace-hack is in sync with actual dependencies
- [ ] Add step to check for version mismatches between workspace.dependencies and Cargo.tomls
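The version-mismatch check in the last item can be prototyped as a diff of pinned versions. A toy sketch on stand-in dependency lists — a real check would iterate every workspace member's Cargo.toml and handle inline tables, features, and `workspace = true` entries:

```shell
# Stand-in manifests with a deliberate mismatch on tokio.
cat > /tmp/workspace_deps <<'EOF'
tokio = "1.38"
serde = "1.0"
EOF
cat > /tmp/crate_deps <<'EOF'
tokio = "1.35"
serde = "1.0"
EOF

mismatches=0
while read -r line; do
  crate=${line%% =*}                                  # e.g. "tokio"
  ws_ver=$(echo "$line" | grep -oE '[0-9][0-9.]*')    # workspace-pinned version
  cr_ver=$(grep "^${crate} " /tmp/crate_deps | grep -oE '[0-9][0-9.]*')
  if [ -n "$cr_ver" ] && [ "$ws_ver" != "$cr_ver" ]; then
    echo "mismatch: $crate workspace=$ws_ver crate=$cr_ver"
    mismatches=$((mismatches + 1))
  fi
done < /tmp/workspace_deps
```

cargo-deny and hakari cover adjacent ground (duplicate versions, workspace-hack drift); this sketch only illustrates the shape of the remaining gap.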
Create missing Python test configuration documentation and validation workflow
The repo has extensive Python testing infrastructure (.github/actions/run-python-test-set, .config/nextest.toml for Python, multiple Python workflows) but no .github/workflows/_check-codestyle-python.yml validation equivalent to the Rust version. The current .config/nextest.toml and Python test actions lack documented standards. Adding a linting/typing workflow with configuration would catch issues early.
- [ ] Create .github/workflows/_check-python-types.yml using mypy/pyright on all Python code
- [ ] Add pyproject.toml at repo root with shared Python tool config (black, isort, ruff) if missing
- [ ] Document Python test patterns in contributing guide or test directory READMEs
- [ ] Integrate mypy type checking into pre-merge-checks.yml workflow
- [ ] Add Python dependency linting similar to cargo-deny (using pip-audit or similar)
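The shared-config item could start from a minimal fragment like the one below — the tool names and values here are assumptions to be reconciled with whatever ruff/mypy settings the existing Python workflows already imply:

```toml
# Hypothetical root pyproject.toml fragment (shared tool config only).
[tool.ruff]
line-length = 100
target-version = "py311"

[tool.mypy]
python_version = "3.11"
strict = true
```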
🌿Good first issues
- Add comprehensive integration tests for the proxy's connection-pooling logic under proxy/tests/ — currently only unit tests exist, but real tenant routing with multiple concurrent connections is under-tested and a good place to catch edge cases.
- Document the WAL compaction strategy in pageserver/compaction/ with examples and diagrams in docs/SUMMARY.md — the code exists, but onboarding new contributors is hindered by the lack of a written explanation of layer compaction policies and their impact on query latency.
- Implement error-metrics tracking in libs/metrics/ for the storage_controller's tenant-placement failures — currently only success metrics are emitted, making it hard to debug placement issues in production-like scenarios.
⭐Top contributors
- @hlinnaka — 10 commits
- @conradludgate — 10 commits
- @tristan957 — 10 commits
- @erikgrinaker — 8 commits
- @thesuhas — 6 commits
📝Recent commits
- 6a35a3e — HCC, resolved GCS upload permit deadlock, SK generation delete bug-fix. (#12873) (johngcrowley)
- 39e4f23 — GCS Provider Bytes Range Headers (#12855) (johngcrowley)
- 489c7a2 — Update README logo and links from neon.tech to neon.com (#12850) (ruf-io)
- 015b1c7 — Update README (#12827) (kelvich)
- 5e85c02 — neon_local: fix mismatched comment about local SSL certificate generation (#12814) (dotdister)
- c17d3fe — Fix typos (#12819) (hyongtao-code)
- 4ac447c — fix(control_plane): Fix incorrect file path of identity.toml in error message (#12826) (dotdister)
- 26b47b5 — feat: Add configurable Direct IO alignment support (#12821) (levin-kitty)
- 85ce109 — Initial implementation of GCS provider. (#11666) (johngcrowley)
- 77e22e4 — remove obsolete comment - this is a dummy commit (#12816) (Bodobolero)
🔒Security observations
- High · Incomplete Dependency List — Potential Unvetted Dependencies — Cargo.toml, [workspace.dependencies] section. The workspace.dependencies section appears to be truncated at 'aws-' in the provided content, preventing complete analysis of all project dependencies. Given the project's nature (a distributed database system), missing visibility into the complete dependency tree could hide vulnerable or unvetted crates, particularly in cryptography, networking, and data-handling libraries. Fix: provide the complete dependency manifest; run regular cargo-deny audits in the CI/CD pipeline; review all dependencies for security advisories with cargo audit and maintain a Software Bill of Materials (SBOM); use cargo tree to analyze transitive dependencies.
- High · Missing Security Configuration Files — repository root. No evidence of security configuration files such as SECURITY.md, security policy documentation, or vulnerability-disclosure guidelines in the repository root. Critical for coordinated vulnerability reporting in a distributed infrastructure project handling sensitive data (Postgres databases). Fix: create SECURITY.md with a vulnerability-disclosure process, responsible-disclosure timeline, and contact information; implement a security-advisory process; add security scanning to CI/CD (cargo-deny, cargo-audit).
- High · Unverified Docker Base Image Digest References — Dockerfile. The Dockerfile contains commented-out instructions for retrieving image digests with authentication tokens but doesn't show actual pinned digests for the base images (debian:bookworm-slim). While the Dockerfile shows awareness of digest pinning (ARG IMAGE=build-tools, ARG TAG=pinned), the implementation is incomplete, leaving room for base-image manipulation. Fix: pin all base images by SHA256 digest instead of tags (e.g. FROM debian:bookworm-slim@sha256:...); document the digest-verification process; implement image scanning in CI/CD using tools like Trivy or Grype.
- Medium · AWS SDK Dependencies Without Explicit Security Hardening — Cargo.toml, AWS dependencies (aws-config, aws-sdk-*). Multiple AWS SDK dependencies are declared (aws-sdk-s3, aws-sdk-iam, aws-sdk-kms) with versions that should be carefully monitored. KMS and IAM SDKs handling credentials require strict access control, and no credential-rotation policy or secret-management strategy is visible in the provided configuration. Fix: enforce credential-rotation policies (max 90 days); use IAM roles instead of long-lived credentials; implement audit logging for all AWS SDK calls; use aws-vault or similar tools for local development; enable MFA for production deployments; audit IAM policies regularly.
- Medium · Complex Multi-Workspace Project Without Visible Access Control — Cargo.toml workspace members, CODEOWNERS. The project contains 40+ workspace members, including sensitive components (storage_controller, safekeeper, pageserver with client access). A CODEOWNERS file exists, but visibility into granular access control per component is unclear, risking unauthorized or accidental modifications to critical security components. Fix: document CODEOWNERS policies per security tier; implement branch-protection rules requiring multiple approvals for sensitive components; add audit logging for all changes to crypto, auth, and storage components; consider code-signing requirements.
- Medium · Potential Unsafe Dependencies in Core Libraries — libs/postgres_ffi, libs/postgres_ffi_types. Libraries like postgres_ffi and postgres_ffi_types interact with PostgreSQL internals, and FFI (Foreign Function Interface) code is inherently high-risk for memory-safety issues; without a visible unsafe-code review process or fuzz testing, buffer overflows or memory corruption are possible. Fix: conduct formal security audits of all FFI code; fuzz the FFI interfaces (libFuzzer, AFL++); run AddressSanitizer and MemorySanitizer in CI; document all unsafe blocks with security justifications; consider Miri testing for additional safety checks.
- Medium · Cryptography Libraries Require Version Pinning Verification — Cargo.toml dependencies. No explicit cryptography dependencies are visible in the truncated dependency list. Given that this is a database system handling encryption (noted by aws-sdk-kms), the absence of explicit crypto dependency declarations suggests either transitive dependencies (higher risk) or an incomplete configuration display. Fix: explicitly declare all cryptography crates (ring, rustls, etc
LLM-derived; treat as a starting point, not a security audit.
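Several of the fixes above converge on cargo-deny. A minimal starting policy might look like the fragment below — key names should be checked against the cargo-deny version pinned in CI, since its config schema has changed across releases:

```toml
# Hypothetical deny.toml sketch — verify keys against your cargo-deny version.
[advisories]
yanked = "deny"

[licenses]
allow = ["Apache-2.0", "MIT", "BSD-3-Clause"]

[bans]
multiple-versions = "warn"

[sources]
unknown-registry = "deny"
unknown-git = "deny"
```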
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.