RepoPilot

influxdata/influxdb

Scalable datastore for metrics, events, and real-time analytics

Healthy

Healthy across the board

Use as dependency: Healthy

Permissive license, no critical CVEs, actively maintained — safe to depend on.

Fork & modify: Healthy

Has a license, tests, and CI — clean foundation to fork and modify.

Learn from: Healthy

Documented and popular — useful reference codebase to read through.

Deploy as-is: Healthy

No critical CVEs, sane security posture — runnable as-is.

  • Last commit 1d ago
  • 15 active contributors
  • Distributed ownership (top contributor 26% of recent commits)
  • Apache-2.0 licensed
  • CI configured
  • Tests present

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Healthy" badge

Paste into your README — the badge updates live from the latest cached analysis.

[![RepoPilot: Healthy](https://repopilot.app/api/badge/influxdata/influxdb)](https://repopilot.app/r/influxdata/influxdb)

Paste at the top of your README.md — renders inline like a shields.io badge.

Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/influxdata/influxdb on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: influxdata/influxdb

Generated by RepoPilot · 2026-05-09 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

  1. Verify the contract. Run the bash script in the Verify before trusting section below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
  2. Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
  3. Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/influxdata/influxdb shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim just two sections before pointing your agent at this repo, make them the Verify block and the suggested reading order.

🎯Verdict

GO — Healthy across the board

  • Last commit 1d ago
  • 15 active contributors
  • Distributed ownership (top contributor 26% of recent commits)
  • Apache-2.0 licensed
  • CI configured
  • Tests present

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live influxdata/influxdb repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/influxdata/influxdb.

What it runs against: a local clone of influxdata/influxdb — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters |
|---|---------------|----------------|
| 1 | You're in influxdata/influxdb | Confirms the artifact applies here, not a fork |
| 2 | License is still Apache-2.0 | Catches a relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 31 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>influxdata/influxdb</code></summary>
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of influxdata/influxdb. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/influxdata/influxdb.git
#   cd influxdb
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of influxdata/influxdb and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "influxdata/influxdb(\.git)?\b" \
  && ok "origin remote is influxdata/influxdb" \
  || miss "origin remote is not influxdata/influxdb (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "Apache License|Apache-2\.0" LICENSE 2>/dev/null \
   || grep -qiE "\"license\"\s*:\s*\"Apache-2\.0\"" package.json 2>/dev/null) \
  && ok "license is Apache-2.0" \
  || miss "license drift — was Apache-2.0 at generation time"

# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
  && ok "default branch main exists" \
  || miss "default branch main no longer exists"

# 4. Critical files exist
test -f "Cargo.toml" \\
  && ok "Cargo.toml" \\
  || miss "missing critical file: Cargo.toml"
test -f "influxdb3/Cargo.toml" \\
  && ok "influxdb3/Cargo.toml" \\
  || miss "missing critical file: influxdb3/Cargo.toml"
test -f "core/arrow_util/src/lib.rs" \\
  && ok "core/arrow_util/src/lib.rs" \\
  || miss "missing critical file: core/arrow_util/src/lib.rs"
test -f "influxdb3_processing_engine/src/lib.rs" \\
  && ok "influxdb3_processing_engine/src/lib.rs" \\
  || miss "missing critical file: influxdb3_processing_engine/src/lib.rs"
test -f "influxdb3_write/Cargo.toml" \\
  && ok "influxdb3_write/Cargo.toml" \\
  || miss "missing critical file: influxdb3_write/Cargo.toml"

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 31 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~1d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/influxdata/influxdb"
  exit 1
fi

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>

TL;DR

InfluxDB 3 Core is a Rust-based time series database optimized for fast, real-time metric ingestion and sub-100ms query response times. It uses Apache Arrow/Parquet for columnar storage on object storage (S3, GCP, Azure) or local disk, embeds a Python VM for plugins, and supports SQL, InfluxQL, and FlightSQL query interfaces—making it a diskless, schema-flexible alternative to traditional TSDBs for monitoring and analytics workloads. It is a monorepo with 23+ workspace members declared in the root Cargo.toml: core engine logic lives in core/ subdirectories (arrow_util, datafusion_util, parquet_file, influxdb_line_protocol, iox_query); top-level influxdb3_* crates wrap the engine (influxdb3_server, influxdb3_query_executor, influxdb3_write, influxdb3_py_api); and CLI/packaging lives in .circleci/packages/influxdb3/ with a systemd service unit and launcher script.

👥Who it's for

DevOps engineers and platform teams building monitoring systems (server, application, network); financial traders needing real-time market analytics; and SREs who need fast metadata queries and dashboard support with sub-30ms response times. All of these groups get compatibility with the InfluxDB 1.x/2.x write APIs on a modern cloud-native architecture.

🌱Maturity & risk

Production-ready and actively maintained: InfluxDB 3 Core reached general availability in April 2025. The monorepo structure (23 workspace members), comprehensive CI/CD in .circleci/config.yml, CircleCI packaging controls, and a language breakdown dominated by Rust all point to mature infrastructure. However, as a v3 rewrite in Rust, it still has a smaller installed base than v2.x.

Moderate risk: the Rust rewrite introduces a new dependency surface (Arrow, DataFusion, object-store crates) that differs from v2.x. The v3 branch coexists with v2.x (main-2.x) and v1.x (master-1.x), creating potential API compatibility gaps. Adoption metrics such as stargazer counts are not captured in this artifact, so gauge community uptake yourself: verify CircleCI build health and the GitHub Issues backlog before critical deployments.

Active areas of work

Active Rust/DataFusion work: the repo contains code for Python plugins (influxdb3_py_api), query rewriting (iox_query_influxql_rewrite), catalog caching, authorization (influxdb3_authz, core/authz), and object-store metrics. The CircleCI workflow covers Debian package building and sandbox verification (.circleci/packages/test_influxdb3-launcher.py), indicating a focus on production deployment and multi-platform support.

🚀Get running

git clone https://github.com/influxdata/influxdb.git
cd influxdb
# Rust project: ensure Rust toolchain is installed (https://rustup.rs/)
cargo build --release
# Or run tests:
cargo test --workspace

Note: .cargo/config.toml may contain workspace-specific settings; inspect it before modifying build flags.

Daily commands:

# Start the dev server on port 8181 (from README)
cargo run --bin influxdb3 -- --dir /tmp/influxdb3-data
# Or with config file:
cargo run --bin influxdb3 -- --config-path /path/to/config.toml

For systemd in production, use .circleci/packages/influxdb3/fs/lib/systemd/system/influxdb3-core.service.

🗺️Map of the codebase

  • Cargo.toml — Workspace root configuration defining all 23 member crates and their dependencies; essential for understanding project structure and build system
  • influxdb3/Cargo.toml — Main binary crate entry point; defines the InfluxDB 3 server application and its core dependencies
  • core/arrow_util/src/lib.rs — Arrow/Parquet utility abstractions used across query execution and data serialization; foundational for the data processing pipeline
  • influxdb3_processing_engine/src/lib.rs — Query execution engine integrating DataFusion and Arrow; critical path for all query operations
  • influxdb3_write/Cargo.toml — Write path handler for time-series ingestion; defines how metrics and events flow into the system
  • influxdb3_wal/Cargo.toml — Write-ahead log implementation; ensures durability and recovery capabilities for data ingestion
  • influxdb3_server/Cargo.toml — HTTP API server layer; entry point for client requests and gRPC/HTTP protocol handling

🛠️How to make changes

Add a New Query Function or Operator

  1. Define the function signature and logic in the processing engine's expression module (influxdb3_processing_engine/src/lib.rs)
  2. Register the function with DataFusion's function registry during engine initialization (influxdb3_processing_engine/src/lib.rs)
  3. Add tests in the processing engine crate to validate correctness with various input types (influxdb3_processing_engine/Cargo.toml)
  4. Update query executor to use the new function in optimization passes if needed (influxdb3_query_executor/Cargo.toml)
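
As a rough illustration of steps 1–2, registering a scalar UDF with DataFusion generally follows the pattern below. This is a minimal sketch against the upstream DataFusion API, not InfluxDB's actual registration path; the function name double is an assumption, and exact signatures (e.g., whether create_udf takes DataType or Arc<DataType> for the return type) vary across DataFusion versions.

use std::sync::Arc;

use datafusion::arrow::array::{ArrayRef, Float64Array};
use datafusion::arrow::datatypes::DataType;
use datafusion::error::Result;
use datafusion::logical_expr::{create_udf, ColumnarValue, Volatility};
use datafusion::prelude::SessionContext;

// Hypothetical scalar function double(x); InfluxDB 3 wires its registry up
// inside influxdb3_processing_engine, so treat this as the shape, not the place.
fn register_double_udf(ctx: &SessionContext) {
    let fun = Arc::new(|args: &[ColumnarValue]| -> Result<ColumnarValue> {
        // Normalize scalar/array arguments to plain arrays before processing.
        let arrays = ColumnarValue::values_to_arrays(args)?;
        let input = arrays[0]
            .as_any()
            .downcast_ref::<Float64Array>()
            .expect("type enforced by the signature below");
        let out: Float64Array = input.iter().map(|v| v.map(|x| x * 2.0)).collect();
        Ok(ColumnarValue::Array(Arc::new(out) as ArrayRef))
    });
    let udf = create_udf(
        "double",                // SQL-visible name
        vec![DataType::Float64], // argument types
        DataType::Float64,       // return type
        Volatility::Immutable,   // same input always yields the same output
        fun,
    );
    ctx.register_udf(udf); // SELECT double(usage) FROM cpu now resolves
}

In the real codebase, step 2's registration point is wherever influxdb3_processing_engine constructs its session context; grep for that construction to find it.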

Add a New Write Path Handler or Ingestion Format

  1. Define parser/decoder for the new format in the write crate (influxdb3_write/Cargo.toml)
  2. Implement conversion to internal row format and schema validation (influxdb3_write/Cargo.toml)
  3. Register handler in the HTTP API server routing (influxdb3_server/Cargo.toml)
  4. Add integration tests exercising the full write pipeline through WAL to storage (influxdb3_wal/Cargo.toml)
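
The shapes involved in steps 1–2 are sketched below with hypothetical types. WriteFormatParser, Row, and FieldValue are illustrative names, not actual influxdb3_write types; find the real equivalents by reading how the existing line-protocol path is wired in.

use std::collections::HashMap;

// Hypothetical internal row form (step 2's target); illustrative only.
pub struct Row {
    pub table: String,
    pub tags: HashMap<String, String>,
    pub fields: HashMap<String, FieldValue>,
    pub timestamp_ns: i64,
}

pub enum FieldValue {
    Float(f64),
    Integer(i64),
    Boolean(bool),
    Text(String),
}

// Hypothetical trait a new ingestion format implements (step 1); the server
// routing layer (step 3) would dispatch to it, e.g. on Content-Type.
pub trait WriteFormatParser {
    /// Parse a request body into rows, validating values against the schema.
    fn parse(&self, body: &[u8]) -> Result<Vec<Row>, String>;
}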

Add a New System Table or Catalog Metadata

  1. Define the table schema and metadata in the catalog crate (influxdb3_catalog/Cargo.toml)
  2. Implement population logic in system_tables crate to provide row data (influxdb3_system_tables/Cargo.toml)
  3. Register the table in the query executor's table provider (influxdb3_query_executor/Cargo.toml)
  4. Add queries to integration tests to verify the system table is queryable (influxdb3_test_helpers/Cargo.toml)
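
For steps 2–3, a DataFusion system table is ultimately just a TableProvider; the sketch below registers a static in-memory one. The table name and column are assumptions for illustration; InfluxDB 3's real system tables implement TableProvider against live catalog state rather than a fixed batch.

use std::sync::Arc;

use datafusion::arrow::array::StringArray;
use datafusion::arrow::datatypes::{DataType, Field, Schema};
use datafusion::arrow::record_batch::RecordBatch;
use datafusion::datasource::MemTable;
use datafusion::error::Result;
use datafusion::prelude::SessionContext;

// Register a hypothetical example_tables system table backed by one batch.
fn register_example_system_table(ctx: &SessionContext) -> Result<()> {
    let schema = Arc::new(Schema::new(vec![Field::new(
        "table_name",
        DataType::Utf8,
        false,
    )]));
    let batch = RecordBatch::try_new(
        Arc::clone(&schema),
        vec![Arc::new(StringArray::from(vec!["cpu", "mem"]))],
    )?;
    let table = MemTable::try_new(schema, vec![vec![batch]])?;
    ctx.register_table("example_tables", Arc::new(table))?;
    Ok(())
}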

Add a New Telemetry Metric or Trace

  1. Define the metric type (counter, gauge, histogram) in the telemetry module (influxdb3_telemetry/Cargo.toml)
  2. Instrument the relevant component (e.g., write path, query executor) with metric collection (influxdb3_processing_engine/Cargo.toml)
  3. Configure metric export in server startup configuration (influxdb3_startup/Cargo.toml)
  4. Verify metrics appear in observability stack (Prometheus, OTLP, etc.) (.circleci/config.yml)
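
Steps 1–2 in miniature, with a self-contained toy counter so the sketch compiles on its own. U64Counter and WriteMetrics are illustrative stand-ins; check influxdb3_telemetry for the real registration and export API before instrumenting anything.

use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

// Toy counter standing in for whatever metric type the telemetry crate defines.
#[derive(Default)]
pub struct U64Counter(AtomicU64);

impl U64Counter {
    pub fn inc(&self, by: u64) {
        self.0.fetch_add(by, Ordering::Relaxed);
    }
    pub fn get(&self) -> u64 {
        self.0.load(Ordering::Relaxed)
    }
}

// Step 1: define metrics next to the component they instrument.
pub struct WriteMetrics {
    pub lines_written: Arc<U64Counter>,
}

// Step 2: the write path bumps the counter for every accepted batch; the
// exporter configured in step 3 periodically reads it.
pub fn record_write(metrics: &WriteMetrics, line_count: u64) {
    metrics.lines_written.inc(line_count);
}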

🔧Why these technologies

  • Rust — Memory safety, zero-cost abstractions, and high performance required for sub-millisecond query latency and efficient resource usage in time-series workloads
  • Apache Arrow — Columnar in-memory format enables fast analytical queries and efficient serialization/deserialization for network transport and Parquet storage
  • Apache DataFusion — Production-grade query planning and execution engine with built-in SQL support, pushdown optimization, and extensible function registry
  • Apache Parquet — Columnar storage format provides compression, efficient scan performance, and compatibility with ecosystem tools (Spark, DuckDB, etc.)
  • Write-Ahead Log (WAL) — Guarantees durability and enables recovery from failures without data loss during ingestion spikes
  • Object Store Abstraction — Cloud-agnostic persistence layer supporting S3, GCS, Azure Blob, and local filesystem for flexibility in deployment environments

⚖️Trade-offs already made

  • In-memory buffering + batched writes vs. row-at-a-time persistence

    • Why: Achieves higher throughput and better compression by amortizing I/O costs across batches
    • Consequence: Requires WAL for durability during in-memory phases; slightly higher latency on single writes but better overall system throughput
  • Immutable Parquet files + append-only WAL vs. mutable B-trees or LSM trees

    • Why: Simplifies consistency model, enables parallelism, and leverages mature Parquet ecosystem
    • Consequence: Updates and deletes require rewriting immutable Parquet files through compaction rather than editing them in place, trading write amplification for a simpler consistency model

🪤Traps & gotchas

  1. Embedded Python VM: influxdb3_py_api requires a Python runtime; the build may fail if system Python headers are missing.
  2. Object storage paths: the default is local disk (/var/lib/influxdb3/), but S3/GCP require explicit config in influxdb3-core.conf—check .circleci/packages/influxdb3/fs/usr/share/influxdb3/ for an example config.
  3. Workspace membership: member crates are declared in the root Cargo.toml; adding or moving a crate without updating the workspace members list can break CI.
  4. Line protocol strict mode: the core/influxdb_line_protocol parser is strict about precision and tag ordering—test edge cases if you modify the write path (see the sketch below).
  5. FlightSQL vs HTTP API: queries via gRPC FlightSQL behave differently from the HTTP REST API (both on port 8181 by default)—specify which API you're testing.
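
The strictness in item 4 is easy to demonstrate. Below is a minimal sketch using the published influxdb_line_protocol crate; the parse_lines entry point is real, but treat the exact item types as an assumption and confirm against core/influxdb_line_protocol before relying on them.

use influxdb_line_protocol::parse_lines;

fn main() {
    // Two points: one valid, one with a malformed field value ("oops" is
    // neither a number, a bool, nor a quoted string).
    let input = "cpu,host=a usage=0.5 1700000000000000000\n\
                 cpu,host=b usage=oops 1700000000000000001";
    for line in parse_lines(input) {
        match line {
            Ok(_point) => println!("parsed ok"),
            // The parser rejects the whole line rather than guessing intent.
            Err(e) => println!("rejected: {e}"),
        }
    }
}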

🏗️Architecture

💡Concepts to learn

  • Apache Parquet columnar format — This repo stores all time series data as Parquet files on object storage; understanding column pruning, compression, and row group metadata is essential for optimization work
  • DataFusion query optimization — The core/datafusion_util/ layer depends on DataFusion's optimizer rules (push-down predicates, join reordering); you'll need to understand rule-based optimization to debug slow queries or add new query types
  • Line protocol parsing and validation — Write path depends on strict, efficient parsing in core/influxdb_line_protocol; malformed line protocol handling is a bottleneck for ingestion and error reporting
  • InfluxQL AST rewriting — InfluxQL queries are transpiled to SQL via core/iox_query_influxql_rewrite/; understanding the rewrite rules is critical for compatibility testing with InfluxDB 1.x queries
  • Object storage abstraction (S3/GCP/Azure) — Multi-cloud deployments require the core/object_store_* crates to abstract over different backends; modifying persistence or backup logic requires understanding this abstraction layer
  • gRPC FlightSQL protocol — Efficient binary query protocol implemented in core/flightsql/; needed to understand performant client-server communication and schema interchange
  • Catalog-driven schema management — Dynamic table/column discovery via influxdb3_catalog/ enables schemaless writes and lazy schema inference; critical to understand for multi-tenant or high-cardinality workloads

Related projects:

  • influxdata/influxdb-client-go — Official Go client library for writing and querying InfluxDB 3 Core via HTTP and FlightSQL APIs
  • influxdata/influxdb-client-python — Official Python client for InfluxDB 3 Core; relevant for users embedding Python UDFs via influxdb3_py_api
  • apache/arrow-datafusion — The DataFusion query engine used as the core execution layer for SQL in this repo; upstream for most optimizer/planner changes
  • timescale/timescaledb — PostgreSQL-native time series DB; an alternative in the same market segment for users needing traditional ACID guarantees

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add integration tests for influxdb3_wal (Write-Ahead Log) module

The WAL module is critical for data durability in InfluxDB3, but there's no visible test directory structure in the workspace members list. Integration tests should verify WAL recovery scenarios, concurrent writes, and corruption handling. This is high-value because WAL failures can cause data loss.

  • [ ] Create influxdb3_wal/tests/ directory structure
  • [ ] Add tests for WAL recovery after simulated crashes
  • [ ] Add tests for concurrent write operations and ordering guarantees
  • [ ] Add tests for corrupted WAL file detection and recovery
  • [ ] Update influxdb3_wal/Cargo.toml to include test dependencies
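
A runnable shape for the crash-recovery case is below, written against a toy file-backed WAL so it compiles standalone. ToyWal is a deliberate placeholder; the point is the test structure (write, drop without clean shutdown, reopen, replay, assert order), which should transfer to whatever influxdb3_wal actually exposes.

use std::fs::{File, OpenOptions};
use std::io::{BufRead, BufReader, Write};
use std::path::Path;

// Toy stand-in so the scenario is self-contained; not the influxdb3_wal API.
struct ToyWal(File);

impl ToyWal {
    fn open(path: &Path) -> std::io::Result<Self> {
        Ok(Self(OpenOptions::new().create(true).append(true).open(path)?))
    }
    fn append(&mut self, entry: &str) -> std::io::Result<()> {
        writeln!(self.0, "{entry}")?;
        self.0.sync_data() // durability point: fsync before acknowledging
    }
    fn replay(path: &Path) -> std::io::Result<Vec<String>> {
        BufReader::new(File::open(path)?).lines().collect()
    }
}

#[test]
fn recovers_committed_entries_after_crash() {
    let dir = std::env::temp_dir().join("toy-wal-test");
    std::fs::create_dir_all(&dir).unwrap();
    let path = dir.join("wal.log");
    let _ = std::fs::remove_file(&path);

    // Write entries, then drop the handle with no clean shutdown ("crash").
    {
        let mut wal = ToyWal::open(&path).unwrap();
        wal.append("entry-1").unwrap();
        wal.append("entry-2").unwrap();
    }

    // Reopen and replay: committed entries must come back, in order.
    assert_eq!(ToyWal::replay(&path).unwrap(), vec!["entry-1", "entry-2"]);
}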

Add GitHub Actions workflow for Python API (influxdb3_py_api) testing

The repo has a Python API module (influxdb3_py_api) and CircleCI packaging for Python, but there's no visible GitHub Actions workflow for Python-specific CI. A dedicated workflow would catch Python binding issues, type checking, and compatibility across Python versions before merge.

  • [ ] Create .github/workflows/python-api-tests.yml
  • [ ] Add steps for Python 3.8, 3.9, 3.10, 3.11+ testing matrix
  • [ ] Add mypy type checking for influxdb3_py_api/src
  • [ ] Add pytest execution for influxdb3_py_api tests
  • [ ] Add step to validate .circleci/packages/test_influxdb3-launcher.py compatibility

Add comprehensive documentation for influxdb3_processing_engine module with architecture diagrams

The processing engine is a core component mentioned in README_processing_engine.md, but there's no detailed API documentation in the visible file structure. New contributors struggle to understand the query execution pipeline, optimizer, and plan serialization. Adding rustdoc examples and a design document would reduce onboarding time.

  • [ ] Add rustdoc comments with examples to influxdb3_processing_engine/src/lib.rs main exports
  • [ ] Create influxdb3_processing_engine/ARCHITECTURE.md explaining the query plan execution flow
  • [ ] Document the interface between core/iox_query and influxdb3_processing_engine
  • [ ] Add examples in doc comments for the PhysicalPlan builder API
  • [ ] Link from README_processing_engine.md to the new architecture documentation
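
For the first checklist item, the standard rustdoc-with-doctest pattern looks like the sketch below. PhysicalPlanBuilder is taken from the checklist wording and is assumed, not verified against the crate's real exports.

/// Builds an executable plan from a SQL query string.
///
/// # Example
///
/// ```
/// // Hypothetical API surface; adjust to the crate's actual exports.
/// use influxdb3_processing_engine::PhysicalPlanBuilder;
///
/// let plan = PhysicalPlanBuilder::new()
///     .sql("SELECT count(*) FROM cpu")
///     .build()?;
/// # Ok::<(), Box<dyn std::error::Error>>(())
/// ```
pub struct PhysicalPlanBuilder { /* ... */ }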

🌿Good first issues

  • Add integration tests for object-store backend fallback (local disk → S3): core/object_store_utils/ has abstraction but no test suite validating failover behavior across backends—write tests under core/object_store_utils/tests/ exercising mock + real S3 (via localstack).
  • Document Python UDF authoring and plugin lifecycle in influxdb3_py_api/: crate exists but has no user-facing examples or security policy docs. Add influxdb3_py_api/docs/udf_guide.md with sandbox constraints, API reference, and deployment checklist.
  • Implement missing InfluxQL functions in core/iox_query_influxql_rewrite/: compare against InfluxDB 1.x function docs (MOVING_AVERAGE, PERCENTILE_APPROX) and add translation rules in the rewriter—grep for existing function mappings in core/iox_query_influxql_rewrite/src/ to understand the pattern.

Top contributors

Click to expand

📝Recent commits

Click to expand
  • db092e1 — chore: update install script to 3.9.2 (#27398) (hiltontj)
  • 5ac9295 — chore(sync): influxdb_pro 2026-05-01 (#27399) (hiltontj)
  • b66fe85 — fix(install): make Enterprise startup options 2 and 3 reachable (#27363) (hiltontj)
  • 73be12b — chore(sync): influxdb_pro 2026-04-20 (#27365) (hiltontj)
  • c3b833f — docs(README): restructure for clarity, SEO, and accuracy (#27342) (jstirnaman)
  • ab45086 — chore: bump install script versions to 3.9.1 (#27349) (hiltontj)
  • bab3d0b — fix: ignore wasmtime 41.0.4 security advisories in cargo-deny (#27350) (hiltontj)
  • 79a63cc — chore(sync): influxdb_pro 2026-04-01 (#27327) (mgattozzi)
  • 9ce7a44 — chore(sync): influxdb_pro 2026-03-27 (#27310) (appletreeisyellow)
  • 2a8d742 — fix: restore main version to 3.10.0-nightly (#27300) (lilic)

🔒Security observations

The InfluxDB 3 codebase demonstrates generally good security practices with proper vulnerability reporting channels and use of modern languages (Rust) known for memory safety. However, there are areas for improvement: 1) Dependency management lacks visible automation for vulnerability scanning, 2) Docker builds use unpinned package versions, 3) No evidence of SBOM generation for supply chain security, and 4) Missing runtime security hardening in container configuration. The codebase is well-organized with proper licensing and documentation. The main security concerns are operational (CI/CD and dependency management) rather than code-level vulnerabilities. Implementing automated scanning tools and formal dependency policies would significantly improve the security posture.

  • Medium · Docker Build Uses Slim Base Image Without Security Scanning — Dockerfile (line: FROM rust:${RUST_VERSION}-slim-bookworm). The Dockerfile uses 'rust:${RUST_VERSION}-slim-bookworm' as the base image. While slim images reduce attack surface, there is no evidence of container image scanning or vulnerability checking in the build pipeline. The build process installs multiple packages (binutils, build-essential, libssl-dev, etc.) without pinning specific versions, which could introduce vulnerable dependencies. Fix: 1) Pin specific versions for all apt packages installed. 2) Implement container image scanning (e.g., Trivy, Snyk) in CI/CD pipeline. 3) Use a multi-stage build to minimize final image size and remove build tools. 4) Regularly update base images and dependencies.
  • Medium · No Evidence of Dependency Version Pinning — Cargo.toml (workspace configuration). The Cargo.toml workspace configuration lists many dependencies without a visible version pinning strategy. The checked-in Cargo.lock provides lock-file protection, but there is no direct evidence here of automated dependency scanning for known vulnerabilities (via cargo-audit or similar). Note that recent commits reference cargo-deny configuration, so some scanning may already be in place; confirm before filing. Fix: 1) Implement automated dependency scanning in CI/CD (cargo-audit, cargo-deny). 2) Establish a policy for regular dependency updates. 3) Monitor security advisories via Dependabot or similar tools. 4) Document dependency update procedures.
  • Low · Build Arguments Allow Arbitrary Feature Flags — Dockerfile (ARG FEATURES=aws,gcp,azure,jemalloc_replacing_malloc). The Dockerfile accepts FEATURES as a build argument with default value 'aws,gcp,azure,jemalloc_replacing_malloc'. While this provides flexibility, it allows runtime control of compiled features which could be exploited if build arguments are tampered with in CI/CD. Fix: 1) Document all supported feature combinations. 2) Validate feature flags in build scripts. 3) Use signed build artifacts. 4) Implement immutable build configurations for production releases.
  • Low · Missing Security.md Implementation Details — SECURITY.md. While a SECURITY.md file exists with vulnerability reporting guidance, there is no information about security response timelines, supported versions for patches, or deprecated component handling. Fix: 1) Add security response timeline (e.g., 90 days for patches). 2) Document supported versions for security updates. 3) Define process for end-of-life announcements. 4) Specify severity assessment methodology.
  • Low · No Visible SBOM (Software Bill of Materials) Generation — .circleci/config.yml (or related CI configuration). There is no evidence of SBOM generation in the build pipeline or distribution, which is increasingly important for supply chain security and regulatory compliance. Fix: 1) Generate SBOM using tools like cargo-sbom or syft. 2) Include SBOM with releases. 3) Publish to package registries if supported. 4) Version and maintain SBOM alongside releases.
  • Low · Incomplete Docker Security Configuration — Dockerfile (missing final USER directive). The Dockerfile does not specify a non-root USER for runtime execution in the provided snippet, which could allow container escape vulnerabilities to gain root access. Fix: 1) Create a dedicated non-root user for application execution. 2) Set appropriate file permissions and ownership. 3) Use USER directive before ENTRYPOINT/CMD. 4) Document any special permissions required.

LLM-derived; treat as a starting point, not a security audit.


Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.
