RepoPilotOpen in app →

NLPchina/elasticsearch-sql

Use SQL to query Elasticsearch

Healthy

Healthy across all four use cases

weakest axis
Use as dependencyHealthy

Permissive license, no critical CVEs, actively maintained — safe to depend on.

Fork & modifyHealthy

Has a license, tests, and CI — clean foundation to fork and modify.

Learn fromHealthy

Documented and popular — useful reference codebase to read through.

Deploy as-isHealthy

No critical CVEs, sane security posture — runnable as-is.

  • Last commit 4d ago
  • 4 active contributors
  • Apache-2.0 licensed
Show all 7 evidence items →
  • CI configured
  • Tests present
  • Small team — 4 contributors active in recent commits
  • Single-maintainer risk — top contributor 97% of recent commits

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Healthy" badge

Paste into your README — live-updates from the latest cached analysis.

Variant:
RepoPilot: Healthy
[![RepoPilot: Healthy](https://repopilot.app/api/badge/nlpchina/elasticsearch-sql)](https://repopilot.app/r/nlpchina/elasticsearch-sql)

Paste at the top of your README.md — renders inline like a shields.io badge.

Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/nlpchina/elasticsearch-sql on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: NLPchina/elasticsearch-sql

Generated by RepoPilot · 2026-05-09 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

  1. Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
  2. Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
  3. Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/NLPchina/elasticsearch-sql shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

🎯Verdict

GO — Healthy across all four use cases

  • Last commit 4d ago
  • 4 active contributors
  • Apache-2.0 licensed
  • CI configured
  • Tests present
  • ⚠ Small team — 4 contributors active in recent commits
  • ⚠ Single-maintainer risk — top contributor 97% of recent commits

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live NLPchina/elasticsearch-sql repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/NLPchina/elasticsearch-sql.

What it runs against: a local clone of NLPchina/elasticsearch-sql — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters | |---|---|---| | 1 | You're in NLPchina/elasticsearch-sql | Confirms the artifact applies here, not a fork | | 2 | License is still Apache-2.0 | Catches relicense before you depend on it | | 3 | Default branch master exists | Catches branch renames | | 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code | | 5 | Last commit ≤ 34 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>NLPchina/elasticsearch-sql</code></summary>
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of NLPchina/elasticsearch-sql. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/NLPchina/elasticsearch-sql.git
#   cd elasticsearch-sql
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of NLPchina/elasticsearch-sql and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "NLPchina/elasticsearch-sql(\\.git)?\\b" \\
  && ok "origin remote is NLPchina/elasticsearch-sql" \\
  || miss "origin remote is not NLPchina/elasticsearch-sql (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "^(Apache-2\\.0)" LICENSE 2>/dev/null \\
   || grep -qiE "\"license\"\\s*:\\s*\"Apache-2\\.0\"" package.json 2>/dev/null) \\
  && ok "license is Apache-2.0" \\
  || miss "license drift — was Apache-2.0 at generation time"

# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \\
  && ok "default branch master exists" \\
  || miss "default branch master no longer exists"

# 4. Critical files exist
test -f "pom.xml" \\
  && ok "pom.xml" \\
  || miss "missing critical file: pom.xml"
test -f "README.md" \\
  && ok "README.md" \\
  || miss "missing critical file: README.md"
test -f "src/_site/index.html" \\
  && ok "src/_site/index.html" \\
  || miss "missing critical file: src/_site/index.html"
test -f "src/_site/query.js" \\
  && ok "src/_site/query.js" \\
  || miss "missing critical file: src/_site/query.js"
test -f "BUILDING.md" \\
  && ok "BUILDING.md" \\
  || miss "missing critical file: BUILDING.md"

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 34 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~4d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/NLPchina/elasticsearch-sql"
  exit 1
fi

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>

TL;DR

elasticsearch-sql is a plugin/library that translates SQL queries into Elasticsearch Query DSL, enabling users to query Elasticsearch indices using familiar SQL syntax instead of JSON-based queries. It parses SQL statements and compiles them into native Elasticsearch aggregations and filters, supporting SELECT, WHERE, GROUP BY, JOIN, and other standard SQL operations against ES indices. Maven-based monolith structured as src/main/ for Java plugin code and src/_site/ for web UI. The web frontend (index.html, controllers.js, editor.js) is an AngularJS 1.x application with Bootstrap styling under src/_site/, while backend Java plugin code handles SQL parsing and ES query compilation. Test resources under src/_site/tests/ include spec files for ResultHandlers and TablePresenter.

👥Who it's for

Database administrators and data engineers migrating from traditional SQL databases to Elasticsearch, or teams building analytics dashboards who need SQL query capability without learning Elasticsearch Query DSL. Also useful for BI tool integrations that expect SQL interfaces.

🌱Maturity & risk

The project is deprecated and no longer actively maintained as of the README notice—Elastic recommends using official x-pack-sql or AWS OpenDistro SQL instead. However, it does have stable branch builds for ES 6.8.23, 7.17.28, 8.19.15, and 9.3.4, suggesting it was production-ready before deprecation. CI is set up via Travis CI and GitHub Actions, but commit recency is unknown from provided data.

High risk: the project is officially deprecated with an explicit notice in README.md recommending users switch to official alternatives. Dependency risk is moderate (druid 1.2.15, guava 32.0.0-jre for ES 9.3.4), but single-maintainer history (ansj as primary) combined with deprecation status means security patches unlikely. Breaking changes across ES versions (separate branches for 6.8.x, 7.x, 8.x, 9.x) indicate ongoing compatibility churn.

Active areas of work

No active development—project is in maintenance mode on deprecated notice. The latest version is 9.3.4.0 with support branches for ES 9.3.4, 8.19.15, 7.17.28, and 6.8.23, suggesting last work was version-bump maintenance for ES compatibility, not feature work.

🚀Get running

Clone the repo: git clone https://github.com/NLPchina/elasticsearch-sql.git. Build with Maven: mvn clean package. Deploy as an Elasticsearch plugin: bin/elasticsearch-plugin install file:///path/to/elasticsearch-sql-x.x.x.0.zip. Access the web UI via http://localhost:9200/_site/ after startup.

Daily commands: Build: mvn clean package. Install as Elasticsearch plugin: bin/elasticsearch-plugin install file:///path/to/built/zip. Start Elasticsearch: bin/elasticsearch. Query via HTTP POST to /_nlpcn/sql (or legacy /_sql in older versions) or access web UI at http://localhost:9200/_site/.

🗺️Map of the codebase

  • pom.xml — Maven build configuration defining all dependencies, plugins, and module structure for the elasticsearch-sql project across multiple ES versions
  • README.md — Project overview indicating deprecation status, version compatibility (6.8–9.3.4), and API path changes that all contributors must understand
  • src/_site/index.html — Main web UI entry point for the SQL query editor interface that end users interact with
  • src/_site/query.js — Core JavaScript module handling SQL query submission and result formatting to Elasticsearch
  • BUILDING.md — Build and development setup instructions essential for contributors to compile and test across different Elasticsearch versions
  • .github/workflows/actions.yml — CI/CD pipeline configuration defining automated testing and build validation for the master branch

🧩Components & responsibilities

  • SQL Query Editor (CodeMirror) (CodeMirror, JavaScript) — Provides syntax-aware text editing with hints and linting for SQL input
    • Failure mode: Malformed SQL passes through to backend; validation errors returned from Elasticsearch
  • Query Parser & DSL Builder (query.js) (JavaScript) — Converts SQL text to Elasticsearch JSON Query DSL; handles SELECT, WHERE, GROUP BY, aggregations
    • Failure mode: Unparseable SQL or unsupported operators result in POST failure; Elasticsearch returns 400 error
  • Result Formatter (table_presenter.js) (JavaScript, HTML) — Transforms Elasticsearch nested aggregation and hit responses into flat HTML tables for display
    • Failure mode: Unexpected response structure causes rendering errors; unformatted JSON displayed as fallback
  • AngularJS Controller (controllers.js) (AngularJS 1.x, HTTP service) — Orchestrates UI state, handles form submission, HTTP communication to Elasticsearch plugin
    • Failure mode: Network errors or ES plugin unavailability shows HTTP error message; UI becomes unresponsive
  • **** — undefined

🛠️How to make changes

Add a new SQL function or operator

  1. Implement parser rules in the core SQL grammar module (part of the backend Java codebase not fully visible in file list) (pom.xml (add test dependency if needed))
  2. Update the CodeMirror SQL hints to recognize the new keyword (src/_site/vendor/codemirror/addon/hint/sql-hint.js)
  3. Add test case in the test resources directory with sample input/output (src/_site/tests/resources/expectedBody4simpleQueryResult.json)

Modify the query editor UI

  1. Update HTML structure and styling for editor layout (src/_site/index.html)
  2. Add AngularJS event handlers and scope updates (src/_site/controllers.js)
  3. Adjust CodeMirror editor configuration if needed (src/_site/editor.js)
  4. Update CSS styling for new editor elements (src/_site/style.css)

Add a new aggregation or result transformation

  1. Create test fixture with sample Elasticsearch aggregation response (src/_site/tests/resources/expectedBody4statsAgg.json)
  2. Implement result handler logic in ResultHandlers or custom handler (src/_site/query.js)
  3. Update TablePresenter to format the new result type for display (src/_site/table_presenter.js)
  4. Add Jasmine unit test for the new transformation (src/_site/tests/spec/ResultHandlersSpec.js)

🔧Why these technologies

  • AngularJS 1.x — Lightweight MVC framework for two-way data binding between UI and SQL query state; simple HTTP service for Elasticsearch communication
  • CodeMirror — Feature-rich text editor with syntax highlighting, SQL-aware hints, and linting for improved SQL authoring experience
  • Bootstrap — Responsive CSS framework providing grid layout, components, and mobile-friendly design for the web interface
  • Maven — Build automation and dependency management supporting multiple Elasticsearch versions (6.8, 7.x, 8.x, 9.x) as separate branches
  • Java (backend, not shown) — Elasticsearch plugin backend translates SQL to native ES Query DSL and handles aggregation mapping (inferred from POM and API paths)

⚖️Trade-offs already made

  • Deprecated plugin maintained as legacy support only

    • Why: Official Elasticsearch X-Pack SQL and OpenDistro for Elasticsearch SQL now provide native support; continuation as unmaintained fork reduces maintenance burden
    • Consequence: New features and ES version updates no longer actively developed; users encouraged to migrate to official solutions
  • Single-page application with client-side SQL parsing hints (not full validation)

    • Why: Minimal backend load and immediate UI feedback without round-trip latency; reduces server coupling
    • Consequence: Complex SQL validation deferred to Elasticsearch; users may experience delayed error feedback on malformed queries
  • Stateless HTTP POST to Elasticsearch /_nlpcn/sql endpoint

    • Why: Simplifies deployment (no session state) and aligns with Elasticsearch plugin architecture; no proxy or middleware needed
    • Consequence: Cannot maintain query history or saved queries without external database; multi-page workflows require client-side storage (localStorage)

🚫Non-goals (don't propose these)

  • No authentication or authorization layer (security delegated to Elasticsearch)
  • No persistent query history or saved query storage (client-side browser storage only)
  • No real-time query execution or streaming results (batch-only, poll-based)
  • Not a complete SQL dialect implementation (subset of SQL features compatible with ES DSL mapping)
  • No support for cross-cluster queries or federated Elasticsearch deployments
  • Not actively maintained after project deprecation notice (use official x-pack-sql or OpenDistro instead)

🪤Traps & gotchas

Version-specific branches required: ES 9.3.4 requires the elastic9.3.4 branch, not master. The deprecated path change (7.5.0.0+): use /_nlpcn/sql not /_sql. Plugin installation requires matching plugin classname org.elasticsearch.plugin.nlpcn.SqlPlug in Elasticsearch version. Maven property elasticsearch.version must match your ES cluster version or the plugin will fail to load. No obvious environment variables, but Java 8+ and Maven 3.x+ are required for builds.

🏗️Architecture

💡Concepts to learn

  • SQL to Query DSL compilation — The core function of this plugin is translating SQL SELECT/WHERE/GROUP BY clauses into Elasticsearch Query DSL (JSON)—understanding this mapping is essential to debug why a query doesn't work as expected.
  • Elasticsearch Aggregations — SQL GROUP BY and aggregate functions (SUM, COUNT, AVG) must be compiled to ES aggregations—a contributor needs to understand ES agg API to properly support SQL analytics.
  • Druid SQL Parser — This plugin uses Alibaba Druid's SQL parser (dependency druid 1.2.15) to build an AST—understanding Druid's parser output format is necessary to modify SQL feature support.
  • Elasticsearch Plugin Architecture — The code must implement org.elasticsearch.plugin.nlpcn.SqlPlug (declared in pom.xml) as an ES plugin—contributors need to know how ES plugins register REST endpoints and actions.
  • AngularJS 1.x MVC — The web UI (src/_site/index.html, controllers.js) uses AngularJS 1.x—frontend contributors must understand AngularJS scopes, controllers, and HTTP services.
  • Index Mapping Inference — mapping_parser.js infers SQL field types from ES index mappings—understanding how ES field types (keyword, text, numeric, nested) map to SQL types (VARCHAR, INT, STRUCT) is critical for correct query semantics.
  • elastic/elasticsearch — Official Elasticsearch repository—this plugin extends ES itself; understanding ES Query DSL is essential for understanding sql-to-DSL compilation in this plugin.
  • elastic/elasticsearch-xpack-sql — Official Elasticsearch SQL support (x-pack)—the recommended replacement for this deprecated plugin per the README notice.
  • opendistro-for-elasticsearch/sql — AWS OpenDistro SQL for Elasticsearch—alternative official SQL plugin (Apache 2.0) recommended in README.md as preferred over this deprecated project.
  • alibaba/druid — Alibaba Druid SQL parser (druid 1.2.15 in pom.xml)—the underlying SQL parsing engine that elasticsearch-sql wraps to translate SQL to ES Query DSL.

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add comprehensive unit tests for frontend SQL query parsing and result handling

The repository has JavaScript test files (src/_site/tests/spec/) but only covers ResultHandlers and TablePresenter. The core query building logic in src/_site/query.js and mapping parser in src/_site/mapping_parser.js lack test coverage. Given that this is a deprecated project still used by many, adding tests for SQL query construction and ES response parsing would prevent regressions and help new contributors understand the codebase.

  • [ ] Create unit tests in src/_site/tests/spec/QueryParserSpec.js for src/_site/query.js covering SQL to ES DSL conversion
  • [ ] Create unit tests in src/_site/tests/spec/MappingParserSpec.js for src/_site/mapping_parser.js covering schema mapping logic
  • [ ] Add test resources in src/_site/tests/resources/ for various SQL query types (joins, aggregations, filters) and expected ES query outputs
  • [ ] Update src/_site/tests/SpecRunner.html to include new test suites

Add GitHub Actions workflow for multi-version Elasticsearch compatibility testing

The repo has .travis.yml but the README shows builds for 3 different ES versions (6.8.23, 7.17.28, 8.19.15). The current .github/workflows/actions.yml likely doesn't test all versions. With the project deprecated but still in use, a comprehensive matrix build workflow would catch breaking changes and help contributors understand version compatibility requirements.

  • [ ] Create .github/workflows/multi-version-test.yml with a matrix strategy testing ES 6.8.23, 7.17.28, 8.19.15, and 9.3.4
  • [ ] Configure the workflow to run 'mvn clean test' for each version using the version-specific branches if needed
  • [ ] Add build status badges to README.md for each tested version
  • [ ] Document in BUILDING.md the procedure for testing against different ES versions locally

Document the /_nlpcn/sql endpoint deprecation migration path with examples

The README mentions the breaking change from /_sql to /_nlpcn/sql in version 7.5.0.0 but provides no migration guide or code examples. New contributors and users upgrading from older versions need clear documentation on request/response changes. This would reduce support burden for a deprecated project.

  • [ ] Create doc/migration-guide.md documenting the endpoint change and any request/response format differences between /_sql and /_nlpcn/sql
  • [ ] Add curl/code examples in doc/migration-guide.md showing before/after request syntax for common queries
  • [ ] Update README.md with a 'Migration' section linking to the migration guide for users upgrading from pre-7.5.0.0
  • [ ] Document any other breaking changes across major ES versions (6.x, 7.x, 8.x, 9.x) in the same migration guide

🌿Good first issues

  • Add unit tests for mapping_parser.js (src/_site/mapping_parser.js has no corresponding test file in src/_site/tests/spec/)—mapping parsing is critical for translating ES field types to SQL but lacks test coverage.
  • Document the SQL feature matrix: create a doc/SQL_CAPABILITIES.md listing exactly which SQL features are supported (SELECT, WHERE, GROUP BY, JOIN, UNION, etc.) since README.md lacks this detail despite wiki references.
  • Update the web UI to show query explain plans: src/_site/query.js makes requests to /_nlpcn/sql but there's no UI display for the explain endpoint (/_nlpcn/sql/explain mentioned in README)—add a toggle to show the compiled ES Query DSL.

Top contributors

Click to expand

📝Recent commits

Click to expand
  • f55dbfe — update README.md (shi-yuan)
  • c0b2a61 — es 9.3.4 support (shi-yuan)
  • cc4293c — es 9.3.3 support (shi-yuan)
  • b23b809 — es 9.3.2 support (shi-yuan)
  • 8839376 — es 9.3.1 support (shi-yuan)
  • 11abd61 — update README.md (shi-yuan)
  • 3d5cab3 — es 9.3.0 support (shi-yuan)
  • bc8fec8 — update README.md (shi-yuan)
  • fcb0743 — es 9.2.5 support (shi-yuan)
  • 0e194e6 — es 9.2.4 support (shi-yuan)

🔒Security observations

  • Critical · SQL Injection Risk - SQL Query Execution — src/ (core SQL parsing and execution logic). The project translates SQL queries directly to Elasticsearch queries. Without proper parameterization and input validation, user-supplied SQL could be manipulated to execute unintended Elasticsearch operations or access unauthorized data. Fix: Implement strict input validation, use parameterized queries, implement allowlist-based query validation, and apply principle of least privilege to Elasticsearch credentials.
  • High · XSS Vulnerability in Web UI — src/_site/controllers.js, src/_site/query.js, src/_site/table_presenter.js. The web UI files (controllers.js, query.js, table_presenter.js) process and display query results. Risk of Cross-Site Scripting if user-controlled data or Elasticsearch responses are rendered without proper sanitization. Fix: Use Angular's built-in sanitization, avoid dangerouslySetInnerHTML, use content security policies (CSP), validate and escape all user inputs and Elasticsearch responses before rendering.
  • High · Outdated Dependency - Druid 1.2.15 — pom.xml (druid.version property). Druid 1.2.15 (specified in pom.xml) contains known vulnerabilities. The druid-core library has had multiple security issues including potential SQL injection and deserialization attacks. Fix: Update to the latest stable version of Druid (1.2.23 or newer). Review release notes for security patches and test thoroughly before deployment.
  • High · Outdated Dependency - Guava 32.0.0-jre — pom.xml (guava.version property). Guava 32.0.0-jre is an old version that may contain known vulnerabilities. More recent versions include security improvements. Fix: Update to the latest Guava version (33.0.0 or newer). Review security advisories for any CVEs between current and target versions.
  • High · Deprecated Project - No Active Maintenance — README.md. The README explicitly states the project is deprecated and no longer in active development. This means security vulnerabilities will not receive patches or timely fixes. Fix: Migrate to officially supported alternatives: x-pack-sql (Elastic) or OpenDistro for Elasticsearch SQL (AWS). Do not use this project in production environments.
  • Medium · No Evidence of Authentication/Authorization Controls — API endpoint handlers (path: /_nlpcn/sql). The codebase appears to lack built-in authentication and authorization mechanisms. Direct access to /_nlpcn/sql endpoint could allow unauthorized query execution against Elasticsearch. Fix: Implement API-level authentication (API keys, OAuth2, mTLS), enforce authorization checks, use Elasticsearch security features, and implement rate limiting.
  • Medium · Frontend Dependencies in Vendor Directory — src/_site/vendor/ (AngularJS, Bootstrap, CodeMirror). Third-party frontend libraries (Angular, Bootstrap, CodeMirror) in vendor/ directory may contain vulnerabilities. No evidence of dependency scanning or updates. Fix: Implement automated dependency scanning, update all frontend libraries to latest secure versions, use npm/package manager instead of vendored copies, add supply chain security checks.
  • Medium · Lack of Input Validation Framework — src/ (core application logic). No visible evidence of comprehensive input validation library or framework for sanitizing SQL queries and user inputs. Fix: Implement strict input validation using a validation framework, use whitelisting for allowed query patterns, implement query complexity limits, and add comprehensive logging.
  • Low · Missing Security Configuration Documentation — README.md, BUILDING.md, doc/. README and documentation lack security hardening guidelines, security best practices, or deployment security checklist. Fix: Create comprehensive security documentation covering: secure deployment, authentication setup, network isolation, monitoring, incident response, and security checklist.
  • Low · No Evidence of Security Headers Configuration — undefined. Web UI files lack configuration for security headers (CSP, X-Frame-Options, HSTS, X- Fix: undefined

LLM-derived; treat as a starting point, not a security audit.


Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.

Healthy signals · NLPchina/elasticsearch-sql — RepoPilot