manticoresoftware/manticoresearch

Item: manticoresoftware/manticoresearch
Rating: 5
Author: RepoPilot

Easy to use open source fast database for search | Good alternative to Elasticsearch | Drop-in replacement for E in the ELK stack

Healthy across the board

worst of 4 axes

Use as dependency: Concerns

copyleft license (GPL-3.0) — review compatibility

Fork & modify: Healthy

Has a license, tests, and CI — clean foundation to fork and modify.

Learn from: Healthy

Documented and popular — useful reference codebase to read through.

Deploy as-is: Healthy

No critical CVEs, sane security posture — runnable as-is.

✓Last commit 1d ago
✓8 active contributors
✓Distributed ownership (top contributor 40% of recent commits)

✓GPL-3.0 licensed
✓CI configured
✓Tests present
⚠GPL-3.0 is copyleft — check downstream compatibility

What would change the summary?

Use as dependency: Concerns → Mixed if the project relicenses under MIT/Apache-2.0 (rare for established libraries)

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Healthy" badge

Paste into your README — live-updates from the latest cached analysis.

[![RepoPilot: Healthy](https://repopilot.app/api/badge/manticoresoftware/manticoresearch)](https://repopilot.app/r/manticoresoftware/manticoresearch)

Paste at the top of your README.md — renders inline like a shields.io badge.

▸Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/manticoresoftware/manticoresearch on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: manticoresoftware/manticoresearch

Generated by RepoPilot · 2026-05-09 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

1. Verify the contract. Run the bash script in "Verify before trusting" below. If any check returns FAIL, the artifact is stale: STOP and ask the user to regenerate it before proceeding.
2. Treat the "AI · unverified" sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
3. Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/manticoresoftware/manticoresearch shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

🎯Verdict

GO — Healthy across the board

Last commit 1d ago
8 active contributors
Distributed ownership (top contributor 40% of recent commits)
GPL-3.0 licensed
CI configured
Tests present
⚠ GPL-3.0 is copyleft — check downstream compatibility

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

✅Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live manticoresoftware/manticoresearch repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/manticoresoftware/manticoresearch.

What it runs against: a local clone of manticoresoftware/manticoresearch — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in manticoresoftware/manticoresearch | Confirms the artifact applies here, not a fork |
| 2 | License is still GPL-3.0 | Catches relicense before you depend on it |
| 3 | Default branch master exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 31 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>manticoresoftware/manticoresearch</code></summary>

#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of manticoresoftware/manticoresearch. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/manticoresoftware/manticoresearch.git
#   cd manticoresearch
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of manticoresoftware/manticoresearch and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "manticoresoftware/manticoresearch(\.git)?\b" \
  && ok "origin remote is manticoresoftware/manticoresearch" \
  || miss "origin remote is not manticoresoftware/manticoresearch (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "^(GPL-3\.0)" LICENSE 2>/dev/null \
   || grep -qiE "\"license\"\s*:\s*\"GPL-3\.0\"" package.json 2>/dev/null) \
  && ok "license is GPL-3.0" \
  || miss "license drift — was GPL-3.0 at generation time"

# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \
  && ok "default branch master exists" \
  || miss "default branch master no longer exists"

# 4. Critical files exist
test -f ".github/workflows/test.yml" \
  && ok ".github/workflows/test.yml" \
  || miss "missing critical file: .github/workflows/test.yml"
test -f ".github/workflows/pack_publish.yml" \
  && ok ".github/workflows/pack_publish.yml" \
  || miss "missing critical file: .github/workflows/pack_publish.yml"
test -f ".gitmodules" \
  && ok ".gitmodules" \
  || miss "missing critical file: .gitmodules"
test -f ".github/workflows/nightly_integration.yml" \
  && ok ".github/workflows/nightly_integration.yml" \
  || miss "missing critical file: .github/workflows/nightly_integration.yml"
test -f ".github/workflows/coverage.yml" \
  && ok ".github/workflows/coverage.yml" \
  || miss "missing critical file: .github/workflows/coverage.yml"

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 31 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~1d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/manticoresoftware/manticoresearch"
  exit 1
fi

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).
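
The exit-code contract of the script (0 when every check passes, 1 when at least one claim is stale, 2 when it is run outside a git repository) can drive an agent loop. The following is a minimal sketch, assuming the script was saved as ./verify.sh; the function name next_action and the action labels are illustrative and not part of RepoPilot.

```shell
# Map the verification script's exit status to a next step for an agent loop.
# Exit codes follow the script above: 0 = verified, 1 = stale claims,
# 2 = not inside a git repository. The action labels are hypothetical.
next_action() {
  case "$1" in
    0) echo "proceed" ;;
    1) echo "regenerate" ;;
    2) echo "cd-into-clone" ;;
    *) echo "unknown" ;;
  esac
}

# Usage (assumes the script was saved as ./verify.sh):
#   ./verify.sh; next_action "$?"
```

Keeping the mapping in one function means the loop can branch on a label rather than on raw status numbers.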

</details>

⚡TL;DR

Manticore Search is an open-source full-text search database written primarily in C++, positioned as an alternative to Elasticsearch and a drop-in replacement for the "E" in the ELK stack. It provides fast indexed search, real-time indexing, faceted search, and clustering over the MySQL protocol and HTTP APIs, and targets performance-critical applications that need search without the resource overhead of heavier search infrastructure.

The codebase is monolithic and organized by language layer: a core search engine in C++ with Yacc/Lex grammars for query parsing, an HTTP server layer, a MySQL protocol compatibility shim, and polyglot client bindings (PHP, Ruby, Go, Python, Rust in separate modules). A CMake-based build system (.cmake files) orchestrates compilation. The .github/workflows directory contains the testing and release pipeline, and .clt/ contains a custom testing framework.

👥Who it's for

DevOps engineers and backend developers deploying search infrastructure who want Elasticsearch compatibility with lower resource consumption; database architects evaluating search engines for ELK stack replacement; organizations running cost-sensitive search workloads that can't justify Elasticsearch's resource footprint.

🌱Maturity & risk

Actively maintained and production-ready: the codebase is substantial (~8.8M lines of C++), has comprehensive CI/CD via GitHub Actions workflows (test.yml, clt_tests.yml, nightly_integration.yml, coverage.yml), and includes extensive testing infrastructure (.clt/ directory with checkers and patterns). The repository shows continuous development with multiple nightly test workflows indicating active stabilization work.

Moderate risk: this is a complex C++ codebase where memory management bugs could impact stability, and the MySQL protocol compatibility layer means protocol version mismatches could cause client issues. However, the presence of dedicated memory leak testing (nightly_memleaks.yml) and fuzzing (nightly_fuzzer.yml) mitigates some risk. The large language diversity (8 major languages including C++, PHP, Yacc, Lex) suggests polyglot development that increases onboarding friction.

Active areas of work

Active development across multiple domains: recent workflows include cluster recovery testing (nightly_integration.yml), memory leak detection (nightly_memleaks.yml), and documentation validation (check_docs.yml). The presence of clt_nightly.yml and multiple template workflows (build_template.yml, test_template.yml) suggests ongoing test infrastructure refactoring. Galera cluster support is under active development (pack_publish_galera.yml).

🚀Get running

git clone https://github.com/manticoresoftware/manticoresearch.git
cd manticoresearch
mkdir build && cd build
cmake ..
make
sudo make install
# or via Docker
docker run -it manticoresearch/manticore

Daily commands:

# After building (see "Get running" above):
searchd  # runs the main daemon
# Connect via MySQL:
mysql -h127.0.0.1 -P9306
# or via HTTP:
curl http://127.0.0.1:9308/search
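
The HTTP /search endpoint expects a JSON body describing the query rather than a bare GET. The sketch below builds such a body; it assumes a table called products already exists (the name is made up), that the query_string form is used, and that the table is named with the "table" key (older releases used "index" instead). The helper names are illustrative.

```shell
# Escape a string for use inside a JSON string literal.
# Handles backslashes and double quotes only; enough for simple search terms.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build a JSON body for the HTTP /search endpoint.
# $1 = table name (hypothetical), $2 = full-text query string.
build_search_body() {
  printf '{"table":"%s","query":{"query_string":"%s"}}' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# Usage against a running searchd (not executed here):
#   curl -s http://127.0.0.1:9308/search -d "$(build_search_body products 'fast search')"
```

Building the body in a function keeps quoting mistakes out of the curl command line.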

🗺️Map of the codebase

.github/workflows/test.yml — Primary CI/CD pipeline for running tests across all platforms; defines how code quality and functionality are validated before merge
.github/workflows/pack_publish.yml — Orchestrates binary packaging and distribution; critical for release engineering and deployment to end users
.gitmodules — Defines external dependencies and submodule integrations; essential for understanding the full dependency graph
.github/workflows/nightly_integration.yml — Integration test suite running daily; reveals stability issues and regressions in complex scenarios
.github/workflows/coverage.yml — Code coverage measurement and reporting; tracks test coverage gaps and code quality trends
.cursorignore — Defines which files to exclude from IDE indexing; helps understand large-scale code organization and vendor dependencies

🧩Components & responsibilities

searchd (Daemon Server) (C++, custom network protocol, Galera for replication) — Main search server accepting queries over HTTP and the MySQL protocol; manages indexes, query execution, and cluster coordination
- Failure mode: If searchd crashes, all queries fail; cluster operations stall until node recovery
Index Storage Layer — Real-time memory indexes and file-based plain indexes; handles tokenization, morphology, and

🛠️How to make changes

Add a New Integration Test Case

Create test file in CLT format under .clt/ directory with test name and expected output (.clt/checkers/your_test_name)
Add test pattern matching logic if needed in .clt/patterns/ (.clt/patterns/your_pattern)
Trigger test execution by updating .github/workflows/clt_tests.yml if your test requires special conditions (.github/workflows/clt_tests.yml)

Add a New Release Artifact

Define build matrix and artifact packaging steps in the main release workflow (.github/workflows/pack_publish.yml)
Add platform-specific build logic (e.g., for Windows add Windows runner step) (.github/workflows/test_template.yml)
Update checklist before publishing in release validation workflow (.github/workflows/checklist_validator.yml)

Add Documentation in a New Language

Create translated markdown files following structure in .translation-cache/ (.translation-cache/Changelog.md.json)
Update documentation deployment to include new language builds (.github/workflows/deploy_docs.yml)
Add language validation rules to documentation checker (.github/workflows/check_docs.yml)

Add a New Performance Test

Create nightly performance test workflow for memory, CPU, or latency profiling (.github/workflows/nightly_memleaks.yml)
Set up baseline metrics and regression detection thresholds (.github/workflows/nightly_integration.yml)

🔧Why these technologies

GitHub Actions Workflows — Provides native CI/CD integration without external tools; enables matrix testing across Linux, Windows, and macOS from single configuration
CLT Test Framework — Domain-specific testing for SQL query language and protocol validation; allows non-engineers to write functional tests
Multi-Platform Release Pipeline — Manticore must run on Linux (primary), Windows, and macOS; separate workflows ensure platform-specific optimizations and dependencies
Nightly Integration & Fuzzing — Search database requires exhaustive testing for race conditions, memory safety, and protocol edge cases; nightly runs catch regressions early
Documentation-as-Code with Translations — Multi-language support via translation cache JSON; automated deployment ensures docs stay synchronized with releases

⚖️Trade-offs already made

Separate test.yml and test_template.yml workflows
- Why: Reduces duplication and allows reuse across different CI jobs (e.g., Windows uses its own template)
- Consequence: Contributors must understand workflow inheritance; harder to modify all tests in one place
Git submodules (.gitmodules) rather than package manager vendoring
- Why: Tight coupling with exact commit versions ensures reproducible builds and offline builds
- Consequence: Submodule cloning adds setup complexity; developers must run git submodule update --init
Nightly vs. PR-blocking tests
- Why: Integration, fuzzing, and memcheck are expensive; keeping PR-blocking tests fast prevents developer friction
- Consequence: Bugs may slip into main branch if nightly tests aren't monitored; requires separate bug triage process
Translation cache as JSON snapshots
- Why: Reduces external API calls during builds; cached translations ensure offline documentation generation
- Consequence: Translation updates require manual cache regeneration; stale translations possible if not refreshed regularly
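
The submodule trade-off above means a fresh clone can be missing code until submodules are initialised. As a small sketch, the helper below counts uninitialised entries in `git submodule status` output, where git marks them with a leading "-". The function name is illustrative.

```shell
# Count submodules that have not been initialised.
# stdin: output of `git submodule status`; uninitialised entries start with "-".
count_uninitialized_submodules() {
  # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
  grep -c '^-' || true
}

# Usage inside a clone:
#   git submodule status | count_uninitialized_submodules
#   git submodule update --init --recursive   # fetch anything reported above
```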

🚫Non-goals (don't propose these)

Language-specific client library packaging (focus is on server)
GUI administration console (CLI and HTTP API only)
Real-time analytics dashboards (search-first design)
Multi-tenancy isolation (single-tenant or cluster-wide permissions)
Windows-native installer with GUI (server-only, not a traditional application)

🪤Traps & gotchas

Yacc/Lex regeneration: modifying .y or .l files requires running yacc/lex tools manually; CMake may not auto-regenerate lexer/parser artifacts in all cases.
MySQL protocol quirks: compatibility is best-effort; some edge-case MySQL features may not work identically.
Memory management: C++ codebase requires careful handling of allocation/deallocation; no garbage collection means potential leaks if not tested via nightly_memleaks.yml workflow.
CLT test language: the .clt/ testing framework has custom syntax in checkers/ and patterns/ that's not documented inline.
Build artifacts: compiled binaries should be tested against the same libc/compiler version they'll run on; cross-compilation can fail silently.
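
For the Yacc/Lex gotcha, a quick timestamp comparison can flag a generated parser that is older than its grammar. This is a minimal sketch: the function name and the file names in the usage comment are hypothetical, and it relies on the `-nt` test operator, which bash and most other shells support.

```shell
# Succeed when a generated file needs rebuilding: either it does not exist
# or its source is newer. $1 = grammar source, $2 = generated file.
needs_regeneration() {
  src=$1
  gen=$2
  [ ! -e "$gen" ] || [ "$src" -nt "$gen" ]
}

# Usage (file names are hypothetical):
#   needs_regeneration grammar.y grammar.c && echo "regenerate the parser"
```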

🏗️Architecture

💡Concepts to learn

Inverted Index — Core data structure that Manticore uses for full-text search; understanding how terms map to documents is essential for query optimization and indexing strategy
MySQL Protocol Compatibility Layer — Manticore's unique value prop is speaking MySQL wire protocol; understanding this compatibility layer is crucial for debugging client connection issues and extending protocol support
Yacc/Lex Parsing — Query parsing in Manticore uses traditional compiler construction tools; modifications to SQL syntax or query features require understanding LR parsing and lexical analysis
Real-Time Indexing — Manticore's 'real-time' index type allows index updates without rebuilding; this differentiates it from batch-only search engines and is critical to understand for data consistency
Faceted Search / Facets — Built-in faceting support (aggregations by field) is a core feature differentiating Manticore from simple text search; used in UI filtering and analytics
Galera Replication Cluster — Manticore's clustering uses Galera for synchronous multi-master replication; understanding this is essential for production deployments and failover handling
Memory-Mapped I/O — Manticore uses mmap for efficient file access to large indexes; understanding this pattern explains performance characteristics and memory footprint
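
To make the inverted index concept concrete, here is a toy version built with awk: each term maps to the list of documents that contain it. It is an illustration of the idea only and says nothing about Manticore's actual on-disk format; the function name and input layout are assumptions.

```shell
# Build a toy inverted index.
# stdin:  one document per line as "<doc_id><TAB><text>"
# stdout: one term per line as "<term><TAB><comma-separated doc ids>", sorted by term
build_inverted_index() {
  awk -F '\t' '
    {
      n = split(tolower($2), words, /[^a-z0-9]+/)
      for (i = 1; i <= n; i++) {
        w = words[i]
        if (w == "" || seen[w, $1]++) continue   # skip blanks and repeat hits
        if (w in postings) postings[w] = postings[w] "," $1
        else               postings[w] = $1
      }
    }
    END { for (w in postings) print w "\t" postings[w] }
  ' | LC_ALL=C sort
}

# Usage:
#   printf '1\tFast search engine\n2\tSearch database\n' | build_inverted_index
```

Answering "which documents contain search?" then becomes a single lookup of the line that starts with that term, instead of a scan over every document.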

🔗Related repos

elastic/elasticsearch — Direct competitor and the primary target for drop-in replacement; Manticore aims for API compatibility with it
opensearch-project/opensearch — Elasticsearch fork that emerged after licensing change; represents alternative in the same search engine ecosystem
manticoresoftware/docker — Official Docker images for Manticore Search; required for containerized deployments and testing
manticoresoftware/manticoresearch-php — Dedicated PHP client library separate from this monorepo; most popular language binding for Manticore
sphinxsearch/sphinx — Manticore is a fork of Sphinx Search; understanding Sphinx history and divergence points informs design decisions

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add comprehensive test coverage for CLT checkers

The .clt/checkers directory contains multiple checker modules (contains, ignore, something) but there's no visible test suite for validating their functionality. Given the complexity of pattern matching and validation logic, adding unit tests would prevent regressions and make the checker system more maintainable. This is critical since CLT appears to be used in the checklist_validator.yml workflow.

[ ] Create tests/clt/checkers/ directory structure mirroring .clt/checkers/
[ ] Add unit tests for .clt/checkers/contains logic
[ ] Add unit tests for .clt/checkers/ignore logic
[ ] Add unit tests for .clt/checkers/something logic
[ ] Add test fixtures in tests/clt/patterns/ matching .clt/patterns/
[ ] Integrate test execution into .github/workflows/checklist_validator.yml

Create GitHub Action workflow for translation cache validation

The .translation-cache/ directory contains numerous JSON translation files with a nested structure, but there's no visible CI workflow to validate cache integrity, detect stale translations, or ensure consistency across documentation. This would prevent broken translations in documentation deployments and align with existing workflows like check_docs.yml and deploy_docs.yml.

[ ] Create .github/workflows/validate_translation_cache.yml
[ ] Add schema validation for all .json files in .translation-cache/
[ ] Add checks to ensure translation files correspond to source documentation
[ ] Add validation that nested directory structure in .translation-cache/ matches documentation hierarchy
[ ] Integrate workflow to trigger on changes to .translation-cache/ or documentation files
[ ] Add job to report missing/stale translation files

Add missing test coverage for pattern matching system

The .clt/patterns/ directory is referenced in the file structure but appears to contain pattern definitions used by the CLT checkers. There's likely no dedicated test suite for pattern matching logic. Given that clt_tests.yml and clt_nightly.yml workflows exist, formalized pattern tests would improve reliability of the validation system.

[ ] Create tests/clt/patterns/test_pattern_matching.ts (or appropriate language)
[ ] Add test cases for pattern matching against common documentation scenarios
[ ] Add test cases for edge cases (empty patterns, special characters, unicode)
[ ] Create test fixtures in tests/clt/patterns/fixtures/
[ ] Add performance benchmarks for pattern matching against .translation-cache/ scale
[ ] Integrate pattern tests into .github/workflows/clt_tests.yml

🌿Good first issues

Add unit tests for src/http/ endpoints that currently lack coverage; check coverage.yml results to identify untested paths in the HTTP API layer
Document the CLT framework syntax by extracting comments from .clt/checkers/ and .clt/patterns/ into a TESTING.md file
Implement missing protocol validation: audit src/mysql_protocol/ for SQL keywords that Elasticsearch supports but Manticore doesn't, then add parser tests and documentation of limitations

⭐Top contributors

@klirichek — 40 commits
@tomatolog — 18 commits
@sanikolaev — 16 commits
@githubmanticore — 11 commits
@Nick-S-2018 — 7 commits
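
The "top contributor share" signal can be approximated from commit counts like the ones above. The sketch below takes `git shortlog -sn` style input and prints the largest share as a whole percentage. Fed only the five contributors listed here it gives 43 (40 of 92 commits), slightly above the headline 40%, presumably because the headline figure also counts contributors not shown in this list. The function name and the time window in the usage comment are assumptions.

```shell
# Print the top contributor's share of commits as a whole percentage.
# stdin: `git shortlog -sn` style lines, i.e. "<count><TAB><author>".
top_contributor_share() {
  awk '
    { total += $1; if ($1 > max) max = $1 }
    END { if (total > 0) printf "%d\n", (max * 100) / total; else print 0 }
  '
}

# Usage inside a clone (the 90-day window is an assumption):
#   git shortlog -sn --since="90 days ago" HEAD | top_contributor_share
```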

📝Recent commits

d1fcc38 — strict types conversion from json to mysql (klirichek)
1663b9f — Bump buddy version from 3.46.0 to 3.46.1 (githubmanticore)
2b73b58 — 🆕 Update buddy version from 3.45.0 to 3.46.0 (#4558) (githubmanticore)
2815e24 — 🆕 Update buddy version from 3.44.5 to 3.45.0 (#4555) (githubmanticore)
940d9cc — ci: added clt-test for opensearch-dashboards (#4530) (Nick-S-2018)
d32f766 — Merge pull request #4552 from manticoresoftware/prepared_buddy (klirichek)
35ae5b2 — clt test for prepared fuzzy (klirichek)
d362cc1 — fix: return bool, multi and json values (klirichek)
dbaf3f2 — fix: process prepared stmts with buddy (klirichek)
f32c773 — Bump manticore-load version from 1.24.0 to 1.25.0 (githubmanticore)

🔒Security observations

The Manticore Search repository demonstrates moderate security practices with several positive signals: active use of CI/CD workflows, nightly security testing (fuzzer, integration tests, memory leak detection), and organized GitHub issue templates. However, there are notable gaps: (1) No visible SECURITY.md for vulnerability disclosure, (2) Multiple CI/CD workflows requiring individual security review, (3) Potential sensitive data in .translation-cache, (4) Lack of visible code signing requirements. The codebase appears to be a mature database project but would benefit from: documented security policies, formal disclosure procedures, stricter CI/CD configurations, and explicit dependency management practices. The absence of provided dependency files limits assessment of supply chain security.

Medium · Potential Sensitive Data in Translation Cache — .translation-cache/. The .translation-cache directory contains JSON files that may include sensitive documentation or configuration data. If this cache is committed to version control, it could expose internal documentation or configuration details. Fix: Add .translation-cache/ to .gitignore and ensure sensitive documentation is not included in version-controlled cache files. Consider using environment-specific caching.
Medium · Multiple CI/CD Workflow Files with Potential Security Implications — .github/workflows/. The codebase contains numerous GitHub Actions workflows (.github/workflows/*.yml) that may execute arbitrary code, build binaries, or publish packages. Without reviewing individual workflow configurations, potential risks include: insecure secret handling, privilege escalation in CI/CD, or compromised dependencies during build. Fix: Review all workflow files for: (1) Proper secret handling and environment variable isolation, (2) Pinned action versions to prevent supply chain attacks, (3) Least-privilege permissions for CI/CD jobs, (4) Secrets rotation policies, (5) Approval requirements for publishing/deployment workflows.
Medium · Stale Issue Automation Configuration — .github/stale.yml. The .github/stale.yml file suggests automated issue closure. If misconfigured, this could lead to legitimate security reports being closed without proper review. Fix: Ensure security-related labels are excluded from stale issue automation. Implement manual review requirements for issues labeled as 'security'.
Low · Missing SECURITY.md or Security Policy — Repository root. No SECURITY.md file or security policy documentation is visible in the provided file structure; publishing one is a best practice for open-source projects to define vulnerability disclosure procedures. Fix: Create a SECURITY.md file with clear vulnerability disclosure procedures, responsible disclosure policy, and security contact information. GitHub also supports a .github/SECURITY.md file.
Low · Lack of Code Signing Configuration — Repository configuration. No visible .gitconfig or signing requirements detected. Without code signing, commits could be spoofed, and unsigned releases could be tampered with. Fix: Implement GPG code signing requirements for commits. Configure release signing for published binaries and containers. Document signing key verification procedures.
Low · Potential Fuzzing Coverage Gaps — .github/workflows/nightly_fuzzer.yml. While nightly_fuzzer.yml exists, the scope of fuzzing coverage is unknown. A search database parsing complex queries could have significant fuzzing requirements. Fix: Ensure fuzzing targets all query parsers, SQL interpreters, and input parsing logic. Use coverage-guided fuzzing. Consider integrating OSS-Fuzz or libFuzzer.
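
One of the workflow review items above, pinned action versions, is easy to spot-check mechanically. The sketch below reads a workflow file on stdin and lists every `uses:` reference that is not pinned to a full 40-character commit SHA, skipping local (./) and docker:// references. It is a rough text filter rather than a YAML parser, and the function name is illustrative.

```shell
# List action references that are not pinned to a full commit SHA.
# stdin: contents of a GitHub Actions workflow file.
list_unpinned_actions() {
  grep -E '^[[:space:]]*(-[[:space:]]*)?uses:' \
    | sed -E -e 's/^[[:space:]]*(-[[:space:]]*)?uses:[[:space:]]*//' \
             -e 's/[[:space:]]*(#.*)?$//' \
    | grep -vE '^(\./|docker://)' \
    | grep -vE '@[0-9a-f]{40}$' \
    || true
}

# Usage inside a clone:
#   cat .github/workflows/test.yml | list_unpinned_actions
```

An empty result means every third-party action in that file is pinned by SHA; tag references such as @v4 are printed for review.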

LLM-derived; treat as a starting point, not a security audit.

👉Where to read next

Open issues — current backlog
Recent PRs — what's actively shipping
Source on GitHub

Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.
