manticoresoftware/manticoresearch
Easy to use open source fast database for search | Good alternative to Elasticsearch | Drop-in replacement for E in the ELK stack
Healthy across the board
Worst of 4 axes: copyleft license (GPL-3.0), review compatibility
Has a license, tests, and CI — clean foundation to fork and modify.
Documented and popular — useful reference codebase to read through.
No critical CVEs, sane security posture — runnable as-is.
- ✓Last commit 1d ago
- ✓8 active contributors
- ✓Distributed ownership (top contributor 40% of recent commits)
- ✓GPL-3.0 licensed
- ✓CI configured
- ✓Tests present
- ⚠GPL-3.0 is copyleft — check downstream compatibility
What would change the summary?
- Use as dependency: Concerns → Mixed if relicensed under MIT/Apache-2.0 (rare for established libs)
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Onboarding doc
Onboarding: manticoresoftware/manticoresearch
Generated by RepoPilot · 2026-05-09 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale: STOP and ask the user to regenerate it before proceeding.
- Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/manticoresoftware/manticoresearch shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across the board
- Last commit 1d ago
- 8 active contributors
- Distributed ownership (top contributor 40% of recent commits)
- GPL-3.0 licensed
- CI configured
- Tests present
- ⚠ GPL-3.0 is copyleft — check downstream compatibility
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live manticoresoftware/manticoresearch
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/manticoresoftware/manticoresearch.
What it runs against: a local clone of manticoresoftware/manticoresearch — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in manticoresoftware/manticoresearch | Confirms the artifact applies here, not a fork |
| 2 | License is still GPL-3.0 | Catches relicense before you depend on it |
| 3 | Default branch master exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 31 days ago | Catches sudden abandonment since generation |
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of manticoresoftware/manticoresearch. If you don't
# have one yet, run these first:
#
# git clone https://github.com/manticoresoftware/manticoresearch.git
# cd manticoresearch
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }
# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
echo "FAIL: not inside a git repository. cd into your clone of manticoresoftware/manticoresearch and re-run."
exit 2
fi
# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "manticoresoftware/manticoresearch(\.git)?\b" \
  && ok "origin remote is manticoresoftware/manticoresearch" \
  || miss "origin remote is not manticoresoftware/manticoresearch (artifact may be from a fork)"
# 2. License matches what RepoPilot saw
(grep -qiE "^(GPL-3\.0)" LICENSE 2>/dev/null \
  || grep -qiE "\"license\"\s*:\s*\"GPL-3\.0\"" package.json 2>/dev/null) \
  && ok "license is GPL-3.0" \
  || miss "license drift — was GPL-3.0 at generation time"
# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \
  && ok "default branch master exists" \
  || miss "default branch master no longer exists"
# 4. Critical files exist
test -f ".github/workflows/test.yml" \
  && ok ".github/workflows/test.yml" \
  || miss "missing critical file: .github/workflows/test.yml"
test -f ".github/workflows/pack_publish.yml" \
  && ok ".github/workflows/pack_publish.yml" \
  || miss "missing critical file: .github/workflows/pack_publish.yml"
test -f ".gitmodules" \
  && ok ".gitmodules" \
  || miss "missing critical file: .gitmodules"
test -f ".github/workflows/nightly_integration.yml" \
  && ok ".github/workflows/nightly_integration.yml" \
  || miss "missing critical file: .github/workflows/nightly_integration.yml"
test -f ".github/workflows/coverage.yml" \
  && ok ".github/workflows/coverage.yml" \
  || miss "missing critical file: .github/workflows/coverage.yml"
# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 31 ]; then
ok "last commit was $days_since_last days ago (artifact saw ~1d)"
else
miss "last commit was $days_since_last days ago — artifact may be stale"
fi
echo
if [ "$fail" -eq 0 ]; then
echo "artifact verified (0 failures) — safe to trust"
else
echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/manticoresoftware/manticoresearch"
exit 1
fi
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
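An agent loop may want a compact summary of the verifier output rather than the raw lines. The sketch below relies only on the `ok:` and `FAIL:` prefixes described above; `summarize_verify` is a hypothetical helper, not part of RepoPilot, and it assumes the script was saved as `./verify.sh`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads verifier output on stdin and prints
# "<ok count> <fail count>", keyed only on the "ok:" / "FAIL:" prefixes.
summarize_verify() {
  awk '
    /^ok: /   { ok++ }
    /^FAIL: / { fail++ }
    END       { printf "%d %d\n", ok, fail }
  '
}

# Usage (assumes the script above was saved as ./verify.sh inside the clone):
#   ./verify.sh | summarize_verify
```

The precondition failure (not inside a git repository) also prints a `FAIL:` line, so it is counted like any other failed check.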
⚡TL;DR
Manticore Search is an open-source full-text search database written primarily in C++ that serves as a drop-in replacement for Elasticsearch. It provides fast indexed search, real-time indexing, faceted search, and clustering capabilities via MySQL protocol and HTTP APIs, optimized for performance-critical applications that need search functionality without the overhead of traditional search infrastructure.
Monolithic codebase organized by language layers: core search engine in C++ with Yacc/Lex lexers for query parsing, HTTP server layer, MySQL protocol compatibility shim, and polyglot client bindings (PHP, Ruby, Go, Python, Rust in separate modules). CMake-based build system (.cmake files) orchestrates compilation. The .github/workflows directory contains the testing and release pipeline, while .clt/ contains a custom testing framework.
👥Who it's for
DevOps engineers and backend developers deploying search infrastructure who want Elasticsearch compatibility with lower resource consumption; database architects evaluating search engines for ELK stack replacement; organizations running cost-sensitive search workloads that can't justify Elasticsearch's resource footprint.
🌱Maturity & risk
Actively maintained and production-ready: the codebase is substantial (~8.8M lines of C++), has comprehensive CI/CD via GitHub Actions workflows (test.yml, clt_tests.yml, nightly_integration.yml, coverage.yml), and includes extensive testing infrastructure (.clt/ directory with checkers and patterns). The repository shows continuous development with multiple nightly test workflows indicating active stabilization work.
Moderate risk: this is a complex C++ codebase where memory management bugs could impact stability, and the MySQL protocol compatibility layer means protocol version mismatches could cause client issues. However, the presence of dedicated memory leak testing (nightly_memleaks.yml) and fuzzing (nightly_fuzzer.yml) mitigates some risk. The large language diversity (8 major languages including C++, PHP, Yacc, Lex) suggests polyglot development that increases onboarding friction.
Active areas of work
Active development across multiple domains: recent workflows include cluster recovery testing (nightly_integration.yml), memory leak detection (nightly_memleaks.yml), and documentation validation (check_docs.yml). The presence of clt_nightly.yml and multiple template workflows (build_template.yml, test_template.yml) suggests ongoing test infrastructure refactoring. Galera cluster support is under active development (pack_publish_galera.yml).
🚀Get running
git clone https://github.com/manticoresoftware/manticoresearch.git
cd manticoresearch
mkdir build && cd build
cmake ..
make
sudo make install
# or via Docker
docker run -it manticoresearch/manticore
Daily commands:
# After building (see Get running above):
searchd # runs the main daemon
# Connect via MySQL:
mysql -h127.0.0.1 -P9306
# or via HTTP:
curl http://127.0.0.1:9308/search
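The HTTP endpoint is normally queried with a JSON body rather than a bare GET. The sketch below only builds such a body; it assumes the `/search` endpoint accepts a `table` name and a `query_string` query, and the table name `products` is hypothetical. `search_body` is an illustrative helper, not a Manticore tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a JSON body for a full-text query.
# Arguments: table name, query text. Only backslashes and double quotes
# in the query text are escaped, so this is not a general JSON encoder.
search_body() {
  local table=$1 text=$2
  text=${text//\\/\\\\}   # escape backslashes first
  text=${text//\"/\\\"}   # then escape double quotes
  printf '{"table":"%s","query":{"query_string":"%s"}}\n' "$table" "$text"
}

# Usage against a local daemon on the HTTP port shown above:
#   curl -s http://127.0.0.1:9308/search -d "$(search_body products 'red shoes')"
```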
🗺️Map of the codebase
- .github/workflows/test.yml: Primary CI/CD pipeline for running tests across all platforms; defines how code quality and functionality are validated before merge
- .github/workflows/pack_publish.yml: Orchestrates binary packaging and distribution; critical for release engineering and deployment to end users
- .gitmodules: Defines external dependencies and submodule integrations; essential for understanding the full dependency graph
- .github/workflows/nightly_integration.yml: Integration test suite running daily; reveals stability issues and regressions in complex scenarios
- .github/workflows/coverage.yml: Code coverage measurement and reporting; tracks test coverage gaps and code quality trends
- .cursorignore: Defines which files to exclude from IDE indexing; helps understand large-scale code organization and vendor dependencies
🧩Components & responsibilities
- searchd (Daemon Server) (C++, custom network protocol, Galera for replication) — Main search server accepting queries over HTTP and MySQL protocol; manages indexes, executions, and cluster coordination
- Failure mode: If searchd crashes, all queries fail; cluster operations stall until node recovery
- Index Storage Layer — Real-time memory indexes and file-based plain indexes; handles tokenization, morphology, and
🛠️How to make changes
Add a New Integration Test Case
- Create test file in CLT format under the .clt/ directory with test name and expected output (.clt/checkers/your_test_name)
- Add test pattern matching logic if needed in .clt/patterns/ (.clt/patterns/your_pattern)
- Trigger test execution by updating .github/workflows/clt_tests.yml if your test requires special conditions (.github/workflows/clt_tests.yml)
Add a New Release Artifact
- Define build matrix and artifact packaging steps in the main release workflow (.github/workflows/pack_publish.yml)
- Add platform-specific build logic (e.g., for Windows add Windows runner step) (.github/workflows/test_template.yml)
- Update checklist before publishing in release validation workflow (.github/workflows/checklist_validator.yml)
Add Documentation in a New Language
- Create translated markdown files following structure in .translation-cache/ (.translation-cache/Changelog.md.json)
- Update documentation deployment to include new language builds (.github/workflows/deploy_docs.yml)
- Add language validation rules to documentation checker (.github/workflows/check_docs.yml)
Add a New Performance Test
- Create nightly performance test workflow for memory, CPU, or latency profiling (.github/workflows/nightly_memleaks.yml)
- Set up baseline metrics and regression detection thresholds (.github/workflows/nightly_integration.yml)
🔧Why these technologies
- GitHub Actions Workflows — Provides native CI/CD integration without external tools; enables matrix testing across Linux, Windows, and macOS from single configuration
- CLT (Command Line Tester) Framework — Domain-specific testing for SQL query language and protocol validation; allows non-engineers to write functional tests
- Multi-Platform Release Pipeline — Manticore must run on Linux (primary), Windows, and macOS; separate workflows ensure platform-specific optimizations and dependencies
- Nightly Integration & Fuzzing — Search database requires exhaustive testing for race conditions, memory safety, and protocol edge cases; nightly runs catch regressions early
- Documentation-as-Code with Translations — Multi-language support via translation cache JSON; automated deployment ensures docs stay synchronized with releases
⚖️Trade-offs already made
- Separate test.yml and test_template.yml workflows
  - Why: Reduces duplication and allows reuse across different CI jobs (e.g., Windows uses its own template)
  - Consequence: Contributors must understand workflow inheritance; harder to modify all tests in one place
- Git submodules (.gitmodules) rather than package manager vendoring
  - Why: Tight coupling with exact commit versions ensures reproducible builds and offline builds
  - Consequence: Submodule cloning adds setup complexity; developers must run git submodule update --init
- Nightly vs. PR-blocking tests
  - Why: Integration, fuzzing, and memcheck are expensive; keeping PR-blocking tests fast prevents developer friction
  - Consequence: Bugs may slip into main branch if nightly tests aren't monitored; requires separate bug triage process
- Translation cache as JSON snapshots
  - Why: Reduces external API calls during builds; cached translations ensure offline documentation generation
  - Consequence: Translation updates require manual cache regeneration; stale translations possible if not refreshed regularly
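For the submodule trade-off in particular, a quick way to spot a clone that skipped initialization is to look at `git submodule status`, which prefixes uninitialized submodules with `-`. A minimal sketch, with `uninitialized_submodules` as a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Reads `git submodule status` output on stdin and prints the paths of
# submodules that are not initialized (status lines starting with "-").
# Paths containing spaces are not handled in this sketch.
uninitialized_submodules() {
  awk 'substr($0, 1, 1) == "-" { print $2 }'
}

# Usage inside the clone:
#   git submodule status | uninitialized_submodules
# Any path printed here still needs: git submodule update --init
```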
🚫Non-goals (don't propose these)
- Language-specific client library packaging (focus is on server)
- GUI administration console (CLI and HTTP API only)
- Real-time analytics dashboards (search-first design)
- Multi-tenancy isolation (single-tenant or cluster-wide permissions)
- Windows-native installer with GUI (server-only, not a traditional application)
🪤Traps & gotchas
- Yacc/Lex regeneration: modifying .y or .l files requires running yacc/lex tools manually; CMake may not auto-regenerate lexer/parser artifacts in all cases.
- MySQL protocol quirks: compatibility is best-effort; some edge-case MySQL features may not work identically.
- Memory management: C++ codebase requires careful handling of allocation/deallocation; no garbage collection means potential leaks if not tested via nightly_memleaks.yml workflow.
- CLT test language: the .clt/ testing framework has custom syntax in checkers/ and patterns/ that's not documented inline.
- Build artifacts: compiled binaries should be tested against the same libc/compiler version they'll run on; cross-compilation can fail silently.
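To catch the Yacc/Lex trap early, a change set can be screened for grammar sources before building. The sketch below filters a list of paths for `.y` and `.l` files; `grammar_sources` is a hypothetical helper, and it says nothing about how the project's CMake setup actually regenerates parsers.

```shell
#!/usr/bin/env bash
# Reads file paths on stdin (one per line) and prints only Yacc (.y) and
# Lex (.l) sources, i.e. files whose generated parser or lexer may need
# to be rebuilt by hand. Prints nothing when there are none.
grammar_sources() {
  grep -E '\.(y|l)$' || true
}

# Usage: list grammar files touched by the last commit.
#   git diff --name-only HEAD~1 | grammar_sources
```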
🏗️Architecture
💡Concepts to learn
- Inverted Index — Core data structure that Manticore uses for full-text search; understanding how terms map to documents is essential for query optimization and indexing strategy
- MySQL Protocol Compatibility Layer — Manticore's unique value prop is speaking MySQL wire protocol; understanding this compatibility layer is crucial for debugging client connection issues and extending protocol support
- Yacc/Lex Parsing — Query parsing in Manticore uses traditional compiler construction tools; modifications to SQL syntax or query features require understanding LR parsing and lexical analysis
- Real-Time Indexing — Manticore's 'real-time' index type allows index updates without rebuilding; this differentiates it from batch-only search engines and is critical to understand for data consistency
- Faceted Search / Facets — Built-in faceting support (aggregations by field) is a core feature differentiating Manticore from simple text search; used in UI filtering and analytics
- Galera Replication Cluster — Manticore's clustering uses Galera for synchronous multi-master replication; understanding this is essential for production deployments and failover handling
- Memory-Mapped I/O — Manticore uses mmap for efficient file access to large indexes; understanding this pattern explains performance characteristics and memory footprint
🔗Related repos
- elastic/elasticsearch: Direct competitor and the primary target for drop-in replacement; Manticore aims for API compatibility with this
- opensearch-project/opensearch: Elasticsearch fork that emerged after licensing change; represents alternative in the same search engine ecosystem
- manticoresoftware/docker: Official Docker images for Manticore Search; required for containerized deployments and testing
- manticoresoftware/manticoresearch-php: Dedicated PHP client library separate from this monorepo; most popular language binding for Manticore
- sphinxsearch/sphinx: Manticore is a fork of Sphinx Search; understanding Sphinx history and divergence points informs design decisions
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add comprehensive test coverage for CLT (Command Line Tester) checkers
The .clt/checkers directory contains multiple checker modules (contains, ignore, something) but there's no visible test suite for validating their functionality. Given the complexity of pattern matching and validation logic, adding unit tests would prevent regressions and make the checker system more maintainable. This is critical since CLT appears to be used in the checklist_validator.yml workflow.
- [ ] Create tests/clt/checkers/ directory structure mirroring .clt/checkers/
- [ ] Add unit tests for .clt/checkers/contains logic
- [ ] Add unit tests for .clt/checkers/ignore logic
- [ ] Add unit tests for .clt/checkers/something logic
- [ ] Add test fixtures in tests/clt/patterns/ matching .clt/patterns/
- [ ] Integrate test execution into .github/workflows/checklist_validator.yml
Create GitHub Action workflow for translation cache validation
The .translation-cache/ directory contains numerous JSON translation files with a nested structure, but there's no visible CI workflow to validate cache integrity, detect stale translations, or ensure consistency across documentation. This would prevent broken translations in documentation deployments and align with existing workflows like check_docs.yml and deploy_docs.yml.
- [ ] Create .github/workflows/validate_translation_cache.yml
- [ ] Add schema validation for all .json files in .translation-cache/
- [ ] Add checks to ensure translation files correspond to source documentation
- [ ] Add validation that nested directory structure in .translation-cache/ matches documentation hierarchy
- [ ] Integrate workflow to trigger on changes to .translation-cache/ or documentation files
- [ ] Add job to report missing/stale translation files
Add missing test coverage for pattern matching system
The .clt/patterns/ directory is referenced in the file structure but appears to contain pattern definitions used by the CLT checkers. There's likely no dedicated test suite for pattern matching logic. Given that clt_tests.yml and clt_nightly.yml workflows exist, formalized pattern tests would improve reliability of the validation system.
- [ ] Create tests/clt/patterns/test_pattern_matching.ts (or appropriate language)
- [ ] Add test cases for pattern matching against common documentation scenarios
- [ ] Add test cases for edge cases (empty patterns, special characters, unicode)
- [ ] Create test fixtures in tests/clt/patterns/fixtures/
- [ ] Add performance benchmarks for pattern matching against .translation-cache/ scale
- [ ] Integrate pattern tests into .github/workflows/clt_tests.yml
🌿Good first issues
- Add unit tests for src/http/ endpoints that currently lack coverage; check coverage.yml results to identify untested paths in the HTTP API layer
- Document the CLT (Command Line Tester) framework syntax by extracting comments from .clt/checkers/ and .clt/patterns/ into a TESTING.md file
- Implement missing protocol validation: audit src/mysql_protocol/ for SQL keywords that Elasticsearch supports but Manticore doesn't, then add parser tests and documentation of limitations
⭐Top contributors
- @klirichek — 40 commits
- @tomatolog — 18 commits
- @sanikolaev — 16 commits
- @githubmanticore — 11 commits
- @Nick-S-2018 — 7 commits
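The ownership share quoted in the verdict can be approximated from commit counts like these. The sketch below reads `git shortlog -sn` style lines and prints the top contributor's share as a whole percentage; `top_share` is a hypothetical helper, and the time window RepoPilot uses for "recent commits" is not stated, so the `--since` value in the usage line is an assumption.

```shell
#!/usr/bin/env bash
# Reads "<count> <author>" lines (as printed by `git shortlog -sn`) on
# stdin and prints the largest count as a truncated percentage of the total.
top_share() {
  awk '
    { total += $1; if ($1 > max) max = $1 }
    END { if (total > 0) printf "%d\n", (100 * max) / total }
  '
}

# Usage inside the clone (the 90-day window is an assumed value):
#   git shortlog -sn --since="90 days ago" HEAD | top_share
```

Fed only the five counts listed above (40, 18, 16, 11, 7), this prints 43, because commits from contributors outside the top five are not included in the total.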
📝Recent commits
- d1fcc38: strict types conversion from json to mysql (klirichek)
- 1663b9f: Bump buddy version from 3.46.0 to 3.46.1 (githubmanticore)
- 2b73b58: 🆕 Update buddy version from 3.45.0 to 3.46.0 (#4558) (githubmanticore)
- 2815e24: 🆕 Update buddy version from 3.44.5 to 3.45.0 (#4555) (githubmanticore)
- 940d9cc: ci: added clt-test for opensearch-dashboards (#4530) (Nick-S-2018)
- d32f766: Merge pull request #4552 from manticoresoftware/prepared_buddy (klirichek)
- 35ae5b2: clt test for prepared fuzzy (klirichek)
- d362cc1: fix: return bool, multi and json values (klirichek)
- dbaf3f2: fix: process prepared stmts with buddy (klirichek)
- f32c773: Bump manticore-load version from 1.24.0 to 1.25.0 (githubmanticore)
🔒Security observations
The Manticore Search repository demonstrates moderate security practices with several positive signals: active use of CI/CD workflows, nightly security testing (fuzzer, integration tests, memory leak detection), and organized GitHub issue templates. However, there are notable gaps: (1) No visible SECURITY.md for vulnerability disclosure, (2) Multiple CI/CD workflows requiring individual security review, (3) Potential sensitive data in .translation-cache, (4) Lack of visible code signing requirements. The codebase appears to be a mature database project but would benefit from: documented security policies, formal disclosure procedures, stricter CI/CD configurations, and explicit dependency management practices. The absence of provided dependency files limits assessment of supply chain security.
- Medium · Potential Sensitive Data in Translation Cache (.translation-cache/). The .translation-cache directory contains JSON files that may include sensitive documentation or configuration data. If this cache is committed to version control, it could expose internal documentation or configuration details. Fix: Add .translation-cache/ to .gitignore and ensure sensitive documentation is not included in version-controlled cache files. Consider using environment-specific caching.
- Medium · Multiple CI/CD Workflow Files with Potential Security Implications (.github/workflows/). The codebase contains numerous GitHub Actions workflows (.github/workflows/*.yml) that may execute arbitrary code, build binaries, or publish packages. Without reviewing individual workflow configurations, potential risks include: insecure secret handling, privilege escalation in CI/CD, or compromised dependencies during build. Fix: Review all workflow files for: (1) Proper secret handling and environment variable isolation, (2) Pinned action versions to prevent supply chain attacks, (3) Least-privilege permissions for CI/CD jobs, (4) Secrets rotation policies, (5) Approval requirements for publishing/deployment workflows.
- Medium · Stale Issue Automation Configuration (.github/stale.yml). The .github/stale.yml file suggests automated issue closure. If misconfigured, this could lead to legitimate security reports being closed without proper review. Fix: Ensure security-related labels are excluded from stale issue automation. Implement manual review requirements for issues labeled as 'security'.
- Low · Missing Security.md or Security Policy (Repository root). No visible SECURITY.md file or security policy documentation in the provided file structure, which is a best practice for open-source projects to define vulnerability disclosure procedures. Fix: Create a SECURITY.md file with clear vulnerability disclosure procedures, responsible disclosure policy, and security contact information. GitHub also supports a .github/SECURITY.md file.
- Low · Lack of Code Signing Configuration (Repository configuration). No visible .gitconfig or signing requirements detected. Without code signing, commits could be spoofed, and unsigned releases could be tampered with. Fix: Implement GPG code signing requirements for commits. Configure release signing for published binaries and containers. Document signing key verification procedures.
- Low · Potential Fuzzing Coverage Gaps (.github/workflows/nightly_fuzzer.yml). While nightly_fuzzer.yml exists, the scope of fuzzing coverage is unknown. A search database parsing complex queries could have significant fuzzing requirements. Fix: Ensure fuzzing targets all query parsers, SQL interpreters, and input parsing logic. Use coverage-guided fuzzing. Consider integrating OSS-Fuzz or libFuzzer.
LLM-derived; treat as a starting point, not a security audit.
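One of the workflow review items above, pinned action versions, can be screened mechanically. The sketch below prints every `uses:` reference that is not pinned to a full 40-character commit SHA; `unpinned_actions` is a hypothetical helper, it does a line-based scan rather than parsing YAML, and it ignores quoted references.

```shell
#!/usr/bin/env bash
# Reads workflow YAML on stdin and prints each `uses:` reference that is
# not pinned to a full 40-character commit SHA. Local actions (./path)
# are skipped. Trailing "# comment" text after the reference is dropped.
unpinned_actions() {
  grep -E '^[[:space:]]*(-[[:space:]]*)?uses:' \
    | sed -E 's/^[[:space:]]*(-[[:space:]]*)?uses:[[:space:]]*//; s/[[:space:]]*(#.*)?$//' \
    | grep -vE '^\./' \
    | grep -vE '@[0-9a-f]{40}$' || true
}

# Usage inside the clone:
#   cat .github/workflows/*.yml | unpinned_actions | sort -u
```

An empty result means every scanned reference is SHA-pinned or local; it does not cover the other review items such as secret handling or job permissions.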
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.