RepoPilot

zilliztech/claude-context

Code search MCP for Claude Code. Make entire codebase the context for any coding agent.

Healthy — healthy across the board

  • Use as dependency — Healthy (weakest axis). Permissive license, no critical CVEs, actively maintained — safe to depend on.
  • Fork & modify — Healthy. Has a license, tests, and CI — clean foundation to fork and modify.
  • Learn from — Healthy. Documented and popular — useful reference codebase to read through.
  • Deploy as-is — Healthy. No critical CVEs, sane security posture — runnable as-is.

  • Last commit today
  • 20 active contributors
  • MIT licensed
  • CI configured
  • Tests present
  • Concentrated ownership — top contributor handles 58% of recent commits

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the “Healthy” badge

Paste into your README — live-updates from the latest cached analysis.

[![RepoPilot: Healthy](https://repopilot.app/api/badge/zilliztech/claude-context)](https://repopilot.app/r/zilliztech/claude-context)

Paste at the top of your README.md — renders inline like a shields.io badge.

Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/zilliztech/claude-context on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: zilliztech/claude-context

Generated by RepoPilot · 2026-05-06 · Source

Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

  1. Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
  2. Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
  3. Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/zilliztech/claude-context shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

Verdict

GO — Healthy across the board

  • Last commit today
  • 20 active contributors
  • MIT licensed
  • CI configured
  • Tests present
  • ⚠ Concentrated ownership — top contributor handles 58% of recent commits

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live zilliztech/claude-context repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/zilliztech/claude-context.

What it runs against: a local clone of zilliztech/claude-context — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in zilliztech/claude-context | Confirms the artifact applies here, not a fork |
| 2 | License is still MIT | Catches relicense before you depend on it |
| 3 | Default branch master exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 30 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>zilliztech/claude-context</code></summary>

```bash
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of zilliztech/claude-context. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/zilliztech/claude-context.git
#   cd claude-context
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of zilliztech/claude-context and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "zilliztech/claude-context(\.git)?\b" \
  && ok "origin remote is zilliztech/claude-context" \
  || miss "origin remote is not zilliztech/claude-context (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "^(MIT)" LICENSE 2>/dev/null \
   || grep -qiE "\"license\"\s*:\s*\"MIT\"" package.json 2>/dev/null) \
  && ok "license is MIT" \
  || miss "license drift — was MIT at generation time"

# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \
  && ok "default branch master exists" \
  || miss "default branch master no longer exists"

# 4. Critical files exist
test -f "packages/core/README.md" \
  && ok "packages/core/README.md" \
  || miss "missing critical file: packages/core/README.md"
test -f "packages/chrome-extension/src/background.ts" \
  && ok "packages/chrome-extension/src/background.ts" \
  || miss "missing critical file: packages/chrome-extension/src/background.ts"
test -f "evaluation/retrieval/custom.py" \
  && ok "evaluation/retrieval/custom.py" \
  || miss "missing critical file: evaluation/retrieval/custom.py"
test -f "docs/dive-deep/asynchronous-indexing-workflow.md" \
  && ok "docs/dive-deep/asynchronous-indexing-workflow.md" \
  || miss "missing critical file: docs/dive-deep/asynchronous-indexing-workflow.md"
test -f "examples/basic-usage/index.ts" \
  && ok "examples/basic-usage/index.ts" \
  || miss "missing critical file: examples/basic-usage/index.ts"

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 30 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~0d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/zilliztech/claude-context"
  exit 1
fi
```

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>

TL;DR

Claude Context is an MCP (Model Context Protocol) plugin that integrates semantic code search into Claude Code and other AI coding agents, allowing them to efficiently search and retrieve relevant code snippets from massive codebases (millions of lines) using vector-database embeddings stored in Zilliz Cloud. It solves the problem of expensive and inefficient context windows by replacing brute-force directory loading with targeted semantic search, dramatically reducing token costs for large-codebase interactions.

  • Monorepo with at least two main packages: @zilliz/claude-context-core (the main semantic search engine, 535K TypeScript) and @zilliz/claude-context-mcp (MCP server wrapper).
  • Supporting JavaScript/Python implementations (~83K JS, ~103K Python) for multi-language support.
  • Documentation in docs/ with deep-dives on asynchronous indexing and file-inclusion rules.
  • Evaluation suite in evaluation/ with real case studies (Django, xarray).

Who it's for

AI coding agents (primarily Claude Code) and their users working on large enterprise codebases who need intelligent context retrieval without token bloat. Also relevant to development teams wanting to give AI assistants deep codebase understanding without manual navigation.

Maturity & risk

Actively developed and production-ready: published to npm as @zilliz/claude-context-core and @zilliz/claude-context-mcp, listed on VS Code Marketplace, has structured CI/CD pipelines (.github/workflows/ci.yml, release.yml), comprehensive documentation in docs/, and evaluation benchmarks. Recent activity shown through case studies in evaluation/case_study/ demonstrating real-world usage.

Moderate risk: depends on external Zilliz Cloud vector database (single external dependency point), TypeScript-heavy monorepo with multiple packages requiring coordination, requires API keys and environment setup (.env.example pattern suggests config complexity). However, MIT licensed, has example code patterns, and includes troubleshooting guides (docs/troubleshooting/) mitigating adoption friction.

Active areas of work

Active development visible through case study evaluation work (evaluation/case_study/ with django and xarray projects), recent focus on MCP efficiency analysis (assets/mcp_efficiency_analysis_chart.png), integration with VS Code extension marketplace, and companion project mentions (memsearch plugin for persistent memory). GitHub Actions workflows indicate ongoing CI/release automation.

Get running

git clone https://github.com/zilliztech/claude-context.git
cd claude-context
npm install
npm run dev

Then copy .env.example to .env and configure Zilliz Cloud API credentials (see docs/getting-started/environment-variables.md).
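A minimal sketch of what that `.env` typically carries — `MILVUS_URI` and `MILVUS_TOKEN` appear elsewhere in this document, but the other names and all values here are illustrative; the authoritative list is in `docs/getting-started/environment-variables.md`:

```
# .env — sketch only; see docs/getting-started/environment-variables.md for the real variable list
MILVUS_URI=https://<your-cluster>.zillizcloud.com   # Zilliz Cloud endpoint
MILVUS_TOKEN=<your-zilliz-api-key>                  # cluster API token
OPENAI_API_KEY=<your-openai-key>                    # embedding-provider key (illustrative)
```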

Daily commands:

npm run dev

for development with file watching (tsx --watch). Or build with npm run build and run with Node. Specific to MCP server: configure in Claude Code settings to use @zilliz/claude-context-mcp entry point.
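One common way to register an MCP server with Claude Code is a JSON `mcpServers` block; the sketch below assumes that shape, so confirm it against Claude Code's own MCP documentation — only the package name `@zilliz/claude-context-mcp` comes from this document:

```json
{
  "mcpServers": {
    "claude-context": {
      "command": "npx",
      "args": ["-y", "@zilliz/claude-context-mcp@latest"],
      "env": {
        "MILVUS_TOKEN": "<your-zilliz-api-key>"
      }
    }
  }
}
```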

Map of the codebase

  • packages/core/README.md — Core package entry point; documents the main SDK for code indexing and semantic search that all other packages depend on
  • packages/chrome-extension/src/background.ts — Chrome extension background service worker; handles communication between content script and Milvus vector database for the browser integration
  • evaluation/retrieval/custom.py — Custom retrieval implementation used in evaluation; demonstrates how the MCP server integrates with Claude for semantic code search
  • docs/dive-deep/asynchronous-indexing-workflow.md — Technical deep-dive on the async indexing pipeline; critical for understanding how files are processed and vectorized in the background
  • examples/basic-usage/index.ts — Reference example showing how to use the core SDK; essential for understanding the public API and integration patterns
  • .github/workflows/ci.yml — CI/CD pipeline definition; shows build, test, and release automation for all packages
  • evaluation/run_evaluation.py — Evaluation harness that benchmarks MCP efficiency; demonstrates real-world usage patterns and performance metrics

How to make changes

Add a new MCP retrieval tool

  1. Create new server class extending base.py retrieval interface (evaluation/retrieval/base.py)
  2. Implement search logic (e.g., keyword matching, semantic filtering) (evaluation/servers/grep_server.py)
  3. Register tool in evaluation client and run_evaluation.py comparison harness (evaluation/run_evaluation.py)
  4. Add benchmark case study results to evaluation/case_study/ (evaluation/case_study/django_14170/README.md)

Index a new repository with semantic search

  1. Set MILVUS_URI, MILVUS_TOKEN, and repository path in environment (docs/getting-started/environment-variables.md)
  2. Initialize SDK, define file inclusion rules, and call indexing API (examples/basic-usage/index.ts)
  3. Monitor async indexing progress via workflow lifecycle hooks (docs/dive-deep/asynchronous-indexing-workflow.md)
  4. Query indexed files using semantic search or keyword fallback (evaluation/retrieval/custom.py)

Extend Chrome extension with new search feature

  1. Define new API method in chromeMilvusAdapter for backend communication (packages/chrome-extension/src/milvus/chromeMilvusAdapter.ts)
  2. Update background service worker to handle new message type (packages/chrome-extension/src/background.ts)
  3. Add UI component and event handlers in content script (packages/chrome-extension/src/content.ts)
  4. Persist search history or settings using indexedRepoManager (packages/chrome-extension/src/storage/indexedRepoManager.ts)
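The background worker in step 2 is essentially a message router: each message `type` maps to a handler. A browser-free sketch of that pattern — the `SEARCH_HISTORY` type and handler body are invented for illustration; the real worker wires this through `chrome.runtime.onMessage`:

```typescript
// Minimal message-router sketch for a background service worker.
// In the real extension, this dispatch would sit inside chrome.runtime.onMessage.
type Message = { type: string; payload?: unknown };
type Handler = (payload: unknown) => unknown;

const handlers = new Map<string, Handler>();

function register(type: string, handler: Handler): void {
  handlers.set(type, handler);
}

function dispatch(msg: Message): unknown {
  const handler = handlers.get(msg.type);
  if (!handler) throw new Error(`unhandled message type: ${msg.type}`);
  return handler(msg.payload);
}

// Adding a "new search feature" is then just one more registration:
register("SEARCH_HISTORY", (payload) => {
  const query = payload as string;
  return { query, results: [] }; // a real handler would call the Milvus adapter
});
```

The point of the registry shape is that step 2 becomes additive: a new feature registers one more handler instead of growing a switch statement.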

Add a new evaluation metric or benchmark case

  1. Create evaluation scenario JSON with test queries and expected results (evaluation/generate_subset_json.py)
  2. Define metric computation logic in analyze_and_plot_mcp_efficiency.py (evaluation/analyze_and_plot_mcp_efficiency.py)
  3. Add case study README documenting hypothesis, methodology, and results (evaluation/case_study/pydata_xarray_6938/README.md)
  4. Update run_evaluation.py to include new metric in final report (evaluation/run_evaluation.py)
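A retrieval benchmark like the one in step 2 usually reduces to a handful of metrics over (expected, retrieved) pairs; recall@k is the most common. An illustrative computation — not the repo's actual metric code:

```typescript
// recall@k: fraction of expected files that appear in the top-k retrieved results.
function recallAtK(expected: string[], retrieved: string[], k: number): number {
  if (expected.length === 0) return 1; // nothing to find counts as perfect recall
  const topK = new Set(retrieved.slice(0, k));
  const hits = expected.filter((f) => topK.has(f)).length;
  return hits / expected.length;
}
```

For example, `recallAtK(["a.py", "b.py"], ["b.py", "c.py", "a.py"], 2)` is 0.5: only `b.py` lands in the top 2.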

Why these technologies

  • Milvus Vector Database — Scalable, open-source vector search for semantic code embeddings; enables sub-second nearest-neighbor queries on large codebases
  • Model Context Protocol (MCP) — Standardized interface for Claude to invoke custom tools; decouples code search logic from LLM provider
  • TypeScript + Node.js — Type-safe monorepo structure with shared core; enables both server-side indexing and browser-based extensions
  • Chrome Extension — In-browser UI with IndexedDB persistence; allows offline access to indexed repositories without server round-trips
  • Python Evaluation Framework — Benchmarks MCP efficiency vs traditional grep using real case studies; validates semantic search ROI on multi-file queries

Trade-offs already made

  • Async background indexing vs synchronous on-demand embedding

    • Why: Large codebases (10k+ files) would block if vectorized synchronously; background jobs enable non-blocking user experience
    • Consequence: Index freshness lag (~seconds) acceptable for most workflows; requires eventual-consistency handling in query results
  • Milvus Cloud dependency vs self-hosted option

    • Why: Cloud reduces DevOps burden; lowers barrier to entry for small teams
    • Consequence: Vendor lock-in risk; requires API key management and network connectivity; evaluation includes self-hosted fallback (grep)
  • File-level vs line-level semantic chunking

    • Why: File-level simplicity reduces embedding cost and indexing complexity
    • Consequence: Large files (>1000 LOC) may dilute search relevance; grep fallback provides precision for keyword-exact matches
  • Browser extension with IndexedDB vs server-side session storage

    • Why: IndexedDB enables offline search and reduces API calls; user data stays local
    • Consequence: Storage quota limited (~50MB per domain); requires re-indexing for new repo versions; no cross-device sync
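The first trade-off above — background indexing with an index-freshness lag — can be modelled in a few lines: writes land in a pending queue, and queries only see what a worker has already flushed. This is an illustrative toy, not the project's pipeline (that lives in docs/dive-deep/asynchronous-indexing-workflow.md):

```typescript
// Toy model of async indexing: search() sees only flushed documents,
// so results lag behind enqueue() until the background worker drains the queue.
class ToyIndex {
  private pending: string[] = [];
  private indexed = new Set<string>();

  enqueue(file: string): void {
    this.pending.push(file); // non-blocking: the caller returns immediately
  }

  drain(): void {
    // stands in for the background worker's embedding + upsert step
    for (const f of this.pending) this.indexed.add(f);
    this.pending = [];
  }

  search(term: string): string[] {
    return [...this.indexed].filter((f) => f.includes(term));
  }
}
```

Until `drain()` runs, a freshly enqueued file is invisible to `search()` — exactly the eventual-consistency window callers of the real system must tolerate.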

Non-goals (don't propose these)

  • Real-time file change detection and incremental indexing (batch async indexing only)
  • Multi-language code understanding (generic semantic embeddings, no AST-based analysis)
  • Built-in authentication/authorization (assumes Claude or extension context already authenticated)
  • IDE native integration beyond VS Code extension (Chrome extension scoped to web only)
  • Streaming large result sets (evaluation uses batch retrieval, single round-trip to Milvus)

Traps & gotchas

  1. Zilliz Cloud account and API key are mandatory — a local vector DB is not supported; sign-up is required before local development.
  2. Environment variable configuration (.env) must match the Zilliz Cloud cluster setup exactly; misconfiguration fails silently.
  3. File inclusion rules use complex glob/regex patterns (file-inclusion-rules.md) — misconfiguration can cause massive over-indexing or exclude needed files.
  4. Asynchronous indexing means a freshness lag between file edits and search results — users must understand eventual consistency.
  5. The TypeScript monorepo uses the workspace: protocol (*) in package.json — npm versions <7 may struggle; Node 20+ strongly recommended.
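The glob gotcha is easy to probe locally: a pattern like `src/**/*.ts` reduces to a regex, and a wrong pattern silently widens or narrows the indexed set. A simplified matcher for sanity-checking patterns — not the repo's actual rule engine, and unlike full globstar, its `**` does not also match zero directories:

```typescript
// Simplified glob matcher: supports "**" (any depth), "*" (within one segment), "?".
// Illustrative only — use it to sanity-check patterns, not as the real rule engine.
function globToRegExp(glob: string): RegExp {
  let re = "";
  for (let i = 0; i < glob.length; i++) {
    const c = glob.charAt(i);
    if (c === "*") {
      if (glob.charAt(i + 1) === "*") {
        re += ".*";    // "**" crosses directory boundaries
        i++;
      } else {
        re += "[^/]*"; // "*" stays within one path segment
      }
    } else if (c === "?") {
      re += "[^/]";
    } else {
      re += c.replace(/[.+^${}()|[\]\\]/g, "\\$&"); // escape regex metacharacters
    }
  }
  return new RegExp(`^${re}$`);
}
```

Note how `*.ts` matches `a.ts` but not `src/a.ts` — the single star never crosses a `/`, which is exactly the kind of scope surprise gotcha 3 warns about.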

Architecture

Concepts to learn

  • Vector Database (Semantic Search) — Core enabling technology—Claude Context stores code as embeddings in Zilliz Cloud and retrieves semantically similar snippets rather than keyword matching, making search context-aware
  • Model Context Protocol (MCP) — Integration mechanism—MCP allows Claude Context to plug into Claude Code as a standardized tool, enabling AI agents to invoke semantic search as a primitive
  • Asynchronous Indexing — Performance pattern—codebase changes are indexed in the background without blocking the IDE, but creates eventual consistency; understanding this tradeoff is critical for user expectations
  • Glob Pattern Matching — File filtering mechanism—file-inclusion-rules.md shows that code search scope is controlled via glob/regex patterns; incorrect patterns cause indexing failures or scope leaks
  • Token Optimization — Cost driver—reduces Claude API costs by ~70% (claimed in README) by avoiding full-directory context and sending only relevant code; central motivation for the project
  • Embedding Models — Vectorization strategy—code snippets are converted to embeddings for semantic similarity comparison; quality of embeddings determines search relevance
  • Monorepo Workspace Management — Development structure—uses npm workspace: protocol to share code between core and MCP packages; misconfigured workspaces cause version conflicts
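The first two concepts — embeddings plus nearest-neighbour search — fit in a toy example: represent each snippet as a vector and rank by cosine similarity. In claude-context the vectors come from an embedding model and live in Milvus; the hand-made 3-d vectors and file paths here are purely illustrative:

```typescript
// Toy semantic search: cosine similarity over hand-made vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Pretend these 3-d vectors encode the "topic" of each snippet (illustrative values).
const index: Array<{ path: string; vec: number[] }> = [
  { path: "src/db/insert.ts",  vec: [0.9, 0.1, 0.0] },
  { path: "src/ui/button.tsx", vec: [0.0, 0.2, 0.9] },
  { path: "src/db/query.ts",   vec: [0.8, 0.3, 0.1] },
];

function search(queryVec: number[], k: number): string[] {
  return [...index]
    .sort((x, y) => cosine(queryVec, y.vec) - cosine(queryVec, x.vec))
    .slice(0, k)
    .map((e) => e.path);
}
```

A query vector near the "database" direction surfaces the two `src/db/` files first — the same ranking-by-similarity behaviour that makes semantic search context-aware rather than keyword-bound.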

Related repos

  • zilliztech/memsearch — Companion project mentioned in README for persistent memory in Claude Code—provides long-term memory layer on top of claude-context
  • anthropics/python-sdk — Official Anthropic Python SDK for Claude integration—needed for building Python-based tools that work with Claude Context MCP
  • zilliztech/milvus — Open-source vector database that powers Zilliz Cloud—understanding Milvus query patterns helps optimize claude-context search performance
  • pytorch/pytorch — Underlying deep learning framework for embedding generation in semantic search—relevant for understanding how code is vectorized
  • karpathy/llm.c — Educational repo on LLM internals and tokenization—helps understand why token-cost reduction matters for large model context

PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add comprehensive unit tests for evaluation/retrieval modules

The evaluation framework contains base.py, custom.py, and retrieval logic for benchmarking the MCP server, but there are no visible test files in the repo. This is critical for a tool that claims to improve code search quality. New contributors can create pytest-based tests to validate retrieval accuracy, ranking, and edge cases before evaluation runs.

  • [ ] Create evaluation/tests/ directory with `__init__.py`
  • [ ] Add evaluation/tests/test_retrieval_base.py covering base.py retrieval methods
  • [ ] Add evaluation/tests/test_custom_retrieval.py for custom.py edge cases (file filtering, ranking)
  • [ ] Add evaluation/tests/test_file_management.py validating utils/file_management.py subset generation
  • [ ] Update evaluation/pyproject.toml to include pytest and pytest-cov as dev dependencies
  • [ ] Add CI workflow step in .github/workflows/ci.yml to run evaluation tests

Create TypeScript integration tests for the core MCP server

While the repo has examples and evaluation tooling, there are no visible TypeScript/integration tests verifying the MCP protocol implementation works end-to-end. The core package (@zilliz/claude-context-core) is referenced as a workspace dependency but lacks test coverage. This is essential for a production MCP tool.

  • [ ] Create src/tests/ directory (or tests/ at root level)
  • [ ] Add src/tests/mcp-server.test.ts testing tool registration and invocation
  • [ ] Add src/tests/file-indexing.test.ts validating indexing workflow from docs/dive-deep/indexing-sequence-diagram.png
  • [ ] Add src/tests/file-inclusion-rules.test.ts covering rules defined in docs/dive-deep/file-inclusion-rules.md
  • [ ] Update package.json with test script: 'jest' and add jest + @types/jest as devDependencies
  • [ ] Add test output to .gitignore (coverage/, .jest-cache/)

Document and add validation for environment variable configuration

The repo has docs/getting-started/environment-variables.md and .env.example, but there's no runtime validation or type-safe configuration loading. New contributors can add a configuration schema/validator that ensures required vars are set before the MCP server starts, improving the onboarding experience and reducing setup errors.

  • [ ] Create src/config.ts (or src/utils/config.ts) with zod or joi schema for .env validation
  • [ ] Add validation for all required variables mentioned in docs/getting-started/environment-variables.md
  • [ ] Add meaningful error messages that reference the documentation when validation fails
  • [ ] Update src/index.ts (or entry point) to call config validation on startup
  • [ ] Add src/tests/config.test.ts with test cases for missing/invalid variables
  • [ ] Update CONTRIBUTING.md with a section on environment setup requirements
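The PR idea proposes zod or joi; a dependency-free sketch of the same fail-fast shape — `MILVUS_URI` and `MILVUS_TOKEN` are named elsewhere in this document, but the full required set lives in `docs/getting-started/environment-variables.md`:

```typescript
// Dependency-free config-validation sketch. The PR idea suggests zod/joi; this
// hand-rolled version just shows the fail-fast-with-a-docs-pointer behaviour.
const REQUIRED = ["MILVUS_URI", "MILVUS_TOKEN"] as const;

interface Config {
  MILVUS_URI: string;
  MILVUS_TOKEN: string;
}

function loadConfig(env: Record<string, string | undefined>): Config {
  const missing = REQUIRED.filter((k) => !env[k] || env[k]!.trim() === "");
  if (missing.length > 0) {
    // Meaningful error that references the documentation, per the checklist above.
    throw new Error(
      `Missing required env var(s): ${missing.join(", ")}. ` +
      `See docs/getting-started/environment-variables.md.`
    );
  }
  return { MILVUS_URI: env.MILVUS_URI!, MILVUS_TOKEN: env.MILVUS_TOKEN! };
}
```

Calling `loadConfig(process.env)` at server startup turns a silent misconfiguration into an immediate, documented failure.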

Good first issues

  • Add comprehensive unit tests for file-inclusion rule parser (docs/dive-deep/file-inclusion-rules.md describes the logic, but test coverage gaps visible in absence of /test directory)—would validate glob/regex edge cases
  • Expand troubleshooting guide (docs/troubleshooting/troubleshooting-guide.md exists but is sparse) with common Zilliz Cloud connection errors and timeout scenarios, matching patterns seen in evaluation/case_study/ real-world runs
  • Create minimal Python example in examples/ alongside the basic TypeScript example—repo shows 102K Python codebase but no Python client example exists, despite @zilliz/claude-context-mcp potentially supporting multi-language

Top contributors

Recent commits

  • 1e6aae3 — chore: release 0.1.13 (zc277584121)
  • 62323f4 — feat(mcp): configure background sync polling (#314) (txhno)
  • 291863a — fix(mcp): cancel background indexing on clear_index (#199) (#369) (voidborne-d)
  • c93138b — feat(core): add Gemini Embedding 2 support (#366) (Showiix)
  • f794b8c — chore: release 0.1.12 (zc277584121)
  • 0f7a82e — feat(core): add Solidity file support (#367) (zc277584121)
  • 747ada5 — docs: fix Cherry Studio npx arguments (#368) (zc277584121)
  • 0d558ff — docs: fix Cherry Studio npx arguments (Showiix)
  • cdad7ab — feat(core): add Solidity file support (Showiix)
  • ead19f4 — fix(mcp,core): honor request-scoped splitter option (#363) (Showiix)

Security observations

The codebase demonstrates a moderately secure posture with good overall structure and documentation. Primary concerns include: (1) reliance on environment variable security without visible enforcement mechanisms, (2) flexible dependency versioning that could introduce vulnerabilities, and (3) incomplete visibility into critical configuration files (.npmrc, full .env.example). The project lacks visible security scanning tools (npm audit integration), secrets management mechanisms, and explicit security documentation. Recommendations: implement automated dependency scanning (Dependabot/Snyk), add pre-commit hooks for .env protection, pin dependency versions, and document security best practices for contributors.

  • Medium · Sensitive Configuration in .env.example — .env.example. The .env.example file contains template values for sensitive credentials including API keys for OpenAI, VoyageAI, Gemini, and Ollama. While this is an example file, it demonstrates the structure of secrets that developers must protect. The presence of detailed comments about credential placement increases the risk of accidental commits of populated .env files. Fix: Ensure .env files are properly listed in .gitignore. Add pre-commit hooks to prevent accidental commits of .env files with actual credentials. Consider using a secrets management tool like HashiCorp Vault or AWS Secrets Manager for production deployments.
  • Medium · Dependency Version Flexibility in package.json — examples/basic-usage/package.json. The example package.json uses caret (^) version specifiers for dependencies like tsx (^4.0.0), typescript (^5.0.0), and dotenv (^16.0.0). This allows automatic updates to minor and patch versions, which could introduce security vulnerabilities or breaking changes without explicit review. Fix: Use exact version pinning (remove ^) for production dependencies, or implement a strict dependency review process with automated security scanning (e.g., npm audit, Snyk, Dependabot). Regularly audit and update dependencies consciously.
  • Medium · Workspace Dependencies Without Version Constraints — examples/basic-usage/package.json. The dependency '@zilliz/claude-context-core' uses 'workspace:*' which creates an implicit dependency on the internal workspace package. This could mask version conflicts or allow breaking changes to propagate without awareness. Fix: Document the workspace dependency model clearly. Implement CI/CD checks to validate workspace dependency compatibility. Consider explicit version constraints or semantic versioning for workspace packages.
  • Low · Missing Security-Related npm Configuration — .npmrc. The .npmrc file exists but its contents are not shown. Security-related configurations such as registry authentication, audit settings, and integrity verification are not visible. Fix: Configure .npmrc to enable: 'audit=true', 'audit-level=moderate', proper registry authentication with secure token storage (not hardcoded), and strict SSL verification settings.
  • Low · Incomplete .env.example Documentation — .env.example. The .env.example file is truncated in the provided context ('# OpenAI Configurat'). This incomplete documentation could lead to misconfiguration of security-sensitive settings. Fix: Complete and validate all environment variable documentation. Include security best practices for each variable. Add comments about required vs. optional fields and secure defaults.
  • Low · TypeScript Build Configuration Exposure — examples/basic-usage/package.json. The build script uses generic TypeScript compilation without explicit security-focused tsconfig settings. Missing strict mode and type-checking configurations could allow unsafe patterns. Fix: Create a tsconfig.json with 'strict: true', 'noImplicitAny: true', and other security-relevant compiler options. Verify all build outputs don't expose sensitive data or debugging information in production builds.

LLM-derived; treat as a starting point, not a security audit.

Where to read next


Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.

Healthy signals · zilliztech/claude-context — RepoPilot