SilasMarvin/lsp-ai
LSP-AI is an open-source language server that serves as a backend for AI-powered functionality, designed to assist and empower software engineers, not replace them.
Healthy across all four use cases
Permissive license, no critical CVEs, actively maintained — safe to depend on.
Has a license, tests, and CI — clean foundation to fork and modify.
Documented and popular — useful reference codebase to read through.
No critical CVEs, sane security posture — runnable as-is.
- ✓ 5 active contributors
- ✓ MIT licensed
- ✓ CI configured
- ✓ Tests present
- ⚠ Stale — last commit 1y ago
- ⚠ Concentrated ownership — top contributor handles 79% of recent commits
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Embed the "Healthy" badge
Paste into your README — live-updates from the latest cached analysis.
[](https://repopilot.app/r/silasmarvin/lsp-ai)

Paste at the top of your README.md — renders inline like a shields.io badge.
Preview social card (1200×630)
This card auto-renders when someone shares https://repopilot.app/r/silasmarvin/lsp-ai on X, Slack, or LinkedIn.
Onboarding doc
Onboarding: SilasMarvin/lsp-ai
Generated by RepoPilot · 2026-05-09 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in "Verify before trusting" below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
- Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/SilasMarvin/lsp-ai shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across all four use cases
- 5 active contributors
- MIT licensed
- CI configured
- Tests present
- ⚠ Stale — last commit 1y ago
- ⚠ Concentrated ownership — top contributor handles 79% of recent commits
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live SilasMarvin/lsp-ai repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/SilasMarvin/lsp-ai.

What it runs against: a local clone of SilasMarvin/lsp-ai — the script inspects the git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in SilasMarvin/lsp-ai | Confirms the artifact applies here, not a fork |
| 2 | License is still MIT | Catches relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 515 days ago | Catches sudden abandonment since generation |
```bash
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of SilasMarvin/lsp-ai. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/SilasMarvin/lsp-ai.git
#   cd lsp-ai
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok()   { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of SilasMarvin/lsp-ai and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "SilasMarvin/lsp-ai(\.git)?\b" \
  && ok "origin remote is SilasMarvin/lsp-ai" \
  || miss "origin remote is not SilasMarvin/lsp-ai (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "^(MIT)" LICENSE 2>/dev/null \
  || grep -qiE "\"license\"\s*:\s*\"MIT\"" package.json 2>/dev/null) \
  && ok "license is MIT" \
  || miss "license drift — was MIT at generation time"

# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
  && ok "default branch main exists" \
  || miss "default branch main no longer exists"

# 4. Critical files exist
for f in \
  "crates/lsp-ai/src/main.rs" \
  "crates/lsp-ai/src/config.rs" \
  "crates/lsp-ai/src/transformer_worker.rs" \
  "crates/lsp-ai/src/memory_worker.rs" \
  "crates/lsp-ai/src/custom_requests/mod.rs"; do
  test -f "$f" && ok "$f" || miss "missing critical file: $f"
done

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 515 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~485d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/SilasMarvin/lsp-ai"
  exit 1
fi
```
Each check prints `ok:` or `FAIL:`. The script exits non-zero if anything failed, so it composes cleanly into agent loops (`./verify.sh || regenerate-and-retry`).
⚡TL;DR
LSP-AI is a Language Server Protocol backend written in Rust that integrates AI-powered features (chat, code completions, custom actions) into any editor supporting LSP (VS Code, NeoVim, Emacs, Helix, Sublime). It serves as a self-hosted alternative to GitHub Copilot, supporting multiple LLM backends (OpenAI, Anthropic, Ollama, Mistral, Gemini, llama.cpp) and includes vector database integration for semantic code search.

Monorepo with a single crate at crates/lsp-ai/src/, organized by concern: transformer_backends/ (LLM providers: anthropic.rs, openai, ollama.rs, gemini.rs, llama_cpp/, mistral_fim.rs), memory_backends/ (vector stores: postgresml/, file_store.rs), embedding_models/ (ollama.rs), custom_requests/ (LSP handlers), splitters/ (code chunking: tree_sitter.rs, text_splitter.rs), plus main.rs as entrypoint and config.rs for TOML configuration.
👥Who it's for
Software engineers and programming teams who want AI-assisted development (completions, in-editor chat, refactoring) without vendor lock-in; users preferring self-hosted or local models over cloud-only solutions; editor enthusiasts using NeoVim or Emacs who lack native AI tooling.
🌱Maturity & risk
Production-ready but feature-complete: the maintainer states 'all the features I want for it' are implemented, and development is no longer adding new features. Single-crate Rust project with CI/CD workflows (GitHub Actions) and tagged releases, indicating stability. Daily usage by multiple users suggests reliability.
Standard open source risks apply.
Active areas of work
Project is in maintenance mode per the README. No specific recent changes are visible from the file structure alone, though a release workflow exists (.github/workflows/release.yml). The maintenance signals above place the last commit around a year ago; check GitHub Actions and the commit history for current status. Likely focused on dependency updates and bug fixes rather than new feature development.
🚀Get running
```bash
git clone https://github.com/SilasMarvin/lsp-ai.git
cd lsp-ai
cargo build --release
cargo run --release
```
Binary will be available at target/release/lsp-ai. Configure via TOML (see Cargo.toml and config.rs for options).
Daily commands:
```bash
cargo build --release
cargo run --release
```
Server listens on stdio by default (LSP mode). Editor client (VS Code extension, NeoVim plugin) connects and sends LSP requests. Configuration via ~/.config/lsp-ai/config.toml or project-level config (see config.rs for schema).
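For orientation, a hypothetical config fragment is sketched below. The authoritative schema is whatever config.rs deserializes, so verify every key name there before using — all keys and values here are assumptions, not verified against the crate:

```toml
# Hypothetical sketch only — check crates/lsp-ai/src/config.rs for the
# real schema; section and key names below are assumptions.

[memory.file_store]
# file-based vector store; the simpler alternative to postgresml

[models.model1]
type = "open_ai"
chat_endpoint = "https://api.openai.com/v1/chat/completions"
model = "gpt-4o-mini"
auth_token_env_var_name = "OPENAI_API_KEY"
```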
🗺️Map of the codebase
- crates/lsp-ai/src/main.rs — Entry point and LSP server initialization; all contributors must understand the server lifecycle and message routing.
- crates/lsp-ai/src/config.rs — Configuration schema and parsing; critical for understanding how users configure backends, models, and features.
- crates/lsp-ai/src/transformer_worker.rs — Core AI inference orchestration; handles routing to LLM backends (OpenAI, Ollama, Anthropic, etc.).
- crates/lsp-ai/src/memory_worker.rs — Vector database and embedding workflow; manages document indexing and retrieval for context.
- crates/lsp-ai/src/custom_requests/mod.rs — LSP custom request definitions; defines the contract between editors and the language server.
- crates/lsp-ai/src/transformer_backends/mod.rs — Abstraction layer for multiple LLM providers; essential for understanding backend extensibility.
- Cargo.toml — Workspace and dependency management; defines the architecture of crates and external integrations.
🛠️How to make changes
Add a new LLM backend provider
- Create a new file in crates/lsp-ai/src/transformer_backends/ (e.g., my_provider.rs) implementing the TransformerBackend trait (crates/lsp-ai/src/transformer_backends/mod.rs)
- Add the provider variant to the TransformerBackend enum match in mod.rs and add its module declaration (crates/lsp-ai/src/transformer_backends/mod.rs)
- Extend the config.rs schema with provider-specific settings (API keys, endpoints, parameters) (crates/lsp-ai/src/config.rs)
- Update transformer_worker.rs to instantiate the new backend based on config selection (crates/lsp-ai/src/transformer_worker.rs)
- Add example config files in examples/ (helix, nvim, etc.) demonstrating the new provider (examples/helix/openai-chat-code-completion.toml)
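The trait-implementation step can be sketched as follows. This is a simplified, synchronous stand-in for illustration only — the real TransformerBackend trait in transformer_backends/mod.rs is async and takes richer request/response types, and the trait shape, method names, and provider below are assumptions, not the crate's actual API:

```rust
// Simplified sketch of a pluggable LLM backend. Names are illustrative,
// not LSP-AI's real API; the real trait is async.
trait TransformerBackend {
    fn name(&self) -> &str;
    fn do_completion(&self, prompt: &str) -> Result<String, String>;
}

// Hypothetical new provider.
struct MyProvider {
    api_key: String,
}

impl TransformerBackend for MyProvider {
    fn name(&self) -> &str {
        "my_provider"
    }

    fn do_completion(&self, prompt: &str) -> Result<String, String> {
        if self.api_key.is_empty() {
            return Err("missing API key".into());
        }
        // A real implementation would call the provider's HTTP API here.
        Ok(format!("completion for: {prompt}"))
    }
}

fn main() {
    // Trait objects let the worker route to any configured backend.
    let backend: Box<dyn TransformerBackend> = Box::new(MyProvider {
        api_key: "sk-test".into(),
    });
    println!("{} -> {:?}", backend.name(), backend.do_completion("fn add("));
}
```

The point of the sketch is the dispatch shape: the worker holds a `Box<dyn TransformerBackend>` chosen from config, so adding a provider means one new trait impl plus one enum/config arm, not changes to the routing code.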
Add a new custom LSP request type
- Define the request struct with params and result types in crates/lsp-ai/src/custom_requests/ (crates/lsp-ai/src/custom_requests/mod.rs)
- Implement the request handler in main.rs and route it through the message dispatch (crates/lsp-ai/src/main.rs)
- Update the VSCode extension to send the new request and handle responses (editors/vscode/src/index.ts)
- Add integration tests demonstrating the new request/response flow (crates/lsp-ai/tests/integration_tests.rs)
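Conceptually, a custom LSP request pairs a method string with typed params and result. A std-only sketch of that contract — LSP-AI's real definitions use serde and lsp-types, and the method name and structs below are hypothetical:

```rust
// Hypothetical param/result types for a custom request. Field and method
// names are assumptions, not LSP-AI's actual wire format.
struct GenerateParams {
    text_document_uri: String,
    line: u32,
    character: u32,
}

struct GenerateResult {
    generated_text: String,
}

// Each custom request binds an LSP method string to its types, so the
// dispatcher in main.rs can route by method name with type safety.
trait CustomRequest {
    const METHOD: &'static str;
    type Params;
    type Result;
}

struct Generate;

impl CustomRequest for Generate {
    const METHOD: &'static str = "textDocument/generation"; // hypothetical
    type Params = GenerateParams;
    type Result = GenerateResult;
}

fn main() {
    // The editor client must send exactly this method string.
    println!("custom request method: {}", Generate::METHOD);
}
```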
Add support for a new embedding model provider
- Create a new file in crates/lsp-ai/src/embedding_models/ implementing the EmbeddingModel trait (crates/lsp-ai/src/embedding_models/mod.rs)
- Extend config.rs to accept the new embedding provider selection and parameters (crates/lsp-ai/src/config.rs)
- Update memory_worker.rs to instantiate the correct embedding model based on config (crates/lsp-ai/src/memory_worker.rs)
- Add example configs showing the new embedding provider alongside vector store options (examples/helix/anthropic-in-editor-chatting.toml)
Add support for a new vector database backend
- Implement the VectorStore trait in crates/lsp-ai/src/memory_backends/ (crates/lsp-ai/src/memory_backends/mod.rs)
- Add configuration schema for the new backend (connection strings, credentials) in config.rs (crates/lsp-ai/src/config.rs)
- Update memory_worker.rs to initialize the correct vector store based on config (crates/lsp-ai/src/memory_worker.rs)
- Create or update example configs demonstrating the new vector store (examples/helix/llama-cpp-fim-code-completion.toml)
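The steps above hinge on one trait implementation. Below is a simplified, synchronous sketch of the pluggable vector-store shape — the real code in memory_backends/ is async and richer, and every name here is an assumption for illustration:

```rust
use std::collections::HashMap;

// Hypothetical vector-store trait; names are assumptions, not LSP-AI's API.
trait VectorStore {
    fn insert(&mut self, id: String, embedding: Vec<f32>);
    fn search(&self, query: &[f32], k: usize) -> Vec<(String, f32)>;
}

// Toy in-memory backend, analogous in role to file_store.rs.
struct InMemoryStore {
    rows: HashMap<String, Vec<f32>>,
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

impl VectorStore for InMemoryStore {
    fn insert(&mut self, id: String, embedding: Vec<f32>) {
        self.rows.insert(id, embedding);
    }

    fn search(&self, query: &[f32], k: usize) -> Vec<(String, f32)> {
        // Score every stored row by cosine similarity, keep the top k.
        let mut scored: Vec<(String, f32)> = self
            .rows
            .iter()
            .map(|(id, e)| (id.clone(), cosine(query, e)))
            .collect();
        scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
        scored.truncate(k);
        scored
    }
}

fn main() {
    let mut store = InMemoryStore { rows: HashMap::new() };
    store.insert("a.rs".into(), vec![1.0, 0.0]);
    store.insert("b.rs".into(), vec![0.0, 1.0]);
    let top = store.search(&[0.9, 0.1], 1);
    println!("nearest: {} (score {:.3})", top[0].0, top[0].1);
}
```

A new backend (e.g. a different database) slots in by implementing the same two operations; memory_worker.rs then only needs a config arm to construct it.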
🔧Why these technologies
- Rust + Tokio — High-performance, safe concurrent server handling multiple editor clients; efficient async I/O for AI model calls
- Language Server Protocol (LSP) — Standardized, editor-agnostic interface; enables single server backend to support VSCode, Neovim, Helix, and other LSP-capable editors
- Multiple LLM backends (OpenAI, Ollama, Anthropic, Llama.cpp) — Flexibility; users can choose cloud-hosted or local models based on privacy/latency/cost preferences
- Tree-sitter — Syntax-aware document splitting; preserves code structure for better context chunking vs. naive text splitting
- PostgresML + file-based vector stores — Dual support for lightweight (file-based) and database-backed (PostgresML) semantic retrieval
🪤Traps & gotchas
- LLM API keys: code expects environment variables (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.) or config file entries — missing keys silently fail requests.
- PostgresML backend: requires a running PostgreSQL instance with the pgvector extension; file_store.rs is a simpler fallback but doesn't scale.
- Tree-sitter grammar files: tree_sitter.rs depends on language-specific grammars; adding language support requires embedding grammar binaries.
- Streaming response handling: generation_stream.rs uses Server-Sent Events; some editors may not support streaming requests properly.
- Local model inference (llama.cpp): requires compiled ggml binaries; cross-platform compilation (macOS/Linux/Windows) has quirks in the build script.
- Rate limiting: no built-in rate limiting on LLM calls; token counting for billing is backend-specific.
🏗️Architecture
💡Concepts to learn
- Language Server Protocol (LSP) — LSP-AI's entire architecture is built on LSP—understanding the protocol (request/response lifecycle, custom requests, capabilities) is essential to extending or debugging the server
- Vector Embeddings & Semantic Search — LSP-AI uses embeddings (via Ollama or pgvector) to find relevant code snippets for context injection into LLM prompts; underpins the code search and in-editor chat features
- Streaming Response Handling (Server-Sent Events) — generation_stream.rs implements SSE-style streaming for real-time chat and completion output; critical for responsive UX and understanding latency-hiding patterns
- Abstract Syntax Tree (AST) & Tree-Sitter Parsing — splitters/tree_sitter.rs uses AST-based code splitting instead of naive line-splitting; essential for respecting language semantics and generating coherent code chunks
- Async/Await & Tokio Runtime — transformer_worker.rs and memory_worker.rs use Rust async patterns with tokio; understanding task spawning, channel communication, and blocking I/O is critical for debugging latency or deadlocks
- Trait Objects & Pluggable Backends — transformer_backends/ and memory_backends/ use Rust trait pattern for polymorphism; adding new LLM providers or vector stores requires implementing these traits, not copy-pasting
- Prompt Templating & Context Injection — template.rs likely handles dynamic prompt construction with code snippets, chat history, and user input—key to prompt quality and token budgeting for LLM calls
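As a concrete illustration of the templating concept — not template.rs's actual API; the function name and placeholder syntax below are assumptions — context injection boils down to substituting code snippets and user input into a prompt skeleton:

```rust
// Hypothetical placeholder substitution, illustrating context injection.
// template.rs may use a different syntax entirely.
fn render(template: &str, vars: &[(&str, &str)]) -> String {
    let mut out = template.to_string();
    for (key, value) in vars {
        // Replace every "{key}" occurrence with its value.
        out = out.replace(&format!("{{{key}}}"), value);
    }
    out
}

fn main() {
    let prompt = render(
        "Context:\n{context}\n\nUser request:\n{selected_text}",
        &[
            ("context", "fn add(a: i32, b: i32) -> i32 { a + b }"),
            ("selected_text", "document this function"),
        ],
    );
    println!("{prompt}");
}
```

Token budgeting then becomes a question of how much retrieved context to pour into `{context}` before the prompt exceeds the model's window.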
🔗Related repos
- github/copilot-cli — Closed-source GitHub Copilot; the primary commercial alternative LSP-AI competes against as a self-hosted solution
- tabbyml/tabby — Open-source AI code completion server with a self-hosted focus; similar value proposition, different architecture (Python-first, built on Open-LLM)
- ollama/ollama — Local LLM runtime that LSP-AI integrates with via transformer_backends/ollama.rs; powers local model inference used by LSP-AI
- pgvector/pgvector — PostgreSQL extension for vector search that LSP-AI uses in memory_backends/postgresml/ for semantic code retrieval
- tree-sitter/tree-sitter — Parser generator library that LSP-AI relies on in splitters/tree_sitter.rs for syntax-aware code chunking and semantic understanding
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add integration tests for transformer_backends (Ollama, OpenAI, Anthropic, Gemini)
The crates/lsp-ai/tests/integration_tests.rs file exists but is likely minimal. The transformer_backends directory has multiple backend implementations (ollama.rs, open_ai/mod.rs, anthropic.rs, gemini.rs) that deserve dedicated integration tests. This ensures each backend correctly handles API calls, streaming responses, and error cases. Given that these are critical user-facing features, comprehensive tests would catch regressions early.
- [ ] Review existing crates/lsp-ai/tests/integration_tests.rs to understand current test structure
- [ ] Create mock/stub implementations or use test fixtures for each backend in crates/lsp-ai/src/transformer_backends/
- [ ] Write integration tests covering: successful inference, streaming generation, API error handling, and malformed responses for each backend (ollama, open_ai, anthropic, gemini)
- [ ] Add CI step to .github/workflows/release.yml to run integration tests (with appropriate feature flags to avoid requiring real API keys)
Add unit tests for embedding_models backends (Ollama embeddings)
The crates/lsp-ai/src/embedding_models/ directory contains ollama.rs and vector_store.rs, but there are no corresponding tests visible in the file structure. Embedding models are foundational for the memory/RAG system. Unit tests should cover embedding generation, dimension validation, vector storage operations, and error handling when the Ollama service is unavailable or returns invalid embeddings.
- [ ] Create crates/lsp-ai/src/embedding_models/tests/ or add #[cfg(test)] modules to ollama.rs
- [ ] Write tests for: successful embedding generation, batch embedding, empty input handling, dimension mismatch errors, and Ollama connection failures
- [ ] Test vector_store.rs operations: insert, search, delete, and edge cases like searching with zero results
- [ ] Ensure tests can run without requiring a running Ollama instance (use mocks/stubs)
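The "no running Ollama instance" requirement can be met by putting the network call behind a trait and substituting a mock in tests. A minimal sketch — trait and type names are assumptions, not LSP-AI's real embedding API:

```rust
// Hypothetical abstraction over the embedding service, so tests never
// touch the network. Names are illustrative only.
trait EmbeddingClient {
    fn embed(&self, text: &str) -> Result<Vec<f32>, String>;
}

// Mock standing in for a real Ollama-backed client.
struct MockClient {
    dims: usize,
    fail: bool, // simulate the service being unavailable
}

impl EmbeddingClient for MockClient {
    fn embed(&self, text: &str) -> Result<Vec<f32>, String> {
        if self.fail {
            return Err("connection refused".into());
        }
        // Deterministic fake embedding derived from the input length.
        Ok(vec![text.len() as f32; self.dims])
    }
}

// Logic under test: embed and enforce the expected dimensionality.
fn embed_checked(
    client: &dyn EmbeddingClient,
    text: &str,
    expect_dims: usize,
) -> Result<Vec<f32>, String> {
    let v = client.embed(text)?;
    if v.len() != expect_dims {
        return Err(format!("dimension mismatch: got {}, want {}", v.len(), expect_dims));
    }
    Ok(v)
}

fn main() {
    let ok = MockClient { dims: 4, fail: false };
    println!("{:?}", embed_checked(&ok, "hello", 4));
    let down = MockClient { dims: 4, fail: true };
    println!("{:?}", embed_checked(&down, "hello", 4));
}
```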
Add missing unit tests for config.rs parsing and validation
The crates/lsp-ai/src/config.rs file is critical—it deserializes user configuration for all backends, memory stores, and embeddings. Without tests, config changes risk breaking user setups silently. Tests should validate TOML parsing, required vs optional fields, invalid backend selections, and backward compatibility. This is especially important given the multi-backend architecture.
- [ ] Add an embedded `#[cfg(test)]` module to crates/lsp-ai/src/config.rs, or a separate tests/config_tests.rs
- [ ] Write tests for valid config deserialization (test fixtures in tests/fixtures/ for different backend combos: ollama-only, openai-only, mixed)
- [ ] Test validation: missing required fields, invalid backend names, conflicting settings
- [ ] Test each transformer backend config: ollama, open_ai, anthropic, gemini with valid/invalid parameters
- [ ] Test memory backend configs: file_store and postgresml with edge cases
🌿Good first issues
- Add a test suite for crates/lsp-ai/src/splitters/text_splitter.rs and tree_sitter.rs — currently no test coverage is visible in the file structure; create unit tests validating chunk boundaries and semantic splitting correctness for at least 3 languages.
- Document the configuration schema: generate a schema.json from config.rs using serde_json and add example configs to docs for each backend (OpenAI, Anthropic, Ollama, Mistral, Gemini, llama.cpp). Currently only the wiki has examples.
- Refactor memory_worker.rs to add explicit error recovery and retry logic for vector database failures — it currently likely panics or logs but doesn't gracefully degrade. Add configurable retry counts and exponential backoff.
⭐Top contributors
- @SilasMarvin — 79 commits
- @McPatate — 18 commits
- @ProjectInitiative — 1 commit
- @zaytsev — 1 commit
- @eltociear — 1 commit
📝Recent commits
- 1e910a8 — Merge pull request #91 from SilasMarvin/SilasMarvin-patch-1 (SilasMarvin)
- cce1119 — Delete crates/utils-tree-sitter/lsp-ai-chat.md (SilasMarvin)
- 96dcb86 — Merge pull request #89 from SilasMarvin/SilasMarvin-patch-1 (SilasMarvin)
- 0da93ba — Update README.md (SilasMarvin)
- 7cf9477 — Added nvim example (SilasMarvin)
- 9cb78a7 — Merge pull request #79 from SilasMarvin/release/v0.7.1 (SilasMarvin)
- f2a68f9 — Fix tag issue / bump version (SilasMarvin)
- c1fc6be — Merge pull request #76 from SilasMarvin/release/v0.7.0 (SilasMarvin)
- 476a13e — Merge pull request #75 from ProjectInitiative/patch-1 (SilasMarvin)
- 7201ead — Update Cargo.lock (ProjectInitiative)
🔒Security observations
- High · Outdated npm dependencies with known vulnerabilities — .github/actions/github-release/package.json. The GitHub release action's package.json uses '@actions/core': '^1.6', '@actions/github': '^5.0', and 'glob': '^7.1.5'. These versions are significantly outdated and likely contain known security vulnerabilities; glob v7 in particular has multiple documented CVEs related to ReDoS (Regular Expression Denial of Service) attacks. Fix: update all dependencies to their latest versions ('@actions/core' to '^1.10+', '@actions/github' to '^6.0+', 'glob' to '^10.0+'), then run 'npm audit' and 'npm audit fix' to identify and remediate known vulnerabilities.
- High · Potential API key/credential exposure in configuration — crates/lsp-ai/src/transformer_backends/, crates/lsp-ai/src/embedding_models/, crates/lsp-ai/src/config.rs. The codebase includes multiple transformer backends (OpenAI, Anthropic, Gemini, Ollama, etc.) and embedding models that likely require API keys or credentials. No explicit evidence of secrets management (.env files with gitignore, environment variable documentation, or secrets-manager integration) is visible in the provided file structure. Fix: (1) use environment variables for all sensitive credentials, (2) ensure .env files are in .gitignore, (3) document credential handling in the README, (4) consider a secrets manager for deployment, (5) validate credentials without exposing values in logs.
- Medium · Potential SQL injection risk in PostgreSQL backend — crates/lsp-ai/src/memory_backends/postgresml/mod.rs. Without reviewing the actual implementation, dynamic query construction or string concatenation with user input could create SQL injection vulnerabilities. Fix: use parameterized queries or prepared statements exclusively; avoid string concatenation for SQL; prefer an ORM or query builder with injection protection; validate and sanitize all user-supplied data before database operations.
- Medium · Unrestricted network communication with external AI services — crates/lsp-ai/src/transformer_backends/. The transformer backends communicate with external services (OpenAI, Anthropic, Gemini, Ollama, Mistral, LlamaCpp). Without visible certificate pinning, timeout configuration, or request validation, the application may be vulnerable to MITM attacks, infinite loops, or resource exhaustion. Fix: (1) certificate pinning for HTTPS connections to external APIs, (2) request timeouts and size limits, (3) rate limiting for API calls, (4) response validation and sanitization, (5) retry logic with exponential backoff, (6) proxy/VPN support for enterprise environments.
- Medium · Potential path traversal in file store backend — crates/lsp-ai/src/memory_backends/file_store.rs. The file_store backend may be vulnerable to path traversal attacks if file paths are constructed from user input without proper validation. Fix: (1) whitelist allowed directories, (2) canonicalize all paths and verify they remain within allowed boundaries, (3) reject paths containing '..', '~', or symlinks, (4) use secure file operations provided by the language/framework, (5) run with minimal file-system permissions.
- Medium · Missing security headers in web communications — editors/vscode/src/index.ts, crates/lsp-ai/src/main.rs. The VS Code editor client and the LSP server communicate over various protocols, with no visible evidence of security headers (CSP, X-Frame-Options, etc.) or secure WebSocket usage. Fix: implement security headers for all HTTP/WebSocket communications: (1) use WSS (WebSocket Secure) instead of WS, (2) validate TLS certificates, (3) Implement
LLM-derived; treat as a starting point, not a security audit.
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.