SilasMarvin/lsp-ai
LSP-AI is an open-source language server that serves as a backend for AI-powered functionality, designed to assist and empower software engineers, not replace them.
Healthy across all four use cases
Permissive license, no critical CVEs, actively maintained — safe to depend on.
Has a license, tests, and CI — clean foundation to fork and modify.
Documented and popular — useful reference codebase to read through.
No critical CVEs, sane security posture — runnable as-is.
- ✓ 5 active contributors
- ✓ MIT licensed
- ✓ CI configured
- ✓ Tests present
- ⚠ Stale — last commit 1y ago
- ⚠ Concentrated ownership — top contributor handles 79% of recent commits
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Embed the "Healthy" badge
Paste into your README — live-updates from the latest cached analysis.
[](https://repopilot.app/r/silasmarvin/lsp-ai)

Paste at the top of your README.md — renders inline like a shields.io badge.
Preview social card (1200×630)
This card auto-renders when someone shares https://repopilot.app/r/silasmarvin/lsp-ai on X, Slack, or LinkedIn.
Onboarding doc
Onboarding: SilasMarvin/lsp-ai
Generated by RepoPilot · 2026-05-09 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in "Verify before trusting" below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
- Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/SilasMarvin/lsp-ai shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across all four use cases
- 5 active contributors
- MIT licensed
- CI configured
- Tests present
- ⚠ Stale — last commit 1y ago
- ⚠ Concentrated ownership — top contributor handles 79% of recent commits
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live SilasMarvin/lsp-ai repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/SilasMarvin/lsp-ai.

What it runs against: a local clone of SilasMarvin/lsp-ai — the script inspects the git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in SilasMarvin/lsp-ai | Confirms the artifact applies here, not a fork |
| 2 | License is still MIT | Catches relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 515 days ago | Catches sudden abandonment since generation |
```bash
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of SilasMarvin/lsp-ai. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/SilasMarvin/lsp-ai.git
#   cd lsp-ai
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok()   { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of SilasMarvin/lsp-ai and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "SilasMarvin/lsp-ai(\.git)?\b" \
  && ok "origin remote is SilasMarvin/lsp-ai" \
  || miss "origin remote is not SilasMarvin/lsp-ai (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "^(MIT)" LICENSE 2>/dev/null \
  || grep -qiE "\"license\"\s*:\s*\"MIT\"" package.json 2>/dev/null) \
  && ok "license is MIT" \
  || miss "license drift — was MIT at generation time"

# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
  && ok "default branch main exists" \
  || miss "default branch main no longer exists"

# 4. Critical files exist
for f in \
  "crates/lsp-ai/src/main.rs" \
  "crates/lsp-ai/src/config.rs" \
  "crates/lsp-ai/src/transformer_worker.rs" \
  "crates/lsp-ai/src/memory_worker.rs" \
  "crates/lsp-ai/src/custom_requests/mod.rs"; do
  test -f "$f" && ok "$f" || miss "missing critical file: $f"
done

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 515 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~485d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/SilasMarvin/lsp-ai"
  exit 1
fi
```
Each check prints `ok:` or `FAIL:`. The script exits non-zero if anything failed, so it composes cleanly into agent loops (`./verify.sh || regenerate-and-retry`).
⚡TL;DR
LSP-AI is a Language Server Protocol backend written in Rust that integrates AI-powered features (chat, code completions, custom actions) into any editor supporting LSP (VS Code, NeoVim, Emacs, Helix, Sublime). It serves as a self-hosted alternative to GitHub Copilot, supporting multiple LLM backends (OpenAI, Anthropic, Ollama, Mistral, Gemini, llama.cpp) and includes vector database integration for semantic code search.

Monorepo with a single crate at crates/lsp-ai/src/, organized by concern: transformer_backends/ (LLM providers: anthropic.rs, openai, ollama.rs, gemini.rs, llama_cpp/, mistral_fim.rs), memory_backends/ (vector stores: postgresml/, file_store.rs), embedding_models/ (ollama.rs), custom_requests/ (LSP handlers), splitters/ (code chunking: tree_sitter.rs, text_splitter.rs), plus main.rs as entrypoint and config.rs for TOML configuration.
👥Who it's for
Software engineers and programming teams who want AI-assisted development (completions, in-editor chat, refactoring) without vendor lock-in; users preferring self-hosted or local models over cloud-only solutions; editor enthusiasts using NeoVim or Emacs who lack native AI tooling.
🌱Maturity & risk
Production-ready but feature-complete: the maintainer states 'all the features I want for it' are implemented, and development is no longer adding new features. Single-crate Rust project with CI/CD workflows (GitHub Actions) and tagged releases, indicating stability. Daily usage by multiple users suggests reliability.
Standard open source risks apply.
Active areas of work
Project is in maintenance mode per the README. No specific recent changes are visible from the file structure alone, though a release workflow exists (.github/workflows/release.yml). The maintenance signals above place the last commit around a year ago; check GitHub Actions and the commit history for current status. Likely focused on dependency updates and bug fixes rather than new feature development.
🚀Get running
```bash
git clone https://github.com/SilasMarvin/lsp-ai.git
cd lsp-ai
cargo build --release
cargo run --release
```
Binary will be available at target/release/lsp-ai. Configure via TOML (see Cargo.toml and config.rs for options).
Daily commands:
```bash
cargo build --release
cargo run --release
```
Server listens on stdio by default (LSP mode). Editor client (VS Code extension, NeoVim plugin) connects and sends LSP requests. Configuration via ~/.config/lsp-ai/config.toml or project-level config (see config.rs for schema).
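For orientation, a hypothetical config fragment is sketched below. The authoritative schema is whatever config.rs deserializes, so verify every key name there before using — all keys and values here are assumptions, not verified against the crate:

```toml
# Hypothetical sketch only — check crates/lsp-ai/src/config.rs for the
# real schema; section and key names below are assumptions.

[memory.file_store]
# file-based vector store; the simpler alternative to postgresml

[models.model1]
type = "open_ai"
chat_endpoint = "https://api.openai.com/v1/chat/completions"
model = "gpt-4o-mini"
auth_token_env_var_name = "OPENAI_API_KEY"
```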
🗺️Map of the codebase
- crates/lsp-ai/src/main.rs — Entry point and LSP server initialization; all contributors must understand the server lifecycle and message routing.
- crates/lsp-ai/src/config.rs — Configuration schema and parsing; critical for understanding how users configure backends, models, and features.
- crates/lsp-ai/src/transformer_worker.rs — Core AI inference orchestration; handles routing to LLM backends (OpenAI, Ollama, Anthropic, etc.).
- crates/lsp-ai/src/memory_worker.rs — Vector database and embedding workflow; manages document indexing and retrieval for context.
- crates/lsp-ai/src/custom_requests/mod.rs — LSP custom request definitions; defines the contract between editors and the language server.
- crates/lsp-ai/src/transformer_backends/mod.rs — Abstraction layer for multiple LLM providers; essential for understanding backend extensibility.
- Cargo.toml — Workspace and dependency management; defines the architecture of crates and external integrations.
🛠️How to make changes
Add a new LLM backend provider
- Create a new file in crates/lsp-ai/src/transformer_backends/ (e.g., my_provider.rs) implementing the TransformerBackend trait (crates/lsp-ai/src/transformer_backends/mod.rs)
- Add the provider variant to the TransformerBackend enum match in mod.rs and add its module declaration (crates/lsp-ai/src/transformer_backends/mod.rs)
- Extend the config.rs schema with provider-specific settings (API keys, endpoints, parameters) (crates/lsp-ai/src/config.rs)
- Update transformer_worker.rs to instantiate the new backend based on config selection (crates/lsp-ai/src/transformer_worker.rs)
- Add example config files in examples/ (helix, nvim, etc.) demonstrating the new provider (examples/helix/openai-chat-code-completion.toml)
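The trait-implementation step can be sketched as follows. This is a simplified, synchronous stand-in for illustration only — the real TransformerBackend trait in transformer_backends/mod.rs is async and takes richer request/response types, and the trait shape, method names, and provider below are assumptions, not the crate's actual API:

```rust
// Simplified sketch of a pluggable LLM backend. Names are illustrative,
// not LSP-AI's real API; the real trait is async.
trait TransformerBackend {
    fn name(&self) -> &str;
    fn do_completion(&self, prompt: &str) -> Result<String, String>;
}

// Hypothetical new provider.
struct MyProvider {
    api_key: String,
}

impl TransformerBackend for MyProvider {
    fn name(&self) -> &str {
        "my_provider"
    }

    fn do_completion(&self, prompt: &str) -> Result<String, String> {
        if self.api_key.is_empty() {
            return Err("missing API key".into());
        }
        // A real implementation would call the provider's HTTP API here.
        Ok(format!("completion for: {prompt}"))
    }
}

fn main() {
    // Trait objects let the worker route to any configured backend.
    let backend: Box<dyn TransformerBackend> = Box::new(MyProvider {
        api_key: "sk-test".into(),
    });
    println!("{} -> {:?}", backend.name(), backend.do_completion("fn add("));
}
```

The point of the sketch is the dispatch shape: the worker holds a `Box<dyn TransformerBackend>` chosen from config, so adding a provider means one new trait impl plus one enum/config arm, not changes to the routing code.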
Add a new custom LSP request type
- Define the request struct with params and result types in crates/lsp-ai/src/custom_requests/ (crates/lsp-ai/src/custom_requests/mod.rs)
- Implement the request handler in main.rs and route it through the message dispatch (crates/lsp-ai/src/main.rs)
- Update the VSCode extension to send the new request and handle responses (editors/vscode/src/index.ts)
- Add integration tests demonstrating the new request/response flow (crates/lsp-ai/tests/integration_tests.rs)
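Conceptually, a custom LSP request pairs a method string with typed params and result. A std-only sketch of that contract — LSP-AI's real definitions use serde and lsp-types, and the method name and structs below are hypothetical:

```rust
// Hypothetical param/result types for a custom request. Field and method
// names are assumptions, not LSP-AI's actual wire format.
struct GenerateParams {
    text_document_uri: String,
    line: u32,
    character: u32,
}

struct GenerateResult {
    generated_text: String,
}

// Each custom request binds an LSP method string to its types, so the
// dispatcher in main.rs can route by method name with type safety.
trait CustomRequest {
    const METHOD: &'static str;
    type Params;
    type Result;
}

struct Generate;

impl CustomRequest for Generate {
    const METHOD: &'static str = "textDocument/generation"; // hypothetical
    type Params = GenerateParams;
    type Result = GenerateResult;
}

fn main() {
    // The editor client must send exactly this method string.
    println!("custom request method: {}", Generate::METHOD);
}
```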
Add support for a new embedding model provider
- Create a new file in crates/lsp-ai/src/embedding_models/ implementing the EmbeddingModel trait (crates/lsp-ai/src/embedding_models/mod.rs)
- Extend config.rs to accept the new embedding provider selection and parameters (crates/lsp-ai/src/config.rs)
- Update memory_worker.rs to instantiate the correct embedding model based on config (crates/lsp-ai/src/memory_worker.rs)
- Add example configs showing the new embedding provider alongside vector store options (examples/helix/anthropic-in-editor-chatting.toml)
Add support for a new vector database backend
- Implement the VectorStore trait in crates/lsp-ai/src/memory_backends/ (crates/lsp-ai/src/memory_backends/mod.rs)
- Add configuration schema for the new backend (connection strings, credentials) in config.rs (crates/lsp-ai/src/config.rs)
- Update memory_worker.rs to initialize the correct vector store based on config (crates/lsp-ai/src/memory_worker.rs)
- Create or update example configs demonstrating the new vector store (examples/helix/llama-cpp-fim-code-completion.toml)
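The steps above hinge on one trait implementation. Below is a simplified, synchronous sketch of the pluggable vector-store shape — the real code in memory_backends/ is async and richer, and every name here is an assumption for illustration:

```rust
use std::collections::HashMap;

// Hypothetical vector-store trait; names are assumptions, not LSP-AI's API.
trait VectorStore {
    fn insert(&mut self, id: String, embedding: Vec<f32>);
    fn search(&self, query: &[f32], k: usize) -> Vec<(String, f32)>;
}

// Toy in-memory backend, analogous in role to file_store.rs.
struct InMemoryStore {
    rows: HashMap<String, Vec<f32>>,
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

impl VectorStore for InMemoryStore {
    fn insert(&mut self, id: String, embedding: Vec<f32>) {
        self.rows.insert(id, embedding);
    }

    fn search(&self, query: &[f32], k: usize) -> Vec<(String, f32)> {
        // Score every stored row by cosine similarity, keep the top k.
        let mut scored: Vec<(String, f32)> = self
            .rows
            .iter()
            .map(|(id, e)| (id.clone(), cosine(query, e)))
            .collect();
        scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
        scored.truncate(k);
        scored
    }
}

fn main() {
    let mut store = InMemoryStore { rows: HashMap::new() };
    store.insert("a.rs".into(), vec![1.0, 0.0]);
    store.insert("b.rs".into(), vec![0.0, 1.0]);
    let top = store.search(&[0.9, 0.1], 1);
    println!("nearest: {} (score {:.3})", top[0].0, top[0].1);
}
```

A new backend (e.g. a different database) slots in by implementing the same two operations; memory_worker.rs then only needs a config arm to construct it.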
🔧Why these technologies
- Rust + Tokio — High-performance, safe concurrent server handling multiple editor clients; efficient async I/O for AI model calls
- Language Server Protocol (LSP) — Standardized, editor-agnostic interface; enables single server backend to support VSCode, Neovim, Helix, and other LSP-capable editors
- Multiple LLM backends (OpenAI, Ollama, Anthropic, Llama.cpp) — Flexibility; users can choose cloud-hosted or local models based on privacy/latency/cost preferences
- Tree-sitter — Syntax-aware document splitting; preserves code structure for better context chunking vs. naive text splitting
- PostgresML + file-based vector stores — Dual support for lightweight (file-based) and database-backed (PostgresML) semantic retrieval
🪤Traps & gotchas
- LLM API keys: code expects environment variables (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.) or config file entries — missing keys silently fail requests.
- PostgresML backend: requires a running PostgreSQL instance with the pgvector extension; file_store.rs is a simpler fallback but doesn't scale.
- Tree-sitter grammar files: tree_sitter.rs depends on language-specific grammars; adding language support requires embedding grammar binaries.
- Streaming response handling: generation_stream.rs uses Server-Sent Events; some editors may not support streaming requests properly.
- Local model inference (llama.cpp): requires compiled ggml binaries; cross-platform compilation (macOS/Linux/Windows) has quirks in the build script.
- Rate limiting: no built-in rate limiting on LLM calls; token counting for billing is backend-specific.
🏗️Architecture
💡Concepts to learn
- Language Server Protocol (LSP) — LSP-AI's entire architecture is built on LSP—understanding the protocol (request/response lifecycle, custom requests, capabilities) is essential to extending or debugging the server
- Vector Embeddings & Semantic Search — LSP-AI uses embeddings (via Ollama or pgvector) to find relevant code snippets for context injection into LLM prompts; underpins the code search and in-editor chat features
- Streaming Response Handling (Server-Sent Events) — generation_stream.rs implements SSE-style streaming for real-time chat and completion output; critical for responsive UX and understanding latency-hiding patterns
- Abstract Syntax Tree (AST) & Tree-Sitter Parsing — splitters/tree_sitter.rs uses AST-based code splitting instead of naive line-splitting; essential for respecting language semantics and generating coherent code chunks
- Async/Await & Tokio Runtime — transformer_worker.rs and memory_worker.rs use Rust async patterns with tokio; understanding task spawning, channel communication, and blocking I/O is critical for debugging latency or deadlocks
- Trait Objects & Pluggable Backends — transformer_backends/ and memory_backends/ use Rust trait pattern for polymorphism; adding new LLM providers or vector stores requires implementing these traits, not copy-pasting
- Prompt Templating & Context Injection — template.rs likely handles dynamic prompt construction with code snippets, chat history, and user input—key to prompt quality and token budgeting for LLM calls
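As a concrete illustration of the templating concept — not template.rs's actual API; the function name and placeholder syntax below are assumptions — context injection boils down to substituting code snippets and user input into a prompt skeleton:

```rust
// Hypothetical placeholder substitution, illustrating context injection.
// template.rs may use a different syntax entirely.
fn render(template: &str, vars: &[(&str, &str)]) -> String {
    let mut out = template.to_string();
    for (key, value) in vars {
        // Replace every "{key}" occurrence with its value.
        out = out.replace(&format!("{{{key}}}"), value);
    }
    out
}

fn main() {
    let prompt = render(
        "Context:\n{context}\n\nUser request:\n{selected_text}",
        &[
            ("context", "fn add(a: i32, b: i32) -> i32 { a + b }"),
            ("selected_text", "document this function"),
        ],
    );
    println!("{prompt}");
}
```

Token budgeting then becomes a question of how much retrieved context to pour into `{context}` before the prompt exceeds the model's window.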
🔗Related repos
- github/copilot-cli — Closed-source GitHub Copilot; the primary commercial alternative LSP-AI competes against as a self-hosted solution
- tabbyml/tabby — Open-source AI code completion server with a self-hosted focus; similar value proposition, different architecture (Python-first, built on Open-LLM)
- ollama/ollama — Local LLM runtime that LSP-AI integrates with via transformer_backends/ollama.rs; powers local model inference used by LSP-AI
- pgvector/pgvector — PostgreSQL extension for vector search that LSP-AI uses in memory_backends/postgresml/ for semantic code retrieval
- tree-sitter/tree-sitter — Parser generator library that LSP-AI relies on in splitters/tree_sitter.rs for syntax-aware code chunking and semantic understanding
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add integration tests for transformer_backends (Ollama, OpenAI, Anthropic, Gemini)
The crates/lsp-ai/tests/integration_tests.rs file exists but is likely minimal. The transformer_backends directory has multiple backend implementations (ollama.rs, open_ai/mod.rs, anthropic.rs, gemini.rs) that deserve dedicated integration tests. This ensures each backend correctly handles API calls, streaming responses, and error cases. Given that these are critical user-facing features, comprehensive tests would catch regressions early.
- [ ] Review existing crates/lsp-ai/tests/integration_tests.rs to understand current test structure
- [ ] Create mock/stub implementations or use test fixtures for each backend in crates/lsp-ai/src/transformer_backends/
- [ ] Write integration tests covering: successful inference, streaming generation, API error handling, and malformed responses for each backend (ollama, open_ai, anthropic, gemini)
- [ ] Add CI step to .github/workflows/release.yml to run integration tests (with appropriate feature flags to avoid requiring real API keys)
Add unit tests for embedding_models backends (Ollama embeddings)
The crates/lsp-ai/src/embedding_models/ directory contains ollama.rs and vector_store.rs, but there are no corresponding tests visible in the file structure. Embedding models are foundational for the memory/RAG system. Unit tests should cover embedding generation, dimension validation, vector storage operations, and error handling when the Ollama service is unavailable or returns invalid embeddings.
- [ ] Create crates/lsp-ai/src/embedding_models/tests/ or add #[cfg(test)] modules to ollama.rs
- [ ] Write tests for: successful embedding generation, batch embedding, empty input handling, dimension mismatch errors, and Ollama connection failures
- [ ] Test vector_store.rs operations: insert, search, delete, and edge cases like searching with zero results
- [ ] Ensure tests can run without requiring a running Ollama instance (use mocks/stubs)
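The "no running Ollama instance" requirement can be met by putting the network call behind a trait and substituting a mock in tests. A minimal sketch — trait and type names are assumptions, not LSP-AI's real embedding API:

```rust
// Hypothetical abstraction over the embedding service, so tests never
// touch the network. Names are illustrative only.
trait EmbeddingClient {
    fn embed(&self, text: &str) -> Result<Vec<f32>, String>;
}

// Mock standing in for a real Ollama-backed client.
struct MockClient {
    dims: usize,
    fail: bool, // simulate the service being unavailable
}

impl EmbeddingClient for MockClient {
    fn embed(&self, text: &str) -> Result<Vec<f32>, String> {
        if self.fail {
            return Err("connection refused".into());
        }
        // Deterministic fake embedding derived from the input length.
        Ok(vec![text.len() as f32; self.dims])
    }
}

// Logic under test: embed and enforce the expected dimensionality.
fn embed_checked(
    client: &dyn EmbeddingClient,
    text: &str,
    expect_dims: usize,
) -> Result<Vec<f32>, String> {
    let v = client.embed(text)?;
    if v.len() != expect_dims {
        return Err(format!("dimension mismatch: got {}, want {}", v.len(), expect_dims));
    }
    Ok(v)
}

fn main() {
    let ok = MockClient { dims: 4, fail: false };
    println!("{:?}", embed_checked(&ok, "hello", 4));
    let down = MockClient { dims: 4, fail: true };
    println!("{:?}", embed_checked(&down, "hello", 4));
}
```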
Add missing unit tests for config.rs parsing and validation
The crates/lsp-ai/src/config.rs file is critical—it deserializes user configuration for all backends, memory stores, and embeddings. Without tests, config changes risk breaking user setups silently. Tests should validate TOML parsing, required vs optional fields, invalid backend selections, and backward compatibility. This is especially important given the multi-backend architecture.
- [ ] Add an embedded `#[cfg(test)]` module to crates/lsp-ai/src/config.rs, or a separate tests/config_tests.rs
- [ ] Write tests for valid config deserialization (test fixtures in tests/fixtures/ for different backend combos: ollama-only, openai-only, mixed)
- [ ] Test validation: missing required fields, invalid backend names, conflicting settings
- [ ] Test each transformer backend config: ollama, open_ai, anthropic, gemini with valid/invalid parameters
- [ ] Test memory backend configs: file_store and postgresml with edge cases
🌿Good first issues
- Add a test suite for crates/lsp-ai/src/splitters/text_splitter.rs and tree_sitter.rs — currently no test coverage is visible in the file structure; create unit tests validating chunk boundaries and semantic splitting correctness for at least 3 languages.
- Document the configuration schema: generate a schema.json from config.rs using serde_json and add example configs to docs for each backend (OpenAI, Anthropic, Ollama, Mistral, Gemini, llama.cpp). Currently only the wiki has examples.
- Refactor memory_worker.rs to add explicit error recovery and retry logic for vector database failures — it currently likely panics or logs but doesn't gracefully degrade. Add configurable retry counts and exponential backoff.
⭐Top contributors
- @SilasMarvin — 79 commits
- @McPatate — 18 commits
- @ProjectInitiative — 1 commit
- @zaytsev — 1 commit
- @eltociear — 1 commit
📝Recent commits
- 1e910a8 — Merge pull request #91 from SilasMarvin/SilasMarvin-patch-1 (SilasMarvin)
- cce1119 — Delete crates/utils-tree-sitter/lsp-ai-chat.md (SilasMarvin)
- 96dcb86 — Merge pull request #89 from SilasMarvin/SilasMarvin-patch-1 (SilasMarvin)
- 0da93ba — Update README.md (SilasMarvin)
- 7cf9477 — Added nvim example (SilasMarvin)
- 9cb78a7 — Merge pull request #79 from SilasMarvin/release/v0.7.1 (SilasMarvin)
- f2a68f9 — Fix tag issue / bump version (SilasMarvin)
- c1fc6be — Merge pull request #76 from SilasMarvin/release/v0.7.0 (SilasMarvin)
- 476a13e — Merge pull request #75 from ProjectInitiative/patch-1 (SilasMarvin)
- 7201ead — Update Cargo.lock (ProjectInitiative)
🔒Security observations
- High · Outdated npm dependencies with known vulnerabilities — .github/actions/github-release/package.json. The GitHub release action's package.json uses '@actions/core': '^1.6', '@actions/github': '^5.0', and 'glob': '^7.1.5'. These versions are significantly outdated and likely contain known security vulnerabilities; glob v7 in particular has multiple documented CVEs related to ReDoS (Regular Expression Denial of Service) attacks. Fix: update all dependencies to their latest versions ('@actions/core' to '^1.10+', '@actions/github' to '^6.0+', 'glob' to '^10.0+'), then run 'npm audit' and 'npm audit fix' to identify and remediate known vulnerabilities.
- High · Potential API key/credential exposure in configuration — crates/lsp-ai/src/transformer_backends/, crates/lsp-ai/src/embedding_models/, crates/lsp-ai/src/config.rs. The codebase includes multiple transformer backends (OpenAI, Anthropic, Gemini, Ollama, etc.) and embedding models that likely require API keys or credentials. No explicit evidence of secrets management (.env files with gitignore, environment variable documentation, or secrets-manager integration) is visible in the provided file structure. Fix: (1) use environment variables for all sensitive credentials, (2) ensure .env files are in .gitignore, (3) document credential handling in the README, (4) consider a secrets manager for deployment, (5) validate credentials without exposing values in logs.
- Medium · Potential SQL injection risk in PostgreSQL backend — crates/lsp-ai/src/memory_backends/postgresml/mod.rs. Without reviewing the actual implementation, dynamic query construction or string concatenation with user input could create SQL injection vulnerabilities. Fix: use parameterized queries or prepared statements exclusively; avoid string concatenation for SQL; prefer an ORM or query builder with injection protection; validate and sanitize all user-supplied data before database operations.
- Medium · Unrestricted network communication with external AI services — crates/lsp-ai/src/transformer_backends/. The transformer backends communicate with external services (OpenAI, Anthropic, Gemini, Ollama, Mistral, LlamaCpp). Without visible certificate pinning, timeout configuration, or request validation, the application may be vulnerable to MITM attacks, infinite loops, or resource exhaustion. Fix: (1) certificate pinning for HTTPS connections to external APIs, (2) request timeouts and size limits, (3) rate limiting for API calls, (4) response validation and sanitization, (5) retry logic with exponential backoff, (6) proxy/VPN support for enterprise environments.
- Medium · Potential path traversal in file store backend — crates/lsp-ai/src/memory_backends/file_store.rs. The file_store backend may be vulnerable to path traversal attacks if file paths are constructed from user input without proper validation. Fix: (1) whitelist allowed directories, (2) canonicalize all paths and verify they remain within allowed boundaries, (3) reject paths containing '..', '~', or symlinks, (4) use secure file operations provided by the language/framework, (5) run with minimal file-system permissions.
- Medium · Missing security headers in web communications — editors/vscode/src/index.ts, crates/lsp-ai/src/main.rs. The VS Code editor client and the LSP server communicate over various protocols, with no visible evidence of security headers (CSP, X-Frame-Options, etc.) or secure WebSocket usage. Fix: implement security headers for all HTTP/WebSocket communications: (1) use WSS (WebSocket Secure) instead of WS, (2) validate TLS certificates, (3) Implement
LLM-derived; treat as a starting point, not a security audit.
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.