simonw/llm
Access large language models from the command-line
Healthy across all four use cases
- Permissive license, no critical CVEs, actively maintained — safe to depend on.
- Has a license, tests, and CI — clean foundation to fork and modify.
- Documented and popular — useful reference codebase to read through.
- No critical CVEs, sane security posture — runnable as-is.
- ✓Last commit 1d ago
- ✓6 active contributors
- ✓Apache-2.0 licensed
- ✓CI configured
- ✓Tests present
- ⚠Single-maintainer risk — top contributor 93% of recent commits
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Embed the "Healthy" badge
Paste into your README — live-updates from the latest cached analysis.
[](https://repopilot.app/r/simonw/llm)
Paste at the top of your README.md — renders inline like a shields.io badge.
Preview social card (1200×630)
This card auto-renders when someone shares https://repopilot.app/r/simonw/llm on X, Slack, or LinkedIn.
Onboarding: simonw/llm
Generated by RepoPilot · 2026-05-07 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in "Verify before trusting" below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
- Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/simonw/llm shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across all four use cases
- Last commit 1d ago
- 6 active contributors
- Apache-2.0 licensed
- CI configured
- Tests present
- ⚠ Single-maintainer risk — top contributor 93% of recent commits
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live simonw/llm
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/simonw/llm.
What it runs against: a local clone of simonw/llm — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in simonw/llm | Confirms the artifact applies here, not a fork |
| 2 | License is still Apache-2.0 | Catches relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 31 days ago | Catches sudden abandonment since generation |
```bash
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of simonw/llm. If you don't
# have one yet, run these first:
#
# git clone https://github.com/simonw/llm.git
# cd llm
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }
# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
echo "FAIL: not inside a git repository. cd into your clone of simonw/llm and re-run."
exit 2
fi
# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "simonw/llm(\.git)?\b" \
  && ok "origin remote is simonw/llm" \
  || miss "origin remote is not simonw/llm (artifact may be from a fork)"
# 2. License matches what RepoPilot saw
(grep -qiE "apache license" LICENSE 2>/dev/null && grep -qiE "version 2\.0" LICENSE 2>/dev/null) \
  || grep -qiE "apache-2\.0|apache software license" pyproject.toml 2>/dev/null \
  && ok "license is Apache-2.0" \
  || miss "license drift — was Apache-2.0 at generation time"
# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
  && ok "default branch main exists" \
  || miss "default branch main no longer exists"
# 4. Critical files exist
test -f "llm/cli.py" \\
&& ok "llm/cli.py" \\
|| miss "missing critical file: llm/cli.py"
test -f "llm/models.py" \\
&& ok "llm/models.py" \\
|| miss "missing critical file: llm/models.py"
test -f "llm/plugins.py" \\
&& ok "llm/plugins.py" \\
|| miss "missing critical file: llm/plugins.py"
test -f "llm/hookspecs.py" \\
&& ok "llm/hookspecs.py" \\
|| miss "missing critical file: llm/hookspecs.py"
test -f "llm/default_plugins/openai_models.py" \\
&& ok "llm/default_plugins/openai_models.py" \\
|| miss "missing critical file: llm/default_plugins/openai_models.py"
# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 31 ]; then
ok "last commit was $days_since_last days ago (artifact saw ~1d)"
else
miss "last commit was $days_since_last days ago — artifact may be stale"
fi
echo
if [ "$fail" -eq 0 ]; then
echo "artifact verified (0 failures) — safe to trust"
else
echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/simonw/llm"
exit 1
fi
```
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
⚡TL;DR
LLM is a CLI tool and Python library that provides unified access to dozens of large language models (OpenAI, Anthropic Claude, Google Gemini, Meta Llama, etc.), both via remote APIs and locally installed models. It enables command-line prompt execution, SQLite-based logging of conversations, embeddings generation, structured content extraction via schemas, and tool-use capabilities for models. Monolithic structure with llm/ as the core package containing __init__.py, __main__.py, cli.py, and default_plugins/. Documentation lives separately in docs/ with Sphinx+Markdown generation (uses sphinx-markdown-builder and cogapp). The plugin system is hook-based rather than file-tree-based; plugins are discovered and registered dynamically.
👥Who it's for
Data scientists, CLI power users, and developers who want programmatic or command-line access to multiple LLM providers without writing provider-specific integration code. Also plugin developers extending LLM's capabilities via the plugin system.
🌱Maturity & risk
Production-ready and actively maintained. The project has comprehensive documentation (docs/ folder with setup, usage, plugins, embeddings, schemas guides), CI/CD via GitHub Actions (.github/workflows/test.yml, publish.yml, stable-docs.yml), and is published to PyPI with Homebrew support. The codebase is primarily Python (754KB) with a clean single-maintainer structure (simonw).
Single maintainer (simonw) is a concentration risk. The plugin system (docs/plugins/) means stability depends on third-party plugin quality. No specific information about dependency count or last commit date in provided data, but the presence of dependabot.yml suggests active dependency management. The architecture heavily relies on plugin hooks, so breaking changes to core APIs could impact the ecosystem.
Active areas of work
Documentation is actively maintained and versioned (stable-docs.yml workflow, readthedocs.yaml config, changelog.md tracking releases). The codebase includes recent additions for embeddings (docs/embeddings/ folder), schemas/structured extraction (docs/schemas.md), and tool-use (docs/tools.md). Dependabot is active for dependency updates.
🚀Get running
git clone https://github.com/simonw/llm.git && cd llm && pip install -e . (or use pipx install llm, or brew install llm). For development: install with the -e editable flag and run the test suite locally with pytest (the same suite CI runs via .github/workflows/test.yml).
Daily commands: For CLI usage: llm prompt "your prompt here" (after install). For local development: python -m llm invokes the CLI via the package's __main__ module. Documentation is generated via sphinx-build (as shown in the README cog snippet): sphinx-build -M markdown ./docs ./tmpdir. Tests run via the GitHub Actions workflow defined in .github/workflows/test.yml.
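Beyond the CLI, the same functionality is available as a library. A minimal sketch of the Python API described in docs/python-api.md, assuming an installed llm package and a configured OpenAI key (the model ID below is illustrative, not prescribed by the repo):

```python
import llm

# List available model IDs with `llm models`; "gpt-4o-mini" is only an example.
model = llm.get_model("gpt-4o-mini")
response = model.prompt("Three short names for a pet pelican")
print(response.text())
```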
🗺️Map of the codebase
- llm/cli.py — Main command-line interface entry point handling all user-facing CLI commands and argument parsing
- llm/models.py — Core Model and ModelResponse abstractions that all LLM implementations extend
- llm/plugins.py — Plugin system architecture enabling extensibility for models, tools, and embeddings
- llm/hookspecs.py — Hook specifications defining all plugin extension points (register_models, register_tools, etc.)
- llm/default_plugins/openai_models.py — Reference implementation of the Model abstraction for OpenAI API integration
- llm/tools.py — Tool system and agentic capabilities framework for function calling
- llm/__init__.py — Public API exports and version management for library consumers
🛠️How to make changes
Add a new Model implementation
- Create a new file under llm/default_plugins/ (e.g., my_model.py) with a class inheriting from llm.Model (llm/models.py)
- Implement execute() and execute_stream() methods to call your LLM API (llm/models.py)
- Add a register_models() hook function in your plugin file (llm/hookspecs.py)
- Use pm.hook.register_models(models=[MyModel(...)]) to register your model (llm/plugins.py)
- Install as a pip package with an 'llm' entry point in pyproject.toml, or install from a local directory (pyproject.toml)
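A minimal sketch of the steps above, modeled on the pattern in docs/plugins/tutorial-model-plugin.md. Note that the tutorial pattern implements a single execute() generator rather than separate execute()/execute_stream() methods, so verify the exact method names and signatures against llm/models.py before building on this:

```python
# Hypothetical plugin module (e.g. llm_echo.py); a sketch, not the canonical API.
import llm

class EchoModel(llm.Model):
    model_id = "echo"        # what users would pass to `llm -m echo`
    can_stream = True

    def execute(self, prompt, stream, response, conversation=None):
        # Yield text chunks; llm buffers them when streaming is disabled.
        yield prompt.prompt[::-1]

@llm.hookimpl
def register_models(register):
    register(EchoModel())
```

Once installed as a package with an `llm` entry point, the model should show up in `llm models`.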
Add a new Tool for function calling
- Create a class inheriting from llm.Tool in your plugin (llm/tools.py)
- Implement an execute() method returning the tool result as a string (llm/tools.py)
- Define name, description, and input_schema (JSON Schema dict or Pydantic BaseModel) (llm/tools.py)
- Register via pm.hook.register_tools(tools=[MyTool()]) in your plugin's register_tools() hook (llm/hookspecs.py)
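A hedged sketch of the registration step. Recent llm releases also accept plain Python functions as tools (type hints and the docstring supply the schema) in addition to llm.Tool subclasses; confirm the details against llm/tools.py and llm/hookspecs.py:

```python
# Hypothetical tool plugin; a sketch under the assumptions above.
import llm

def multiply(a: int, b: int) -> int:
    "Multiply two integers and return the result."
    return a * b

@llm.hookimpl
def register_tools(register):
    # llm derives the tool name, description, and input schema from the function.
    register(multiply)
```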
Add a new CLI command
- Define a click command group or standalone command function in llm/cli.py (llm/cli.py)
- Use @click decorators for arguments and options (llm/cli.py)
- Integrate with plugins via pm.hook calls to get models, tools, or templates as needed (llm/plugins.py)
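A self-contained click sketch of the pattern. In the real codebase the command would be attached to the existing top-level group defined in llm/cli.py rather than to a new group; the group used here is a stand-in:

```python
import click

@click.group()
def cli():
    "Stand-in for llm's real top-level command group (see llm/cli.py)."

@cli.command(name="shout")
@click.argument("text")
@click.option("-n", "--times", default=1, help="Repeat the output N times")
def shout(text, times):
    "Upper-case TEXT and print it, optionally repeated."
    for _ in range(times):
        click.echo(text.upper())

if __name__ == "__main__":
    cli()
```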
Add a new Embedding model
- Create a class inheriting from llm.Embedding in your plugin (llm/embeddings.py)
- Implement an embed() method that returns a list of floats for a text input (llm/embeddings.py)
- Register via pm.hook.register_embeddings() in your plugin (llm/hookspecs.py)
- Optionally implement an EmbeddingStore subclass for custom vector storage backends (llm/embeddings.py)
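A sketch of an embedding plugin. The names below (llm.EmbeddingModel, embed_batch(), register_embedding_models) follow llm's embeddings plugin documentation and differ slightly from the checklist above; treat both as hypotheses and check llm/hookspecs.py and llm/embeddings.py:

```python
# Hypothetical embedding plugin returning toy fixed-length vectors.
import llm

class ToyEmbeddings(llm.EmbeddingModel):
    model_id = "toy-embed"

    def embed_batch(self, items):
        # One vector per input item; a real plugin would call an API or local model here.
        for item in items:
            yield [float(len(str(item)) % 7)] * 16

@llm.hookimpl
def register_embedding_models(register):
    register(ToyEmbeddings())
```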
🔧Why these technologies
- Click (CLI framework) — Provides command-line argument parsing, help generation, and nested command groups for complex CLI
- Pluggy (plugin system) — Lightweight, spec-driven plugin architecture enabling third-party model/tool providers without modifying core code
- Pydantic (data validation) — Used in tool schema definitions and optionally for serialization, providing type-safe configuration
- SQLite (persistence) — Stores chat history, keys, and embeddings metadata with minimal dependencies; supports local-first workflows
- Jinja2 (templating) — Enables reusable prompt templates with variable substitution and conditional logic
⚖️Trade-offs already made
- Single-threaded synchronous + asyncio-based streaming
  - Why: Simpler mental model for CLI interactions; asyncio handles I/O-bound API calls efficiently
  - Consequence: Streaming responses block REPL input; parallel multi-model inference requires separate invocations
- Local SQLite database for persistence (no server required)
  - Why: Zero operational overhead and respects user privacy by default
  - Consequence: Does not support multi-user/multi-device sync; chat history is machine-local
- Plugin system via entry_points rather than dynamic discovery
  - Why: Clear dependency management and pip integration; safer than arbitrary module scanning
  - Consequence: Requires pip install to register plugins; cannot dynamically load from a directory
- Tool function calling via structured JSON schema
  - Why: Model-agnostic representation; supports any LLM with function calling capability
  - Consequence: Requires manual JSON Schema definition or a Pydantic model (additional boilerplate vs. decorator-based approaches)
🚫Non-goals (don't propose these)
- Not a GUI or web dashboard—CLI-only interface by design
- Not a multi-user SaaS platform—local single-machine architecture
- Not a real-time collaboration tool—no live syncing between devices
- Not model training—inference-only tool
- Not a framework for building chatbots—focuses on direct LLM access and prompting
- Not cloud-native—intentionally offline-first with optional cloud model APIs
🪤Traps & gotchas
The README.md is generated from docs/index.md using cogapp and sphinx-markdown-builder (see cog block in README snippet) — editing README directly will be overwritten. The plugin system uses dynamic entry point discovery, so plugins must declare entry points in their pyproject.toml. The SQLite logging feature requires filesystem access and assumes a local database path (docs/logging.md). Tool execution (docs/tools.md) depends on model provider support; not all providers implement tool-use.
🏗️Architecture
💡Concepts to learn
- Plugin Hook System — LLM's entire extensibility model (models, embeddings, tools, storage) is built on plugin hooks; you must understand this to add new providers or capabilities
- SQLite as Primary Data Store — LLM uses SQLite for conversation logging and embeddings storage; understanding SQLite schema and queries is essential for persistence features and debugging
- Entry Points (Python packaging) — Plugins are discovered and loaded via Python entry points (setuptools/pyproject.toml); critical to understand how plugin registration works
- Token Counting and Context Windows — Different models have different token limits and pricing; LLM abstracts token counting per-provider to manage prompt/response sizes
- Streaming vs. Buffered Responses — LLM supports both streaming and full-buffered model responses; architecture must handle both patterns efficiently in CLI and library contexts
- Embedding Models and Vector Storage — LLM's embeddings subsystem (docs/embeddings/storage.md) handles model inference and vector storage backends; separable from chat completions
- JSON Schema for Structured Output — LLM's schemas feature (docs/schemas.md) uses JSON Schema to constrain model output for structured extraction; requires understanding schema validation
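To make the schemas concept concrete, here is a hedged sketch of structured extraction through the Python API (the schema= keyword exists in recent llm releases per docs/schemas.md; the model ID is an example only):

```python
import json
import llm

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

model = llm.get_model("gpt-4o-mini")                 # illustrative model ID
response = model.prompt("Invent a fictional person", schema=schema)
print(json.loads(response.text()))                   # dict constrained by the schema
```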
🔗Related repos
- anthropics/anthropic-sdk-python — Official SDK for Anthropic Claude that LLM wraps; useful to understand the underlying provider API
- openai/openai-python — Official OpenAI SDK that LLM integrates with; reference implementation for model provider integration patterns
- datasette/datasette — Sibling project by same maintainer; SQLite-based data exploration tool that shares similar philosophy of CLI+Python API duality
- jsfiddle/jsfiddle — No — use instead: cortexproject/cortex for distributed LLM inference patterns, or the vLLM repo for local model serving infrastructure
- ollama/ollama — Complementary local LLM serving tool; LLM can be configured to call Ollama-served models via provider plugins
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add comprehensive tests for llm/embeddings.py and llm/embeddings_migrations.py
The embeddings module is a core feature (with dedicated docs in docs/embeddings/) but there's no visible test coverage in the file structure. Given the complexity of storage, migrations, and plugin integration for embeddings, this module needs unit and integration tests to prevent regressions.
- [ ] Create tests/test_embeddings.py with tests for embedding model registration and inference
- [ ] Create tests/test_embeddings_migrations.py with migration-specific tests
- [ ] Add tests for embeddings storage backends and querying
- [ ] Verify test coverage with pytest and add to CI if not already present
Add integration tests for plugin hooks in tests/test_plugin_hooks.py
The repo has extensive plugin architecture (llm/hookspecs.py, llm/plugins.py) and plugin documentation (docs/plugins/plugin-hooks.md), but no dedicated test file for verifying that all hookspecs work correctly. This ensures plugins can reliably extend the system.
- [ ] Create tests/test_plugin_hooks.py to test each hook in llm/hookspecs.py
- [ ] Add tests for plugin registration, discovery, and lifecycle
- [ ] Test the default plugins (llm/default_plugins/default_tools.py, openai_models.py) against their hooks
- [ ] Add tests for edge cases like missing hooks and plugin conflicts
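A possible starting point for the first checklist item, sketched under the assumption that llm/plugins.py exposes the pluggy manager as pm plus a load_plugins() helper, and that the register_models hookspec passes each plugin a register callable; verify all three against the source before adopting:

```python
# tests/test_plugin_hooks.py (hypothetical); names unverified against the codebase.
from llm import plugins

def test_register_models_collects_default_models():
    plugins.load_plugins()          # make sure default plugins are registered
    collected = []
    # Some plugins call register(model, aliases=...), so accept both call shapes.
    plugins.pm.hook.register_models(
        register=lambda model, aliases=None: collected.append(model)
    )
    assert collected, "expected at least one model from the default plugins"
```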
Add docstring generation tests and validation to ensure docs/python-api.md stays in sync
The repository has docs/python-api.md documenting the Python API, but there's no automated way to ensure that code docstrings in llm/__init__.py and llm/models.py match the documentation. This prevents API documentation drift.
- [ ] Create tests/test_api_docs.py that validates public APIs in llm/__init__.py have complete docstrings
- [ ] Add tests to ensure all public Model, Tool, and Embedding classes have docstrings matching docs/python-api.md patterns
- [ ] Create a CI check that warns if docstrings are missing or outdated
- [ ] Consider integrating with sphinx autodoc to auto-validate documentation completeness
🌿Good first issues
- Add missing tests for the embeddings storage backends (docs/embeddings/storage.md describes multiple backends but test coverage likely incomplete for each one)
- Document the exact plugin entry point structure with a working example in docs/plugins/ (tutorial-model-plugin.md exists but could be more explicit about pyproject.toml configuration required)
- Add type hints to llm/cli.py and llm/__init__.py to improve IDE support and catch parameter errors earlier (754KB of Python suggests partial typing)
⭐Top contributors
- @simonw — 93 commits
- @github-actions[bot] — 3 commits
- @claude — 1 commit
- @eedeebee — 1 commit
- @ar-jan — 1 commit
📝Recent commits
- 3d0321f — Add options= dict parameter to .prompt() and .reply() (#1432) (simonw)
- 9a5c24e — Release 0.32a1 (simonw)
- 4d92df1 — Rebuild prompt.messages chain when loading logged conversations (simonw)
- cce6ed9 — Ran cog (simonw)
- 35c35da — Release 0.32a0 (simonw)
- 31a4ea7 — prompt(messages=[]) and response.stream_events() refactor (simonw)
- 926394a — Tweaked some overly-promotional language (simonw)
- 838d557 — Test to_dict() does not emit keys absent from the TypedDict (simonw)
- 02c9af0 — A bunch of documentation edits (simonw)
- 5789bc9 — It's actually the 0.32 alpha (simonw)
🔒Security observations
The codebase demonstrates reasonable security practices with no critical vulnerabilities identified. Main concerns are outdated dependency versions in documentation requirements and lack of version pinning for some packages. The subprocess usage in documentation generation is properly configured but could be hardened. Recommend updating dependencies to latest stable versions and adding version constraints to unpinned packages.
- Medium · Subprocess execution with check=True in README generation — README.md (cog block, likely llm/cli.py or docs generation). The README.md file uses subprocess.run() to execute the sphinx-build command. While check=True is properly used, the command construction doesn't show input validation. If the docs directory path could be influenced by user input, this could be a command injection vector. Fix: Ensure all path components are validated and sanitized. Use absolute paths or validate relative paths. Consider using subprocess with shell=False (which is already the default) and passing arguments as a list rather than a string.
- Low · Outdated Sphinx dependency — docs/requirements.txt (sphinx==7.2.6). sphinx==7.2.6 is pinned to a specific older version from September 2023. While pinning versions is good practice, this specific version may contain known vulnerabilities and newer versions exist. Fix: Review the changelog for sphinx versions between 7.2.6 and the current stable release. Update to the latest stable version if security patches are available. Consider using ~=7.2 for minor version flexibility while maintaining compatibility.
- Low · Outdated theme dependency — docs/requirements.txt (furo==2023.9.10). furo==2023.9.10 is pinned to a specific older version from September 2023. No known critical vulnerabilities identified, but the package may be outdated. Fix: Update to the latest stable version of furo. Check release notes for security improvements and compatibility updates.
- Low · Missing dependency pinning in docs/requirements.txt — unpinned packages. Several dependencies lack version pinning: sphinx-autobuild, sphinx-copybutton, and cogapp. This could lead to unexpected behavior or security issues if incompatible or vulnerable versions are installed. Fix: Pin all dependencies to specific versions. Run 'pip freeze' to identify current compatible versions and add them to requirements.txt (e.g., sphinx-autobuild==2023.3.23).
- Low · Code generation tool (cogapp) usage — README.md header comment, docs structure. The project uses cogapp (cog) to generate README.md from documentation. Code generation tools can be a vector for injection if the source content is not properly sanitized. While this appears to be legitimate documentation generation, ensure documentation inputs are validated. Fix: Verify that all content fed into the cog generator is sanitized and validated. Keep cogapp updated. Ensure git hooks prevent committing unvalidated generated content.
LLM-derived; treat as a starting point, not a security audit.
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.