RepoPilot

nilsherzig/LLocalSearch

LLocalSearch is a completely locally running search aggregator using LLM Agents. The user can ask a question and the system will use a chain of LLMs to find the answer. The user can see the progress of the agents and the final answer. No OpenAI or Google API keys are needed.

Overall: Mixed

Single-maintainer risk — review before adopting

Use as dependency: Mixed (weakest axis)

Top contributor handles 97% of recent commits; no CI workflows detected.

Fork & modify: Healthy

Has a license and tests — a clean foundation to fork and modify.

Learn from: Healthy

Documented and popular — a useful reference codebase to read through.

Deploy as-is: Healthy

No critical CVEs and a sane security posture — runnable as-is.

  • Last commit 6w ago
  • 4 active contributors
  • Apache-2.0 licensed
  • Tests present
  • Small team — 4 contributors active in recent commits
  • Single-maintainer risk — top contributor 97% of recent commits
  • No CI workflows detected
What would change the summary?
  • Use as dependency: Mixed → Healthy if commit ownership diversifies (top contributor <90%)

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Forkable" badge

Paste into your README — live-updates from the latest cached analysis.

Variant: RepoPilot: Forkable
[![RepoPilot: Forkable](https://repopilot.app/api/badge/nilsherzig/llocalsearch?axis=fork)](https://repopilot.app/r/nilsherzig/llocalsearch)

Paste at the top of your README.md — renders inline like a shields.io badge.

Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/nilsherzig/llocalsearch on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: nilsherzig/LLocalSearch

Generated by RepoPilot · 2026-05-09 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

  1. Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
  2. Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
  3. Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/nilsherzig/LLocalSearch shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

🎯Verdict

WAIT — Single-maintainer risk — review before adopting

  • Last commit 6w ago
  • 4 active contributors
  • Apache-2.0 licensed
  • Tests present
  • ⚠ Small team — 4 contributors active in recent commits
  • ⚠ Single-maintainer risk — top contributor 97% of recent commits
  • ⚠ No CI workflows detected

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live nilsherzig/LLocalSearch repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/nilsherzig/LLocalSearch.

What it runs against: a local clone of nilsherzig/LLocalSearch — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in nilsherzig/LLocalSearch | Confirms the artifact applies here, not a fork |
| 2 | License is still Apache-2.0 | Catches relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | Last commit ≤ 74 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>nilsherzig/LLocalSearch</code></summary>
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of nilsherzig/LLocalSearch. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/nilsherzig/LLocalSearch.git
#   cd LLocalSearch
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of nilsherzig/LLocalSearch and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "nilsherzig/LLocalSearch(\.git)?\b" \
  && ok "origin remote is nilsherzig/LLocalSearch" \
  || miss "origin remote is not nilsherzig/LLocalSearch (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "Apache License|Apache-2\.0" LICENSE 2>/dev/null \
   || grep -qiE "\"license\"\s*:\s*\"Apache-2\.0\"" package.json 2>/dev/null) \
  && ok "license is Apache-2.0" \
  || miss "license drift — was Apache-2.0 at generation time"

# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
  && ok "default branch main exists" \
  || miss "default branch main no longer exists"

# 4. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 74 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~44d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/nilsherzig/LLocalSearch"
  exit 1
fi

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>

TL;DR

LLocalSearch is a locally running search aggregator that uses LLM agents (via Ollama or other local LLM backends) to answer questions by recursively choosing from tools like web search, web scraping, and vector DB lookups; no OpenAI/Google API keys are required. It chains multiple LLM calls through backend/agentChain.go and backend/lschains/ to synthesize current web information into final answers, displayed with live agent progress logs in a Svelte frontend.

Monorepo split: backend/ (Go) contains the LLM agent orchestration (agentChain, llm_tools, lschains utilities), and backend/main.go serves an API on :8080; the frontend lives at the root level (Svelte + TypeScript in src/, ESLint/Prettier configs). Docker support comes via Dockerfile and docker-compose.yaml for the full local stack. Utils in backend/utils/ handle LLM backend selection, the vector DB, and prompt templates.

👥Who it's for

Privacy-conscious developers and researchers who want to run a local search-augmented LLM without cloud API dependencies; users frustrated with opaque ranking/manipulation in commercial search (per the README's OpenAI media-bias critique). Contributors should understand Go agents, LangChain patterns, and local LLM orchestration.

🌱Maturity & risk

Experimental and dormant—the README warns 'this version has not been under development for over a year' and announces a private-beta rewrite. The codebase has no visible CI setup (no .github/workflows/), minimal test coverage (only backend/e2e/simple_question_test.go), and the main developer is shifting focus. Not production-ready without significant stabilization.

High risk: single maintainer (nilsherzig) with no active commits visible in metadata; a pinned langchaingo fork (replace github.com/tmc/langchaingo => github.com/nilsherzig/langchaingo v1.99.99) adds maintenance burden; heavy reliance on Ollama/external LLM setup adds operational complexity; test coverage is minimal (only one e2e test file visible); Llama3 stop-word handling remains patched only on the experiments branch, not merged.

Active areas of work

The project is in stasis awaiting a rewrite/relaunch in private beta. Recent activity appears dormant; the roadmap mentions Llama3 support (patch pending langchaingo maintainer feedback), interface overhaul (Obsidian-inspired), and chat history support, but no active branch updates are visible. Developer is gathering feedback privately before a public relaunch.

🚀Get running

git clone https://github.com/nilsherzig/LLocalSearch.git
cd LLocalSearch
cp env-example .env
make  # Or: docker-compose -f docker-compose.dev.yaml up

Note: requires Ollama running locally (see OLLAMA_GUIDE.md), Go 1.22.0+, and Node.js for frontend dev.

Daily commands:

  • Backend: cd backend && go run main.go (runs on :8080, requires .env with LLM config).
  • Frontend: npm install && npm run dev (Svelte dev server on :5173, per package.json).
  • Full stack: docker-compose -f docker-compose.dev.yaml up (see docker-compose.dev.yaml for Ollama + backend + frontend orchestration).

🗺️Map of the codebase

  • backend/agentChain.go: Core agent loop that orchestrates LLM calls, tool selection, and recursive refinement—the heart of the search aggregation logic.
  • backend/lschains/ollama_functioncall.go: Integrates structured function-calling (tool invocation) with Ollama; critical for tool selection in the agent loop.
  • backend/llm_tools/simple_websearch.go: Example tool showing how agents invoke external search; template for adding new tools.
  • backend/utils/llm_backends.go: Abstraction layer for swapping LLM backends (Ollama, custom servers); key for extensibility.
  • backend/apiServer.go: WebSocket/HTTP API bridging backend agent chain to frontend; handles real-time agent progress streaming.
  • backend/utils/prompts.go: Central prompt templates for agent instructions, tool descriptions, and output formatting—easy lever for behavior tuning.
  • Dockerfile: Production Docker image with Go runtime; needed for containerized deployment.
  • docker-compose.dev.yaml: Dev environment orchestration (Ollama, backend, frontend, hot-reload); quickest local setup.

🛠️How to make changes

  • New search tool: add a file to backend/llm_tools/ (mirror the simple_websearch.go structure).
  • Agent logic: edit backend/agentChain.go to change the tool-selection loop.
  • Prompts: modify backend/utils/prompts.go.
  • LLM backend support: extend backend/utils/llm_backends.go.
  • Frontend UI: edit .svelte files in src/ (Svelte component-driven, TypeScript type-safe).
  • Vector DB: extend backend/utils/vector_db_handler.go.

🪤Traps & gotchas

  • Ollama dependency: requires Ollama running separately on localhost:11434 (or a custom OLLAMA_URL in .env); no built-in fallback.
  • LangChain fork: uses a custom langchaingo fork pinned to v1.99.99; upstream updates require a manual merge.
  • Llama3 stop-words: baseline llama3 support is broken (per the README); the experiments branch has a patch, but no CI validates it.
  • Vector DB init: backend/utils/load_localfiles.go and the vector DB handler expect pre-seeded embeddings; no auto-setup visible.
  • Go 1.22 constraint: older toolchains will fail; uses a go.work workspace (a Go 1.18+ feature).
  • No environment docs: .env setup is minimal; LLM model selection, search API keys (if needed), and the vector DB path are not documented in-repo.

💡Concepts to learn

  • Agent Loop (Tool Use in LLMs) — The core pattern in backend/agentChain.go—LLM generates structured tool calls, system executes them, feeds results back; understanding this loop is critical to modifying how agents reason and search.
  • Chain-of-Thought Prompting — LLocalSearch uses detailed system prompts (in backend/utils/prompts.go) to guide step-by-step reasoning; knowing how prompt structure affects LLM behavior is essential for tuning answers.
  • Retrieval-Augmented Generation (RAG) — The vector DB tool (backend/llm_tools/tool_search_vector_db.go) implements RAG; understanding embedding-based retrieval vs. keyword search informs how the agent sources information.
  • Function Calling / Structured Output from LLMs — LLocalSearch relies on backend/lschains/ollama_functioncall.go to parse tool invocations from LLM text; understanding JSON/schema-based LLM output is crucial for adding new tools.
  • WebSocket Real-Time Streaming — The frontend receives live agent progress updates via WebSocket (see backend/apiServer.go); understanding the event stream protocol is needed to modify progress UI or add new logging.
  • Vector Embeddings & Semantic Search — The vector DB tool uses embeddings to find semantically similar documents; knowing embedding space and similarity metrics (cosine, etc.) helps debug retrieval quality.
  • Stop Sequences / Token Control in LLMs — The README mentions Llama3 stop-word handling issues (patch on experiments branch); understanding stop tokens prevents LLM hallucination and controls when tool calls terminate.
  • tmc/langchaingo — The official LangChain-Go library that LLocalSearch wraps and forks; understanding its agent/tool abstractions is essential to extending this codebase.
  • ollama/ollama — The local LLM runtime that powers LLocalSearch's inference; essential dependency to run and test the project locally.
  • jina-ai/reader — Alternative local web-scraping + LLM tool that solves similar 'augment local LLM with live data' problem; shares philosophy of API-free operation.
  • mistralai/mistral-src — Mistral LLM codebase; relevant if extending LLocalSearch to support Mistral models or fine-tuning agents for local inference.
  • run-llama/llama_index — Competitor RAG/indexing framework (Python-based); LLocalSearch uses vector DB lookups similarly; useful reference for vector search patterns.
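To make the similarity-metric point concrete, here is a minimal cosine-similarity function of the kind the vector DB tool's retrieval step relies on (standalone illustration, not code from the repo):

```go
package main

import (
	"fmt"
	"math"
)

// Cosine similarity between two embedding vectors: the usual metric behind
// the "find semantically similar documents" retrieval step described above.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0 // a zero vector has no direction; define similarity as 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

func main() {
	query := []float64{0.8, 0.2, 0.0}
	doc := []float64{0.9, 0.1, 0.0}
	unrelated := []float64{0.0, 0.1, 0.9}
	fmt.Printf("query vs doc:       %.3f\n", cosine(query, doc))
	fmt.Printf("query vs unrelated: %.3f\n", cosine(query, unrelated))
}
```

Near-identical directions score close to 1, orthogonal ones close to 0; when retrieval quality looks off, printing these scores for the top hits is a quick sanity check.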

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add comprehensive integration tests for LLM tool chain execution

The repo has only one e2e test (backend/e2e/simple_question_test.go) that tests basic question flow. Given the complexity of agent chains and tool calling (agentChain.go, lschains/ollama_functioncall.go, llm_tools/*), there should be tests for: error handling when tools fail, multiple sequential tool calls, vector DB search integration, and web scraping edge cases. This ensures reliability as new contributors add features.

  • [ ] Create backend/e2e/agent_chain_test.go with tests for multi-step tool execution
  • [ ] Add test for backend/lschains/ollama_functioncall.go parsing edge cases and malformed LLM responses
  • [ ] Add backend/e2e/tool_integration_test.go testing tool_websearch.go, tool_webscrape.go, and tool_search_vector_db.go with mock data
  • [ ] Update Makefile test target to run all e2e tests and report coverage
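A sketch of the parsing edge cases the second checklist item targets. The ToolCall struct and parseToolCall below are invented stand-ins; the real parsing lives in backend/lschains/ollama_functioncall.go:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// ToolCall is a hypothetical shape for a structured tool invocation.
type ToolCall struct {
	Tool  string `json:"tool"`
	Input string `json:"input"`
}

// parseToolCall tolerates the common failure mode where the model wraps
// its JSON in prose, and rejects structurally invalid calls.
func parseToolCall(raw string) (ToolCall, error) {
	start := strings.Index(raw, "{")
	end := strings.LastIndex(raw, "}")
	if start == -1 || end <= start {
		return ToolCall{}, fmt.Errorf("no JSON object in model output")
	}
	var tc ToolCall
	if err := json.Unmarshal([]byte(raw[start:end+1]), &tc); err != nil {
		return ToolCall{}, fmt.Errorf("malformed tool call: %w", err)
	}
	if tc.Tool == "" {
		return ToolCall{}, fmt.Errorf("tool call missing tool name")
	}
	return tc, nil
}

func main() {
	wrapped := "Sure! Here is the call: {\"tool\": \"websearch\", \"input\": \"llama 3\"} Let me know!"
	fmt.Println(parseToolCall(wrapped)) // parses despite surrounding prose
	fmt.Println(parseToolCall("I cannot answer that."))
}
```

Table-driven tests over inputs like these (prose-wrapped JSON, prose only, missing fields, truncated objects) cover most of what local models actually emit.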

Add GitHub Actions CI/CD workflow for Go backend testing and linting

The .github/ISSUE_TEMPLATE folder exists but no workflows are present. The backend (go.mod, lschains/, llm_tools/) needs automated testing on push/PR to prevent regressions. Current setup has .air.toml for local dev but no CI enforcement of code quality or test passage.

  • [ ] Create .github/workflows/backend-test.yml that runs 'go test ./...' and 'go vet ./...' on all backend code
  • [ ] Add golangci-lint step to check for code quality issues in backend/
  • [ ] Ensure workflow tests against go 1.22.0 and 1.22.1 (per go.mod toolchain specification)
  • [ ] Add workflow badge to README.md to signal build status
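A hedged sketch of what .github/workflows/backend-test.yml could look like; action versions and the backend/ paths are assumptions to verify against the repo layout before committing:

```yaml
# Sketch of .github/workflows/backend-test.yml; verify action versions
# and the backend/ layout before committing.
name: backend-test
on:
  push:
    paths: ["backend/**"]
  pull_request:
    paths: ["backend/**"]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        go-version: ["1.22.0", "1.22.1"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: ${{ matrix.go-version }}
      - name: Vet
        run: go vet ./...
        working-directory: backend
      - name: Test
        run: go test ./...
        working-directory: backend
      - name: Lint
        uses: golangci/golangci-lint-action@v4
        with:
          working-directory: backend
```

The paths filter keeps frontend-only PRs from burning CI minutes on Go jobs; drop it if you want every PR gated on the backend suite.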

Create backend/llm_tools/README.md documenting tool interface and add tests for tool registration

The llm_tools/ directory contains three tool implementations (simple_websearch.go, tool_webscrape.go, tool_search_vector_db.go) but there's no specification document for how new tools should be structured or registered. Contributors adding new tools will have to reverse-engineer the pattern. Additionally, there are no unit tests for tool initialization and parameter validation.

  • [ ] Create backend/llm_tools/README.md documenting the Tool interface, required methods, and step-by-step guide for adding new tools
  • [ ] Add backend/llm_tools/tool_interface_test.go with unit tests for all three existing tools verifying they implement required signatures
  • [ ] Add parameter validation tests to ensure each tool correctly validates input before execution
  • [ ] Document the tool registration flow in agentChain.go with inline comments and reference in README.md

🌿Good first issues

  • Add unit tests for backend/llm_tools/simple_websearch.go and backend/llm_tools/tool_webscrape.go (currently only one e2e test exists; these tools need isolated mocking). Start in backend/llm_tools/*_test.go following Ginkgo/Gomega patterns in backend/e2e/.
  • Document the .env configuration file: create ENV_SETUP.md explaining OLLAMA_URL, MODEL_NAME, VECTOR_DB_PATH, and LLM backend options; reference backend/utils/llm_backends.go to list all supported backends and their config keys.
  • Add a simple integration test for the vector DB search tool (backend/llm_tools/tool_search_vector_db.go): mock a vector DB with 3-5 sample documents, verify the tool returns ranked results. Use the pattern from backend/e2e/simple_question_test.go and Ginkgo's Describe/It blocks.


📝Recent commits

  • acda048 — Update FUNDING.yml (nilsherzig)
  • 6ef2739 — added warning (nilsherzig)
  • 8a7cf44 — cleanup (nilsherzig)
  • a7fd329 — only hash ip not port (nilsherzig)
  • eaa48ae — fix mime types in nginx (nilsherzig)
  • de4cd30 — some nginx buffering things (nilsherzig)
  • 7ab278c — makefile: fix (nilsherzig)
  • 65b422e — using static site and nginx (nilsherzig)
  • 52dc9e5 — more static testing (nilsherzig)
  • 4329a9d — custom frontend server (nilsherzig)

🔒Security observations

  • High · Outdated Go Dependencies with Known Vulnerabilities — backend/go.mod, backend/go.sum. The go.mod file specifies golang.org/x/tools v0.17.0 and other dependencies that may contain known security vulnerabilities. The project uses Go 1.22.0/1.22.1, and several transitive dependencies (klauspost/compress v1.17.2, github.com/dlclark/regexp2, etc.) are pinned to older versions that may have unpatched CVEs. Fix: Run 'go get -u ./...' to update dependencies, then review and test. Run 'govulncheck ./...' to identify known CVEs. Pin versions to latest patch releases that have security fixes.
  • High · Missing Input Validation in LLM Tools — backend/llm_tools/simple_websearch.go, backend/llm_tools/tool_webscrape.go. Files like backend/llm_tools/simple_websearch.go and backend/llm_tools/tool_webscrape.go likely perform web scraping and search operations based on LLM agent decisions. Without proper input sanitization, malicious LLM outputs could lead to SSRF attacks, command injection, or requests to internal services. Fix: Implement URL validation and whitelisting. Validate and sanitize all URLs before making HTTP requests. Use URL parsing libraries to prevent SSRF. Implement timeout and rate limiting on web requests.
  • High · Potential XSS Vulnerability in Frontend — src/lib/log_item.svelte, src/lib/sources.svelte, src/lib/log_node.svelte. Svelte components (src/lib/log_item.svelte, src/lib/sources.svelte) may render user or LLM-generated content without proper escaping. If logs or sources contain unsanitized HTML/JavaScript, XSS attacks are possible. Fix: Ensure all dynamic content is escaped using Svelte's built-in escaping (avoid {@html} unless absolutely necessary with sanitized input). Use DOMPurify or similar library if HTML rendering is required. Implement Content Security Policy headers.
  • High · Unrestricted Backend Service Exposure — docker-compose.yaml, backend/apiServer.go. The backend service in docker-compose.yaml has no exposed ports restriction or authentication mechanism visible. The backend communicates with OLLAMA_HOST, CHROMA_DB_URL, and SEARXNG_DOMAIN without apparent authentication or TLS verification. Fix: Add authentication (API keys, OAuth) to the backend API. Use TLS/HTTPS for all external service communications. Implement rate limiting and request validation. Do not expose backend directly; use nginx as reverse proxy with authentication.
  • High · Missing CORS and Security Headers Configuration — nginx.conf. The nginx.conf configuration is present but not shown in detail. Without proper CORS restrictions, security headers (CSP, X-Frame-Options, etc.), the frontend could be vulnerable to cross-origin attacks and clickjacking. Fix: Configure strict CORS policies allowing only trusted origins. Add security headers: Content-Security-Policy, X-Frame-Options: DENY, X-Content-Type-Options: nosniff, Strict-Transport-Security.
  • Medium · Unencrypted Service-to-Service Communication — docker-compose.yaml. Docker services (backend, chromadb, searxng) communicate over internal network without enforced TLS. If any service is compromised, other services' traffic could be intercepted. Fix: Implement TLS/SSL for inter-service communication. Use Docker secrets for sensitive credentials. Consider using service mesh (Istio) for production deployments.
  • Medium · Redis Running Without Authentication — docker-compose.yaml (redis service). The Redis service in docker-compose.yaml has no requirepass or authentication configured, only basic save and loglevel settings. If Redis is exposed, it can be accessed without credentials. Fix: Set 'requirepass' in Redis configuration or use Redis ACLs. Use environment variables for passwords. Bind Redis to localhost only. Consider using redis-stack or managed Redis with built-in authentication.

LLM-derived; treat as a starting point, not a security audit.
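As a starting point for the SSRF observation above, a minimal URL-validation sketch in Go (a hypothetical helper, not repo code; a production mitigation must also pin the resolved IP for the actual request, or an attacker can win via DNS rebinding):

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// validateOutboundURL accepts only http(s) URLs whose host does not
// resolve to a loopback, private, or link-local address.
func validateOutboundURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("unparseable URL: %w", err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return fmt.Errorf("scheme %q not allowed", u.Scheme)
	}
	ips, err := net.LookupIP(u.Hostname())
	if err != nil {
		return fmt.Errorf("cannot resolve %q: %w", u.Hostname(), err)
	}
	for _, ip := range ips {
		if ip.IsLoopback() || ip.IsPrivate() || ip.IsLinkLocalUnicast() {
			return fmt.Errorf("%s resolves to internal address %s", u.Hostname(), ip)
		}
	}
	return nil
}

func main() {
	// 203.0.113.10 is a documentation-range (public-looking) literal, so
	// this demo needs no DNS lookup over the network.
	for _, u := range []string{
		"https://203.0.113.10/",
		"http://127.0.0.1:11434/",
		"file:///etc/passwd",
	} {
		fmt.Printf("%-26s %v\n", u, validateOutboundURL(u))
	}
}
```

A check like this would sit in front of every outbound request in the webscrape/websearch tools, combined with per-request timeouts and rate limits as the observation suggests.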


Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.
