GaiZhenbiao/ChuanhuChatGPT
GUI for ChatGPT API and many LLMs. Supports agents, file-based QA, GPT finetuning and query with web search. All with a neat UI.
Mixed signals — read the receipts
Weakest axis: copyleft license (GPL-3.0) — review compatibility; no tests detected
- ✓ Last commit 1w ago
- ✓ 12 active contributors
- ✓ GPL-3.0 licensed
- ✓ CI configured
- ⚠ Concentrated ownership — top contributor handles 58% of recent commits
- ⚠ GPL-3.0 is copyleft — check downstream compatibility
- ⚠ No test directory detected
What would change the summary?
- Use as dependency: Concerns → Mixed if relicensed under MIT/Apache-2.0 (rare for established libs)
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Embed the "Forkable" badge
Paste into your README — live-updates from the latest cached analysis.
[Forkable badge](https://repopilot.app/r/gaizhenbiao/chuanhuchatgpt) — paste at the top of your README.md; it renders inline like a shields.io badge.
Social card (1200×630): auto-renders when someone shares https://repopilot.app/r/gaizhenbiao/chuanhuchatgpt on X, Slack, or LinkedIn.
Onboarding doc
Onboarding: GaiZhenbiao/ChuanhuChatGPT
Generated by RepoPilot · 2026-05-07 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in the "Verify before trusting" section below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
- Treat the "AI · unverified" sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/GaiZhenbiao/ChuanhuChatGPT shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
WAIT — Mixed signals — read the receipts
- Last commit 1w ago
- 12 active contributors
- GPL-3.0 licensed
- CI configured
- ⚠ Concentrated ownership — top contributor handles 58% of recent commits
- ⚠ GPL-3.0 is copyleft — check downstream compatibility
- ⚠ No test directory detected
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live GaiZhenbiao/ChuanhuChatGPT
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/GaiZhenbiao/ChuanhuChatGPT.
What it runs against: a local clone of GaiZhenbiao/ChuanhuChatGPT — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in GaiZhenbiao/ChuanhuChatGPT | Confirms the artifact applies here, not a fork |
| 2 | License is still GPL-3.0 | Catches relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 37 days ago | Catches sudden abandonment since generation |
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of GaiZhenbiao/ChuanhuChatGPT. If you don't
# have one yet, run these first:
#
# git clone https://github.com/GaiZhenbiao/ChuanhuChatGPT.git
# cd ChuanhuChatGPT
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }
# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
echo "FAIL: not inside a git repository. cd into your clone of GaiZhenbiao/ChuanhuChatGPT and re-run."
exit 2
fi
# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "GaiZhenbiao/ChuanhuChatGPT(\.git)?\b" \
  && ok "origin remote is GaiZhenbiao/ChuanhuChatGPT" \
  || miss "origin remote is not GaiZhenbiao/ChuanhuChatGPT (artifact may be from a fork)"
# 2. License matches what RepoPilot saw. A GPL-3.0 LICENSE file carries the
# GNU GPL preamble, not the SPDX ID, so match the preamble text.
(grep -qi "GNU General Public License" LICENSE 2>/dev/null \
  && grep -q "Version 3" LICENSE 2>/dev/null \
  || grep -qiE "\"license\"\s*:\s*\"GPL-3\.0\"" package.json 2>/dev/null) \
  && ok "license is GPL-3.0" \
  || miss "license drift — was GPL-3.0 at generation time"
# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
  && ok "default branch main exists" \
  || miss "default branch main no longer exists"
# 4. Critical files exist
for f in ChuanhuChatbot.py modules/webui.py modules/models/base_model.py \
         modules/models/models.py modules/config.py; do
  test -f "$f" && ok "$f" || miss "missing critical file: $f"
done
# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 37 ]; then
ok "last commit was $days_since_last days ago (artifact saw ~7d)"
else
miss "last commit was $days_since_last days ago — artifact may be stale"
fi
echo
if [ "$fail" -eq 0 ]; then
echo "artifact verified (0 failures) — safe to trust"
else
echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/GaiZhenbiao/ChuanhuChatGPT"
exit 1
fi
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
⚡TL;DR
ChuanhuChat is a Gradio-based web UI that provides a unified interface for 30+ LLMs (ChatGPT, Claude, Gemini, DeepSeek, local models via Ollama, etc.) with enterprise features including file-based RAG, web search integration, agent orchestration via LangChain, GPT fine-tuning, and vision capabilities. It solves the fragmentation problem of managing multiple LLM APIs and local deployments through a single polished interface.

It is a monolithic Gradio application: ChuanhuChatbot.py is the entry point that builds the UI from Gradio components. Model implementations live in modules/models/ behind a base class (base_model.py) providing a unified API abstraction; 25+ provider-specific classes (OpenAI, Claude, Ollama, etc.) inherit from it. Core utilities sit in modules/ (config.py, index_func.py), localization JSON files in locale/ handle i18n, and UI assets and styling come from Gradio's built-in theming plus custom CSS in the top-level web_assets/ directory.
👥Who it's for
Data scientists and AI researchers who need to experiment with multiple LLM backends simultaneously; DevOps engineers deploying LLM services in production; non-technical users wanting a ChatGPT-like interface for locally-hosted models without writing API code. Contributors are primarily Chinese developers and the single maintainer GaiZhenbiao.
🌱Maturity & risk
Actively developed, with a 5.0 major release including a UI overhaul, PWA support, and mobile optimization. No obvious test suite appears in the file structure, but CI/CD pipelines exist (.github/workflows/ for Docker builds and releases). The sizeable Python codebase and multi-language support (8 locales) suggest production-grade maturity, though testing infrastructure appears minimal. Verdict: production-ready but with testing gaps.
High dependency count (40+ packages including fragile ones like faiss-cpu==1.7.4, pinned protobuf==3.20.3, and version-locked openai==1.16.2) creates upgrade friction and potential CVE exposure. Single maintainer creates bus-factor risk. No visible CI test runs in workflows (only Docker builds), raising confidence concerns about breaking changes. LLM API dependencies (OpenAI, Anthropic, Google) mean service outages cascade directly to users.
Active areas of work
Version 5.0 major release active: new frosted-glass UI, mobile/notch support, left sidebar history with search/regex, PWA installability, auto-naming via LLM, and GPT-3.5 fine-tuning. Workflows indicate ongoing Docker image optimization (Build_Docker.yml, Release_docker.yml). Repository supports latest models (GPT-5, DeepSeek R1, Claude 3.5) suggesting continuous model provider integration.
🚀Get running
git clone https://github.com/GaiZhenbiao/ChuanhuChatGPT.git
cd ChuanhuChatGPT
pip install -r requirements.txt  # inferred from the dependency snippet
cp config_example.json config.json
# Edit config.json with your API keys (OpenAI, Anthropic, etc.)
python ChuanhuChatbot.py
# Open http://localhost:7860 in browser
Daily commands:
python ChuanhuChatbot.py
Gradio automatically launches on http://localhost:7860. For Docker: docker build -t chuanhu-chat . && docker run -p 7860:7860 chuanhu-chat (see Dockerfile). Environment variables required: API keys (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.) in config.json or .env.
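A minimal config.json might look like the sketch below. The key names here are assumptions for illustration — treat config_example.json in the repo as the authoritative reference for the real schema:

```json
{
  "openai_api_key": "sk-...",
  "server_name": "0.0.0.0",
  "server_port": 7860
}
```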
🗺️Map of the codebase
- ChuanhuChatbot.py — Main entry point that initializes the Gradio web UI and orchestrates the application lifecycle
- modules/webui.py — Core UI layer that builds all Gradio interface components, chat tabs, and file-handling features
- modules/models/base_model.py — Abstract base class defining the interface all 30+ LLM implementations must follow
- modules/models/models.py — Model registry and factory that instantiates the correct LLM provider based on user selection
- modules/config.py — Configuration loader for API keys, model settings, and application parameters from config.json
- modules/presets.py — System prompts, default settings, and conversation templates used across all chat sessions
- requirements.txt — Declares 40+ dependencies including Gradio, LangChain, OpenAI, and specialized LLM SDKs
🛠️How to make changes
Add a New LLM Provider
- Create a new model class in modules/models/ inheriting from base_model.BaseModel (modules/models/NewProvider.py)
- Implement the required methods: chat(), completion(), get_answer(), count_tokens() (modules/models/NewProvider.py)
- Register it in the model factory by adding an entry to MODEL_MAP in models.py (modules/models/models.py)
- Add a provider UI selector and config fields in webui.py — model dropdown and API key input (modules/webui.py)
- Add an example API key placeholder in config_example.json (config_example.json)
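The provider steps above can be sketched as follows. Note that `BaseModel` here is a stand-in: the real base class and its method names live in modules/models/base_model.py and should be checked before copying this shape.

```python
# Illustrative adapter-pattern sketch; class and method names are assumptions,
# not verified against modules/models/base_model.py.

class BaseModel:
    """Stand-in for the shared base class in modules/models/base_model.py."""

    def __init__(self, model_name: str):
        self.model_name = model_name

    def get_answer_at_once(self, prompt: str) -> str:
        raise NotImplementedError


class NewProviderClient(BaseModel):
    """Hypothetical adapter for a new LLM provider."""

    def __init__(self, model_name: str, api_key: str):
        super().__init__(model_name)
        self.api_key = api_key  # would authenticate the vendor SDK

    def get_answer_at_once(self, prompt: str) -> str:
        # A real implementation would call the provider's SDK here.
        return f"[{self.model_name}] echo: {prompt}"


# Mirrors the "register an entry in MODEL_MAP" step.
MODEL_MAP = {"new-provider": NewProviderClient}

client = MODEL_MAP["new-provider"]("new-provider-v1", api_key="sk-demo")
print(client.get_answer_at_once("hello"))  # → [new-provider-v1] echo: hello
```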
Add a New Chat Tab / Feature
- Create a new Gradio Tab() block in the main interface layout function (modules/webui.py)
- Implement tab-specific callback functions for submit/upload/clear buttons (modules/webui.py)
- If the feature needs backend logic (like RAG), add utility functions in modules/ (modules/index_func.py or modules/pdf_func.py)
- Add localization strings for the tab name and buttons in the locale JSON files (locale/en_US.json)
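The callback step can be sketched as a pure function in the style Gradio wires to buttons — it takes current component values and returns the new ones. The function and wiring names below are illustrative, not taken from webui.py:

```python
# Illustrative submit handler: clears the textbox and appends a (user, reply)
# pair to the chatbot history.

def on_submit(message: str, history: list) -> tuple:
    reply = f"echo: {message}"  # tab-specific backend logic would go here
    return "", history + [(message, reply)]

# In webui.py this would be wired up roughly as:
#   submit_btn.click(on_submit, [textbox, chatbot], [textbox, chatbot])
new_text, new_history = on_submit("hi", [])
print(new_history)  # → [('hi', 'echo: hi')]
```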
Add Web Search / External Integration
- Add a new provider class inheriting from base_model.BaseModel if it's a new model, or extend ChuanhuAgent (modules/models/ChuanhuAgent.py)
- Integrate the third-party library (e.g. duckduckgo-search, google-search-results) in utils or the agent (modules/utils.py)
- Add a toggle/checkbox in webui.py for the feature with a corresponding config key (modules/webui.py)
- Update presets.py with any new system prompts mentioning the integration capability (modules/presets.py)
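One way the toggle step could fold search results into the prompt before it reaches the model is sketched below. This is illustrative only — the function name and prompt format are not from the repo:

```python
# Hypothetical search-augmentation step: when the toggle is on, prepend the
# top search hits to the user's question.

def build_prompt(user_msg: str, search_enabled: bool, search_fn) -> str:
    if not search_enabled:
        return user_msg
    hits = search_fn(user_msg)[:3]  # search_fn would wrap e.g. duckduckgo-search
    context = "\n".join(f"- {h}" for h in hits)
    return f"Web results:\n{context}\n\nQuestion: {user_msg}"

fake_search = lambda q: ["result one", "result two"]
print(build_prompt("what changed?", True, fake_search))
```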
Customize UI / Add JavaScript Behavior
- Add a custom HTML template in web_assets/html/ (web_assets/html/custom_feature.html)
- Create or update JavaScript in web_assets/javascript/ for interactivity (web_assets/javascript/custom_feature.js)
- Include the JavaScript in webui.py using gr.HTML() blocks (modules/webui.py)
- Add CSS styling in appearance_switcher.html or via Gradio theme customization (web_assets/html/appearance_switcher.html)
🔧Why these technologies
- Gradio 4.29.0 — Provides rapid web UI scaffolding with built-in chat components, file upload, and Blocks API for complex layouts; removes need for custom Flask/React frontend
- LangChain 0.1.14 + LangChain-OpenAI — Unified interface for 30+ LLM providers, prompt templating, memory management, and agent orchestration; reduces per-provider boilerplate
- FAISS (faiss-cpu 1.7.4) — CPU-efficient vector similarity search for RAG (retrieval-augmented generation) without GPU dependency; enables document QA
- OpenAI SDK 1.16.2 + Anthropic 0.18.1 — Official SDKs for popular closed-source LLMs; maintains API contract stability and native streaming support
- FastAPI 0.112.4 — Lightweight async web framework underlying Gradio for custom API extensions and multi-part file uploads
- PyPDF2 + pdfplumber — Dual approach for robust PDF text extraction
🪤Traps & gotchas
- API key management: each LLM provider requires separate API credentials in config.json; missing keys silently disable that provider.
- Protobuf pin: protobuf==3.20.3 is pinned to avoid breaking changes in Google Cloud dependencies (e.g. google-cloud-aiplatform); upgrading causes import errors.
- FAISS is CPU-only: faiss-cpu==1.7.4 is pinned; the GPU variant won't auto-install and requires a manual switch.
- Gradio version lock: gradio==4.29.0 is pinned; newer versions may break custom UI assumptions or component APIs.
- LangChain deprecation: the repo uses the older LangChain 0.1.14; newer versions (0.2+) have breaking API changes requiring module rewrites.
- No .env support: config is purely file-based; Docker deployments need volume mounts or environment-variable injection at container start.
🏗️Architecture
💡Concepts to learn
- Retrieval-Augmented Generation (RAG) — Core feature in ChuanhuChat (file-based QA via modules/index_func.py); enables LLMs to answer questions grounded in uploaded documents using FAISS vector search
- Adapter Pattern — Fundamental architecture: modules/models/base_model.py is an adapter that normalizes 25+ heterogeneous LLM APIs (OpenAI, Anthropic, Ollama, etc.) into a unified interface
- Streaming vs Blocking I/O — Each LLM provider implements both _get_answer() (blocking) and _get_stream_answer() methods; critical for responsive UI — streaming used for real-time token display
- Token Counting & Context Windows — ChuanhuChat uses tiktoken library to count prompt tokens and respect model-specific context limits before sending requests; prevents truncation and cost overruns
- Progressive Web App (PWA) — v5.0 feature: ChuanhuChat is installable as PWA via Gradio; enables offline-capable desktop/mobile app experience without native compilation
- FAISS Vector Indexing — Used for semantic search over uploaded documents in RAG pipeline (modules/index_func.py); enables fast approximate nearest-neighbor lookup of document chunks
- Web Search Integration — ChuanhuChat augments LLM responses with real-time search results via DuckDuckGo and Google Search APIs; models can cite current information beyond training cutoff
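To make the RAG retrieval concept concrete, here is a toy version of the index-then-retrieve step. The real pipeline in modules/index_func.py uses FAISS and learned embeddings; this stdlib-only sketch substitutes a letter-frequency "embedding" and a brute-force scan to show the shape of the idea:

```python
import math

# Toy embedding: letter-frequency vector. A stand-in for learned embeddings,
# just enough to make similarity computable.
def embed(text: str) -> list:
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# FAISS replaces this linear scan with approximate nearest-neighbour search.
def retrieve(query: str, chunks: list, k: int = 2) -> list:
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = ["gradio builds the ui", "faiss indexes document chunks", "tokens are counted"]
print(retrieve("faiss document index", chunks, k=1))
```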
🔗Related repos
- THUDM/ChatGLM-6B — Open-source LLM explicitly supported in ChuanhuChat (modules/models/ChatGLM.py); reference for understanding the local-model integration pattern
- hwchase17/langchain — Core dependency for agent orchestration and RAG; ChuanhuChat wraps LangChain agents as a key feature
- openai/openai-python — Primary API client dependency (openai==1.16.2); essential for ChatGPT integration in modules/models/OpenAIChat.py
- anthropics/anthropic-sdk-python — Claude API client (anthropic==0.18.1); one of 30+ supported backends with a dedicated module (modules/models/Claude.py)
- janhq/jan — Competitor UI for multi-LLM management; shares the goal of a unified LLM interface but with a different architecture
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add unit tests for modules/models/base_model.py and model loading logic
The repo supports 20+ LLM models (OpenAI, Claude, Groq, Ollama, etc.) but there are no visible test files. base_model.py is the foundation for all model implementations. Adding pytest-based unit tests for model initialization, error handling, and API response parsing would prevent regressions as new models are added and catch compatibility issues early.
- [ ] Create tests/ directory with conftest.py for fixtures
- [ ] Add tests/test_base_model.py covering initialization, method signatures, and error handling
- [ ] Add tests/test_model_loading.py to verify modules/models/models.py correctly loads each model class
- [ ] Add GitHub Action workflow .github/workflows/run-tests.yml to run pytest on PR submissions
- [ ] Update requirements.txt to include pytest and pytest-cov as dev dependencies
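A minimal workflow for the CI step in the checklist above might look like this — the file name matches the checklist, but the Python version and job layout are assumptions:

```yaml
# .github/workflows/run-tests.yml (sketch)
name: tests
on: [pull_request]
jobs:
  pytest:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt pytest pytest-cov
      - run: pytest tests/ --cov=modules
```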
Add integration tests for file-based QA pipeline (modules/pdf_func.py and modules/index_func.py)
The repo advertises 'file-based QA' as a core feature but modules/pdf_func.py (PDF handling) and modules/index_func.py (indexing/retrieval) lack test coverage. With langchain, faiss-cpu, and multiple file processors (PyPDF2, pdfplumber, unstructured), adding integration tests would catch breakage in the document pipeline, embeddings, and vector store retrieval.
- [ ] Create tests/test_pdf_processing.py with sample PDFs to verify PyPDF2 and pdfplumber extraction
- [ ] Create tests/test_indexing.py to test faiss index creation, vector storage, and retrieval against langchain_community loaders
- [ ] Add tests/fixtures/ with sample documents (PDF, DOCX, XLSX) for reproducible testing
- [ ] Add GitHub Action to download test fixtures and run integration tests on PR
- [ ] Document expected behavior in tests/README.md for contributors
Add validation and type hints to modules/config.py with Pydantic schema tests
The repo loads configuration from config_example.json and supports multiple LLM providers with different credential requirements. modules/config.py handles this but likely lacks schema validation. With pydantic==2.5.2 already in dependencies, adding strict ConfigModel definitions and pytest-based schema validation would prevent silent failures from invalid API keys, missing fields, or type mismatches.
- [ ] Create Pydantic models in modules/config.py for each LLM provider (OpenAI, Claude, Groq, etc.) as BaseSettings subclasses
- [ ] Add validation logic to reject missing required fields (API keys) and invalid model names
- [ ] Create tests/test_config_validation.py with pytest fixtures for valid/invalid config JSONs
- [ ] Update modules/config.py docstrings with examples of required config structure for each model
- [ ] Add error messages that guide users to config_example.json when validation fails
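The validation idea can be sketched with the stdlib alone; the actual PR would use pydantic.BaseModel/BaseSettings so errors carry structured field information. Field names here are illustrative — check config_example.json for the real keys:

```python
from dataclasses import dataclass

# Stdlib stand-in for the proposed Pydantic schema. Field names and the
# "sk-" prefix check are assumptions for illustration.

@dataclass
class OpenAIConfig:
    openai_api_key: str
    model: str = "gpt-3.5-turbo"

    def __post_init__(self):
        if not self.openai_api_key:
            raise ValueError("openai_api_key is required — see config_example.json")
        if not self.openai_api_key.startswith("sk-"):
            raise ValueError("openai_api_key should start with 'sk-'")

cfg = OpenAIConfig(openai_api_key="sk-demo")
print(cfg.model)  # → gpt-3.5-turbo
```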
🌿Good first issues
- Add pytest unit tests for modules/models/base_model.py and 2-3 concrete provider implementations (e.g. OpenAIChat, Claude) to catch regressions in the model abstraction — currently no test suite is visible despite 25+ provider classes.
- Implement missing localization strings in the locale/ files (check locale/extract_locale.py for the extraction logic); several UI features added in v5.0 may lack translations for the 7 supported languages beyond Chinese.
- Add provider-specific example configurations to config_example.json with inline comments explaining required API keys, rate limits, and model choices for each of the 30+ supported LLMs (currently minimally documented).
⭐Top contributors
- @GaiZhenbiao — 58 commits
- @Keldos-Li — 24 commits
- @ZhengQinyu — 5 commits
- @BigDataMao — 3 commits
- @octo-patch — 2 commits
📝Recent commits
- c020b2d — fix: yield content from final streaming chunk before stopping (#1198) (octo-patch)
- 6de06a5 — fix: use configured midjourney_temp_folder instead of hardcoded ./tmp (fixes #1100) (#1197) (octo-patch)
- 6cb3e22 — fix: replace bare except clauses with except Exception (#1190) (haosenwang1018)
- 9ce1ae4 — Fix AttributeError when hide_history_when_not_logged_in is True (#1188) (K2064083)
- 0271b07 — Added GPT-5 support (GaiZhenbiao)
- 550fd86 — fix: improve the update-check error messages (Keldos-Li)
- fa2be03 — dev: switch CDN address to cdnjs (Keldos-Li)
- 466dfb3 — Add DeepSeek support (GaiZhenbiao)
- eece834 — Use the latest Gemini models; support custom API host (GaiZhenbiao)
- 3747cdb — Added extra model metadata example in config.json (GaiZhenbiao)
🔒Security observations
- High · Outdated and vulnerable dependencies — requirements.txt, requirements_advanced.txt. Multiple dependencies have known vulnerabilities or are significantly outdated: protobuf==3.20.3, faiss-cpu==1.7.4, gradio==4.29.0 (newer versions may carry security patches), and langchain==0.1.14. Fix: update dependencies to their latest stable versions; run `pip list --outdated`, and scan with tools like `pip install safety && safety check` or bandit.
- High · Insecure protobuf version — requirements.txt (protobuf==3.20.3). This pin has reported vulnerabilities (the scanner cites CVE-2024-24786 and CVE-2023-5237) and should not be used in production. Fix: upgrade to protobuf>=4.23.0 or the latest stable 4.x version.
- High · Hardcoded configuration example — config_example.json. The example file may contain API key or credential patterns that could be accidentally copied into production. Fix: keep real config files (config.json) in .gitignore, use environment-variable-based configuration for all sensitive values, and document that users must never commit actual config files with credentials.
- Medium · No environment-variable validation — modules/config.py (assumed), ChuanhuChatbot.py. The codebase handles multiple API keys (OpenAI, Anthropic, Google, etc.) but shows no evidence of validation or secure credential management. Fix: validate configuration strictly (e.g. python-dotenv plus pydantic), never log or print credentials, and use a secrets manager (AWS Secrets Manager, HashiCorp Vault) in production.
- Medium · FastAPI application without visible security headers — modules/webui.py (assumed), ChuanhuChatbot.py. FastAPI (fastapi==0.112.4) is a dependency, but no security middleware (CORS, CSRF protection, security headers) is visible. Fix: add restrictive CORS middleware, HTTPS enforcement, rate limiting, and security headers via Starlette middleware.
- Medium · Potential LLM prompt injection — modules/models/*.py, modules/webui.py. With multiple LLM integrations and web search, user inputs may not be sanitized before being sent to external APIs. Fix: validate and sanitize input, use parameterized/templated prompts rather than direct concatenation, and apply content filtering to user messages.
- Medium · Insecure dependency installation in Docker — Dockerfile (`RUN pip install --user --no-cache-dir`). Packages are installed without hash verification. Fix: use pip's hash-checking mode (`pip install --require-hashes -r requirements.txt`) and run pip-audit during the build.
- Medium · Arbitrary file upload/processing risk — modules/pdf_func.py, modules/index_func.py. The PDF-processing and file-based QA paths may be vulnerable to path traversal, arbitrary file access, or malicious file processing. Fix: implement strict file-type validation (whitelist allowed extensions) and sanitize upload paths.
LLM-derived; treat as a starting point, not a security audit.
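For the file-upload observation above, a defensive path check might look like the sketch below — the function name, allowed extensions, and placement are illustrative, not from modules/pdf_func.py:

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".pdf", ".docx", ".txt"}

def safe_upload_path(upload_dir: str, filename: str) -> Path:
    """Reject disallowed extensions and strip path-traversal components."""
    name = Path(filename).name  # drops directories: "../../etc/passwd" -> "passwd"
    if Path(name).suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError(f"disallowed file type: {name!r}")
    return Path(upload_dir).resolve() / name

print(safe_upload_path("/tmp/uploads", "report.pdf"))
```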
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.