RepoPilot

haotian-liu/LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Overall: Mixed. Stale, last commit 2y ago.

  • Use as dependency: Mixed. Last commit was 2y ago; no CI workflows detected.
  • Fork & modify: Healthy. Has a license and tests; a clean foundation to fork and modify.
  • Learn from: Healthy. Documented and popular; a useful reference codebase to read through.
  • Deploy as-is: Mixed. Last commit was 2y ago; no CI workflows detected.

  • Stale — last commit 2y ago
  • Concentrated ownership — top contributor handles 79% of recent commits
  • No CI workflows detected
  • 15 active contributors
  • Apache-2.0 licensed
  • Tests present

What would improve this?

  • Use as dependency: Mixed → Healthy if: 1 commit in the last 365 days
  • Deploy as-is: Mixed → Healthy if: 1 commit in the last 180 days

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Onboarding: haotian-liu/LLaVA

Generated by RepoPilot · 2026-06-21 · Source

🎯Verdict

WAIT — Stale — last commit 2y ago

  • 15 active contributors
  • Apache-2.0 licensed
  • Tests present
  • ⚠ Stale — last commit 2y ago
  • ⚠ Concentrated ownership — top contributor handles 79% of recent commits
  • ⚠ No CI workflows detected

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

TL;DR

LLaVA is a multimodal large language model that combines vision and language understanding, trained via visual instruction tuning with the stated aim of GPT-4V-level capabilities. It follows instructions about images end to end through a connector (a projection layer) between a vision encoder (CLIP) and a language model (LLaMA-family backbones, plus MPT and Mistral variants), trained on curated visual instruction data (LLaVA-Instruct-150K, which includes an 80K subset).

The architecture is modular:

  • llava/model/: the unified model implementation, including the connector that bridges vision and language
  • llava/eval/: the evaluation suite (model_vqa.py, model_qa.py, eval_gpt_review.py for benchmarking)
  • docs/Data.md and docs/Finetune_Custom_Data.md: training data and fine-tuning configuration
  • .devcontainer/: DevContainer setup for reproducible environments
  • llava/conversation.py and llava/constants.py: prompt templates and shared constants, kept separate from the model code

👥Who it's for

ML researchers and engineers building production vision-language systems; organizations seeking open-source alternatives to GPT-4V for image understanding tasks; fine-tuners who need to adapt the model to custom vision-language domains using the llava/eval/ evaluation pipelines and docs/Finetune_Custom_Data.md patterns.

🌱Maturity & risk

Mature as a research artifact, but no longer actively maintained in this repository. The paper was a NeurIPS 2023 Oral, the repo has 18K+ stars, the README announces follow-ups through 2024 (LLaVA-NeXT in May 2024, the LMMs-Eval pipeline), and deployment paths are documented (Replicate, HuggingFace Spaces, AutoGen integration). However, the last commit is about 2 years old, and .github/ holds issue templates rather than CI workflows, so nothing is tested automatically on change. Treat it as a reference implementation rather than a maintained production dependency.

Moderate risk. In its favor: academic and industry authorship (University of Wisconsin-Madison, Microsoft Research, Columbia University), widely used foundational dependencies (transformers, PyTorch), and well-documented model weights on HuggingFace. Against: dependency versions will drift from current transformers and PyTorch releases with no upstream fixes; evaluation reproducibility depends on an external service (GPT-4 via the OpenAI API in llava/eval/eval_gpt_review.py); checkpoints and prompt formats differ across v1, v1.5, and NeXT, as documented in docs/.

Active areas of work

Little activity in this repository itself: the recent commits are largely README and documentation updates plus a few merged community patches. The README points to LLaVA-NeXT (LLaMA-3 and Qwen variants, video modality support, April 2024), which is developed in LLaVA-VL/LLaVA-NeXT, and to the LMMs-Eval framework for standardized evaluation. Platform notes remain in docs/ (Intel in docs/Intel.md, macOS in docs/macOS.md, Windows in docs/Windows.md), alongside community integrations (llama.cpp, Colab, AutoGen).

🚀Get running

git clone https://github.com/haotian-liu/LLaVA.git
cd LLaVA
pip install -e .
python -m llava.serve.cli --model-path liuhaotian/llava-v1.5-7b --image-file <image.jpg>

Or use DevContainer: code --new-window . with DevContainer extension, which runs postCreateCommand.sh to install dependencies.

Daily commands:

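# CLI inference (the CLI expects an image to talk about)
python -m llava.serve.cli --model-path liuhaotian/llava-v1.5-7b --image-file <image.jpg>

# Batch evaluation (VQA)
python llava/eval/model_vqa.py --model-path liuhaotian/llava-v1.5-7b --question-file <qa.json> --image-folder <images/> --answers-file <output.json>

# Fine-tune on custom data (see docs/Finetune_Custom_Data.md)
# Training is launched through the DeepSpeed shell scripts under scripts/ rather than by calling train.py directly.
# Edit --model_name_or_path, --data_path, and --image_folder inside the script before running it.
bash scripts/v1_5/finetune_task.sh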
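
The batch evaluation command reads a question file. Below is a minimal sketch of producing one, assuming the script reads JSON Lines records with `question_id`, `image`, and `text` keys; check llava/eval/model_vqa.py in your checkout before relying on that. The helper name `write_question_file` and the sample questions are made up for illustration.

```python
import json
import os
import tempfile


def write_question_file(path, samples):
    """Write one JSON object per line (JSONL) and return the record count."""
    with open(path, "w", encoding="utf-8") as f:
        for i, (image, text) in enumerate(samples):
            # Field names are an assumption about what the eval script reads.
            record = {"question_id": i, "image": image, "text": text}
            f.write(json.dumps(record) + "\n")
    return len(samples)


samples = [
    ("cat.jpg", "What animal is shown?"),
    ("street.jpg", "How many cars are visible?"),
]
out_path = os.path.join(tempfile.gettempdir(), "questions.jsonl")
n_written = write_question_file(out_path, samples)

# Read the file back to confirm every line parses on its own.
with open(out_path, encoding="utf-8") as f:
    rows = [json.loads(line) for line in f]
print(n_written, rows[0])
```

Image paths in the file are relative to whatever is passed as --image-folder.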

🗺️Map of the codebase

  • llava/model/builder.py — Core model initialization and loading logic; entry point for instantiating LLaVA models with vision encoders and language models
  • llava/model/llava_arch.py — Base LLaVA architecture class defining the multimodal integration between vision and language; fundamental to understanding model composition
  • llava/mm_utils.py — Multimodal utilities for image processing and token handling; critical for preparing visual inputs for the model
  • llava/conversation.py — Conversation templating and message formatting; essential for inference API and chat interactions
  • llava/model/multimodal_encoder/clip_encoder.py — CLIP-based vision encoder implementation; handles visual feature extraction from images
  • llava/serve/model_worker.py — Model serving worker that exposes inference endpoints; critical infrastructure for deployment and API access
  • llava/eval/run_llava.py — Inference and evaluation runner; demonstrates how to use the model end-to-end for generation tasks

🛠️How to make changes

Add support for a new language model backbone

  1. Create a new language model wrapper in llava/model/language_model/ following the pattern of llava_llama.py: subclass the backbone's causal-LM class (LlamaForCausalLM in the LLaMA case) and add the LLaVA-specific forward logic (llava/model/language_model/llava_newmodel.py)
  2. Register the new model in the builder by adding import and conditional logic to instantiate your model class (llava/model/builder.py)
  3. Add model-specific conversation templates if needed (e.g., different prompt formats, chat templates) (llava/conversation.py)
  4. Test end-to-end inference using the eval runner script (llava/eval/run_llava.py)
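
Step 2 amounts to dispatching on the model name. The sketch below shows that idea with a small registry; it is not the code in llava/model/builder.py, which may well use plain conditionals instead. Every name here (`BACKBONE_LOADERS`, `register_backbone`, `pick_loader`, the `load_*` functions) is hypothetical, and the loaders return strings as stand-ins for real model objects.

```python
from typing import Callable, Dict

# Hypothetical registry: lowercase name fragment -> loader function.
BACKBONE_LOADERS: Dict[str, Callable[[str], str]] = {}


def register_backbone(key: str):
    """Decorator that records a loader under a lowercase name fragment."""
    def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
        BACKBONE_LOADERS[key.lower()] = fn
        return fn
    return decorator


@register_backbone("mpt")
def load_mpt(model_path: str) -> str:
    return f"mpt-backbone:{model_path}"


@register_backbone("newmodel")
def load_newmodel(model_path: str) -> str:
    # Stand-in for instantiating the wrapper created in step 1.
    return f"newmodel-backbone:{model_path}"


def load_llama(model_path: str) -> str:
    # Default path when no registered fragment matches.
    return f"llama-backbone:{model_path}"


def pick_loader(model_name: str) -> Callable[[str], str]:
    """Return the first registered loader whose key occurs in the name."""
    name = model_name.lower()
    for key, loader in BACKBONE_LOADERS.items():
        if key in name:
            return loader
    return load_llama


print(pick_loader("llava-NewModel-7b")("ckpt"))
```

Matching on name fragments is fragile (a checkpoint directory named differently falls through to the default), which is one reason to test with the exact model path you plan to ship.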

Add a new vision encoder

  1. Create encoder implementation in llava/model/multimodal_encoder/ (e.g., llava_siglip_encoder.py) subclassing CLIPVisionModel or similar (llava/model/multimodal_encoder/clip_encoder.py)
  2. Update the encoder builder factory to recognize and instantiate your encoder based on model config (llava/model/multimodal_encoder/builder.py)
  3. If image preprocessing differs, update mm_utils.py with new image_processor logic (llava/mm_utils.py)
  4. Update constants.py if token dimensions or special tokens change (llava/constants.py)

Deploy a new evaluation benchmark

  1. Create a new eval script in llava/eval/ (e.g., eval_newbench.py) that loads your dataset and formats queries (llava/eval/eval_newbench.py)
  2. Use run_llava.py pattern to iterate over samples, call model inference, and collect outputs (llava/eval/run_llava.py)
  3. Implement metric calculation (accuracy, F1, etc.) or integrate GPT-4 review like eval_gpt_review.py for subjective evaluation (llava/eval/eval_gpt_review.py)
  4. Store results in llava/eval/table/results/ directory following JSONL format (llava/eval/table/results/)
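
Steps 3 and 4 can be sketched as follows, assuming a benchmark scored by case-insensitive exact match and results stored one JSON object per line. The function names and record fields are illustrative, not taken from the repo.

```python
import io
import json


def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to the reference after trimming and lowercasing."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return hits / len(references)


def dump_jsonl(records, stream):
    """Write each record as one JSON line to an open text stream."""
    for record in records:
        stream.write(json.dumps(record) + "\n")


preds = ["A cat", "two", "red"]
refs = ["a cat", "three", "Red"]
acc = exact_match_accuracy(preds, refs)  # 2 of 3 match

buf = io.StringIO()  # stands in for a file under the results directory
dump_jsonl([{"question_id": i, "text": p} for i, p in enumerate(preds)], buf)
print(round(acc, 3))
```

Subjective benchmarks need a judge model instead of a string comparison, which is where the GPT-4 review scripts come in.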

Extend the serving API with custom endpoints

  1. Add new request/response classes and route handlers in model_worker.py or create a new worker subclass (llava/serve/model_worker.py)
  2. Register routes in the controller for request routing and load distribution (llava/serve/controller.py)
  3. Update gradio_web_server.py if adding interactive UI components for your new endpoint (llava/serve/gradio_web_server.py)
  4. Document conversation format expectations and update conversation.py if custom prompt templating is needed (llava/conversation.py)

🔧Why these technologies

  • CLIP Vision Encoder — Proven feature extraction from diverse images with semantic understanding; seamlessly integrates with transformers
  • LLaMA / Mistral Language Models — Open-source, instruction-tuned LLMs with strong reasoning; compatible with LoRA and efficient fine-tuning
  • Projection Layer (MLP/Linear) — Bridges dimensionality gap between vision (e.g., 1024-dim CLIP) and language (e.g., 4096-dim LLaMA) embeddings
  • Gradio Web UI — Rapid prototyping of interactive demos without frontend engineering; built-in image upload and text streaming
  • Multi-worker architecture with controller — Enables distributed inference, load balancing, and easy scaling across GPUs
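
To get a feel for how small the projection layer is next to the language model, here is a parameter count for the example dimensions above (1024-dim vision features, 4096-dim language embeddings). The two-layer variant is assumed to be Linear, activation, Linear with a hidden width equal to the language dimension; the function names are illustrative.

```python
def linear_params(d_in: int, d_out: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return d_in * d_out + d_out


def projector_params(vision_dim: int, lm_dim: int, kind: str = "linear") -> int:
    """Parameter count of a vision-to-language projector."""
    if kind == "linear":
        return linear_params(vision_dim, lm_dim)
    if kind == "mlp2x":
        # Linear -> activation -> Linear; the activation has no parameters.
        return linear_params(vision_dim, lm_dim) + linear_params(lm_dim, lm_dim)
    raise ValueError(f"unknown projector kind: {kind}")


print(projector_params(1024, 4096, "linear"))  # 4198400
print(projector_params(1024, 4096, "mlp2x"))   # 20979712
```

Either way the projector is in the millions of parameters, against billions in the language model, which is why training it alone is cheap.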

⚖️Trade-offs already made

  • Freeze vision encoder, train only projection + language model

    • Why: Reduces training compute and data requirements; leverages pre-trained CLIP features
    • Consequence: Limited vision-specific adaptation; model cannot learn domain-specific visual features beyond CLIP's capabilities
  • Use delta weights (model - base LM) for distribution

    • Why: Significantly reduces model size for redistribution; avoids licensing issues with base LLaMA weights
    • Consequence: Requires base model download and assembly step; more complex deployment than monolithic checkpoint
  • Conversation templates per model variant

    • Why: Accommodates different tokenizers and chat formats (LLaMA vs. Mistral vs. MPT)
    • Consequence: Added complexity in prompt engineering; easy to apply wrong template and degrade performance
  • Inference-only deployment (no fine-tuning in serve/)

    • Why: Simplifies production setup; training is offline in separate pipeline
    • Consequence: Cannot quickly adapt to new domains in-service; requires retraining and redeployment
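
The delta-weight trade-off rests on simple arithmetic: the distributed file holds (model - base), so the user recovers the model as base + delta. A toy sketch with plain lists standing in for tensors follows; the real assembly step must also handle parameters that exist only on one side, which this version deliberately rejects.

```python
def apply_delta(base, delta):
    """Recover target weights as base + delta, parameter by parameter."""
    if base.keys() != delta.keys():
        raise KeyError("base and delta must contain the same parameter names")
    merged = {}
    for name, base_vals in base.items():
        delta_vals = delta[name]
        if len(base_vals) != len(delta_vals):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [b + d for b, d in zip(base_vals, delta_vals)]
    return merged


# Toy values chosen to be exactly representable in floating point.
base = {"layer.weight": [0.5, -1.0, 2.0]}
delta = {"layer.weight": [0.25, 0.5, -1.0]}
merged = apply_delta(base, delta)
print(merged)  # {'layer.weight': [0.75, -0.5, 1.0]}
```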

🚫Non-goals (don't propose these)

  • Real-time video understanding (processes static images only)
  • Multi-turn conversation state persistence across deployments
  • Training distributed across multiple nodes (training scripts not provided in serve/)
  • Support for proprietary vision encoders (CLIP and open variants only)
  • Automatic hyperparameter tuning or NAS
  • Built-in A/B testing or experiment

🪤Traps & gotchas

  1. Model weights are large (7B to 13B parameters): the default download can fail on bandwidth-constrained networks. The --load-4bit and --load-8bit flags in llava/serve/cli.py reduce GPU memory at load time, not download size.
  2. Vision encoder compatibility: switching from CLIP-ViT-L to CLIP-ViT-G requires retraining the connector; mismatched encoders silently produce degraded outputs.
  3. Conversation template mismatch: llava/conversation.py defines multiple conversation templates; using the wrong template for a model variant (v1 vs v1.5) breaks prompt formatting.
  4. CUDA memory: fp16 weights alone take about 2 bytes per parameter, so roughly 14 GB for 7B and 26 GB for 13B before activations and KV cache; plan on quantization for smaller GPUs. DevContainer defaults to CPU if no GPU is detected.
  5. Data format strictness: llava/eval/model_vqa_loader.py expects a precise JSON structure; malformed annotations cause silent skips.
  6. GPT-4 evaluation: llava/eval/eval_gpt_review.py requires the OPENAI_API_KEY env var and an active OpenAI account; a missing key fails silently.
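
A back-of-the-envelope check for the memory gotcha: weight storage is parameter count times bits per parameter. The sketch counts weights only, in decimal gigabytes, and ignores activations, the KV cache, the vision encoder, and quantization overhead, so treat the results as lower bounds.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


for label, n_params in [("7B", 7e9), ("13B", 13e9)]:
    for bits in (16, 8, 4):
        gb = weight_memory_gb(n_params, bits)
        print(f"{label} at {bits}-bit: {gb:.1f} GB")
```

By this estimate a 13B model at 16-bit needs about 26 GB for weights, which is why the 4-bit and 8-bit loading flags matter on consumer GPUs.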

🏗️Architecture

💡Concepts to learn

  • Visual Instruction Tuning — The core training paradigm in this repo: SFT on image-question-answer triplets (not just text). Understanding this distinction from standard LLM fine-tuning is essential for modifying training objectives.
  • Connector Architecture (Vision-Language Fusion): LLaVA feeds features from a frozen CLIP encoder into the LLM through a learned projection, a single linear layer in v1 and a two-layer MLP in v1.5. The LLM itself is updated during instruction tuning. Knowing where this bottleneck sits explains performance tradeoffs and fine-tuning constraints.
  • LoRA (Low-Rank Adaptation) — Documented in docs/LoRA.md as the efficient fine-tuning method for LLaVA; reduces memory from full training while preserving model quality—critical for resource-constrained setups.
  • Token-Level Image Encoding (Patch Embedding): CLIP produces a sequence of patch embeddings, and each one becomes a token in the LLM context; the encoder code lives in llava/model/multimodal_encoder/. Higher input resolution means more image tokens, which eats into the context length.
  • Modality Transfer / Zero-Shot Cross-Modal Generalization — LLaVA-NeXT demonstrates image-trained models work on video zero-shot; understanding why this works requires knowledge of visual feature redundancy and temporal coherence assumptions.
  • Instruction Following & In-Context Learning: The model's ability to follow natural language instructions (vs. task-specific fine-tuning) depends on SFT data diversity; docs/Data.md lists the instruction datasets (LLaVA-Instruct-150K and its subsets, including the 80K one) that enable this.
  • Quantization (4-bit, 8-bit) — LLaVA supports bitsandbytes quantization flags (--load-4bit, --load-8bit) for inference; understanding memory/speed tradeoffs is crucial for deployment decisions in resource-constrained environments.

🔗Related repos

  • OpenGVLab/InternVL: Competing vision-language model with similar instruction-tuning approach but stronger vision encoder (InternViT); shares same training philosophy
  • THUDM/CogVLM — Alternative VLM combining visual experts with fine-grained vision understanding; different architecture (cross-attention vs connector) solving same problem
  • LLaVA-VL/LLaVA-NeXT — Official continuation/fork with stronger backbones (LLaMA-3, Qwen-1.5) and video support; where active development now lives
  • lmms-lab/lmms-eval — Standardized evaluation framework extracted from LLaVA team; essential for benchmarking custom LLaVA fine-tunes across datasets
  • ggerganov/llama.cpp — Community-maintained CPU/quantized inference for LLaVA; enables deployment on edge devices without GPU
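
For the token-level image encoding concept above, the number of image tokens a ViT-style encoder contributes follows directly from the input resolution and patch size. A small sketch, assuming square images, non-overlapping patches, and no class token passed on to the LLM:

```python
def num_patch_tokens(image_size: int, patch_size: int) -> int:
    """Patch tokens for a square image split into non-overlapping square patches."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    side = image_size // patch_size
    return side * side


print(num_patch_tokens(224, 14))  # 256
print(num_patch_tokens(336, 14))  # 576
```

Going from 224 px to 336 px at patch size 14 more than doubles the image tokens, which is the context-length cost of higher resolution.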

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add comprehensive unit tests for llava/conversation.py

The conversation.py module is central to LLaVA's instruction tuning pipeline, but there are no visible test files in the repo. Adding unit tests would ensure conversation formatting, message handling, and templating work correctly across different model configurations. This is critical since conversation format bugs would silently degrade model quality.

  • [ ] Create tests/test_conversation.py with fixtures for different conversation templates
  • [ ] Test message appending, formatting, and prompt generation for both single and multi-turn scenarios
  • [ ] Test edge cases like empty conversations, special tokens, and different roles (user/assistant/system)
  • [ ] Run tests in CI pipeline (see next idea) to catch regressions
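
A sketch of what such tests could look like. `ToyConversation` is a stand-in written for this example, not the class in llava/conversation.py; real tests would import the repo's templates and assert on their actual separators and role names.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ToyConversation:
    """Hypothetical stand-in for a conversation template."""
    system: str
    roles: Tuple[str, str] = ("USER", "ASSISTANT")
    messages: List[Tuple[str, Optional[str]]] = field(default_factory=list)
    sep: str = " "

    def append_message(self, role: str, message: Optional[str]) -> None:
        self.messages.append((role, message))

    def get_prompt(self) -> str:
        parts = [self.system]
        for role, message in self.messages:
            # A None message leaves the turn open for the model to complete.
            parts.append(f"{role}:" if message is None else f"{role}: {message}")
        return self.sep.join(parts)


def test_prompt_ends_with_open_assistant_turn():
    conv = ToyConversation(system="A chat.")
    conv.append_message(conv.roles[0], "Describe the image.")
    conv.append_message(conv.roles[1], None)
    assert conv.get_prompt() == "A chat. USER: Describe the image. ASSISTANT:"


def test_empty_conversation_is_just_the_system_prompt():
    assert ToyConversation(system="A chat.").get_prompt() == "A chat."


def test_instances_do_not_share_message_lists():
    first = ToyConversation(system="A chat.")
    first.append_message("USER", "hi")
    assert ToyConversation(system="A chat.").messages == []
```

The last test guards against a classic dataclass bug, a mutable default shared across instances, which would leak messages between conversations.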

Create GitHub Actions CI workflow for Python linting and unit tests

The repo has .github/ISSUE_TEMPLATE configured but no visible GitHub Actions workflows (.github/workflows/). With multiple Python files across llava/eval/ and llava/ modules, automated testing on PR submission would catch regressions early, particularly in the evaluation scripts (eval_gpt_review.py, model_qa.py, etc.) that are frequently modified.

  • [ ] Create .github/workflows/tests.yml with Python 3.8+ matrix
  • [ ] Add linting checks using flake8/pylint on llava/ directory
  • [ ] Run pytest on tests/ directory (created in the previous PR idea)
  • [ ] Add checks for the eval scripts in llava/eval/ to ensure they maintain backward compatibility
  • [ ] Document the CI requirements in CONTRIBUTING.md

Extract evaluation metrics logic from llava/eval/*.py into reusable modules

The eval directory contains multiple similar evaluation scripts (eval_science_qa.py, eval_textvqa.py, eval_pope.py) that likely share metric computation logic (BLEU, CIDEr, accuracy calculations). Consolidating shared metric functions into llava/eval/metrics.py would reduce duplication, improve maintainability, and make it easier for contributors to add new evaluation benchmarks.

  • [ ] Analyze common metric computations across eval_science_qa.py, eval_textvqa.py, and m4c_evaluator.py
  • [ ] Create llava/eval/metrics.py with extracted functions (e.g., compute_accuracy, compute_vqa_accuracy, etc.)
  • [ ] Refactor eval scripts to import from metrics.py instead of redefining logic
  • [ ] Add type hints and docstrings to metrics.py functions
  • [ ] Update docs/Evaluation.md to document the available metric functions for new benchmark additions
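
As an illustration of what a shared metrics module might hold, here is a simplified form of the soft accuracy commonly used for VQA-style benchmarks, where an answer scores min(matches / 3, 1) against the human answers. The function names are the hypothetical ones from the checklist, and the normalization is deliberately minimal; the repo's own evaluators may normalize answers and aggregate differently.

```python
def compute_vqa_accuracy(prediction: str, human_answers) -> float:
    """Simplified VQA soft accuracy: min(number of matching human answers / 3, 1)."""
    pred = prediction.strip().lower()
    matches = sum(1 for answer in human_answers if answer.strip().lower() == pred)
    return min(matches / 3.0, 1.0)


def compute_accuracy(scores) -> float:
    """Mean of per-sample scores; 0.0 for an empty list."""
    scores = list(scores)
    return sum(scores) / len(scores) if scores else 0.0


answers = ["cat", "cat", "kitten", "Cat", "dog"]
full = compute_vqa_accuracy("cat", answers)       # 3 matches -> 1.0
partial = compute_vqa_accuracy("kitten", answers) # 1 match -> 1/3
none = compute_vqa_accuracy("bird", answers)      # 0 matches -> 0.0
print(compute_accuracy([full, partial, none]))
```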

🌿Good first issues

  • Add missing evaluation metric for CIDEr score in llava/eval/m4c_evaluator.py: currently only implements BLEU/METEOR; CIDEr is used in MMBench but not computed locally, requiring external service call
  • Extend llava/conversation.py to support system-level instruction prefixes (e.g., role-playing prompts) by adding a system_prompt field to Conversation dataclass—currently only supports assistant/user messages
  • Document the exact connector architecture differences between v1 (linear projection) and v1.5 (two-layer MLP) in llava/model/multimodal_projector/builder.py with inline comments; the transition is mentioned in README but not explained in code

📝Recent commits

  • c121f04 — Update README.md (ChunyuanLI)
  • ec3a32d — Update README.md (ChunyuanLI)
  • 3e337ad — Update Evaluation.md (#1358) (ZhaoyangLi-nju)
  • dcda07f — Add Support for S^2 (#1376) (bfshi)
  • cfde6e0 — Merge pull request #1256 from ThugJudy/main (haotian-liu)
  • 890a5d1 — Merge pull request #1342 from Priyanshupareek/patch-1 (haotian-liu)
  • 19d8945 — Merge pull request #1367 from IlianP/patch-1 (haotian-liu)
  • 24ac78a — Merge pull request #1401 from nfrumkin/patch-1 (haotian-liu)
  • ad9fb95 — Update docstring for unpad_image (nfrumkin)
  • 30ca553 — Update MODEL_ZOO.md (IlianP)

🔒Security observations

The LLaVA codebase presents moderate security concerns primarily around external API integrations, model loading mechanisms, and input

  • Medium · Potential Hardcoded Credentials in Environment Files · .devcontainer/devcontainer.env. The presence of the .devcontainer/devcontainer.env file suggests environment variable configuration. If this file contains API keys, credentials, or sensitive tokens (such as OpenAI API keys for GPT-based evaluations), committing it to version control would expose them. Fix: Keep real values out of .devcontainer/devcontainer.env, or list the file in .gitignore and commit a template instead. Use GitHub Secrets or similar mechanisms for CI/CD pipelines. Document required environment variables without including actual values in the repository.
  • High · Unrestricted Remote API Calls in Evaluation Scripts — llava/eval/eval_gpt_review*.py, llava/eval/qa_baseline_gpt35.py. Multiple evaluation scripts (eval_gpt_review.py, eval_gpt_review_bench.py, eval_gpt_review_visual.py, qa_baseline_gpt35.py) make calls to external APIs (GPT-4, GPT-3.5). Without proper input validation and rate limiting, these could be vulnerable to API key injection, SSRF attacks, or denial of service through uncontrolled API consumption. Fix: Implement strict input validation for all API parameters. Add rate limiting and request throttling. Use API key rotation mechanisms. Implement timeout controls on all external API calls. Validate response integrity from external services.
  • Medium · Potential Arbitrary Code Execution via Model Loading — llava/model/builder.py, llava/model/language_model/. The model builder (llava/model/builder.py) and language model files load pre-trained models. If model sources are not verified or if there's no integrity checking, attackers could serve malicious model files containing code execution payloads. Fix: Implement cryptographic verification of model checksums (SHA-256/Blake2). Use HTTPS-only for model downloads. Pin specific model versions. Implement sandboxing for model loading operations. Verify model sources and maintain a whitelist of trusted repositories.
  • Medium · Unvalidated File Input Processing — llava/mm_utils.py, llava/eval/model_vqa_loader.py. The codebase processes images and data files (llava/mm_utils.py handles multimodal data). Without strict file type validation and size limits, this could lead to denial of service attacks or processing of malicious files. Fix: Implement strict file type validation using magic bytes, not just extensions. Set file size limits. Validate image dimensions and formats. Use secure file handling libraries. Implement timeout controls for file processing.
  • Medium · Missing Dependency Lock File — Root directory (missing dependency lock file). No package lock file (requirements.txt, Pipfile.lock, or poetry.lock) is visible in the provided structure. This could allow installation of vulnerable dependency versions during deployment. Fix: Create and maintain a requirements.txt or pyproject.toml with pinned versions. Use pip-audit or similar tools to scan for known vulnerabilities. Implement automated dependency updates with security scanning in CI/CD.
  • Low · Potential Information Disclosure via Error Messages — llava/eval/eval_gpt_review.py, llava/eval/model_qa.py, llava/eval/model_vqa.py. Evaluation scripts that interact with external APIs may expose API endpoint details, model names, or request structures in error messages or logs, potentially aiding reconnaissance. Fix: Implement proper exception handling that logs full details only in debug mode. Sanitize error messages returned to users. Avoid exposing API endpoints, versions, or internal configuration in user-facing output.
  • Low · Docker Build Without Security Best Practices — .devcontainer/Dockerfile. The Dockerfile in .devcontainer/Dockerfile may not include security hardening measures such as non-root user execution, multi-stage builds, or minimal base images. Fix: Use minimal base images (alpine/distroless). Run containers as non-root user. Implement layer caching properly to avoid re-running operations. Scan Docker images with Trivy or similar tools. Use read-only filesystems where possible.
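
The file-input finding suggests validating by magic bytes rather than by extension. A minimal sketch of that check for PNG and JPEG, using their standard file signatures; the size limit, the function names, and the choice of accepted formats are assumptions for illustration, and the check does not prove that the rest of the file decodes cleanly.

```python
# Standard leading bytes for the two formats accepted in this sketch.
IMAGE_SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
}
MAX_BYTES = 20 * 1024 * 1024  # arbitrary example limit


def sniff_image_type(data: bytes):
    """Return 'png' or 'jpeg' based on leading bytes, or None if unrecognized."""
    for kind, signature in IMAGE_SIGNATURES.items():
        if data.startswith(signature):
            return kind
    return None


def validate_image_bytes(data: bytes, max_bytes: int = MAX_BYTES) -> str:
    """Reject oversized or unrecognized payloads before any decoding happens."""
    if len(data) > max_bytes:
        raise ValueError("file too large")
    kind = sniff_image_type(data)
    if kind is None:
        raise ValueError("unrecognized image signature")
    return kind


print(validate_image_bytes(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))  # png
```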

LLM-derived; treat as a starting point, not a security audit.

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

  1. Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
  2. Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
  3. Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/haotian-liu/LLaVA shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live haotian-liu/LLaVA repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/haotian-liu/LLaVA.

What it runs against: a local clone of haotian-liu/LLaVA — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in haotian-liu/LLaVA | Confirms the artifact applies here, not a fork |
| 2 | License is still Apache-2.0 | Catches relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 665 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>haotian-liu/LLaVA</code></summary>
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of haotian-liu/LLaVA. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/haotian-liu/LLaVA.git
#   cd LLaVA
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of haotian-liu/LLaVA and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "haotian-liu/LLaVA(\.git)?\b" \
  && ok "origin remote is haotian-liu/LLaVA" \
  || miss "origin remote is not haotian-liu/LLaVA (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
# Accept either the SPDX identifier or the standard "Apache License" header text.
(grep -qiE "Apache-2\.0|Apache License" LICENSE 2>/dev/null \
   || grep -qiE "\"license\"\s*:\s*\"Apache-2\.0\"" package.json 2>/dev/null) \
  && ok "license is Apache-2.0" \
  || miss "license drift: was Apache-2.0 at generation time"

# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
  && ok "default branch main exists" \
  || miss "default branch main no longer exists"

# 4. Critical files exist
test -f "llava/model/builder.py" \
  && ok "llava/model/builder.py" \
  || miss "missing critical file: llava/model/builder.py"
test -f "llava/model/llava_arch.py" \
  && ok "llava/model/llava_arch.py" \
  || miss "missing critical file: llava/model/llava_arch.py"
test -f "llava/mm_utils.py" \
  && ok "llava/mm_utils.py" \
  || miss "missing critical file: llava/mm_utils.py"
test -f "llava/conversation.py" \
  && ok "llava/conversation.py" \
  || miss "missing critical file: llava/conversation.py"
test -f "llava/model/multimodal_encoder/clip_encoder.py" \
  && ok "llava/model/multimodal_encoder/clip_encoder.py" \
  || miss "missing critical file: llava/model/multimodal_encoder/clip_encoder.py"

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 665 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~635d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/haotian-liu/LLaVA"
  exit 1
fi

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>

Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.
