PKU-YuanGroup/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's text-to-video model); we invite the open-source community to contribute.
Healthy across all four use cases
- Permissive license, no critical CVEs, actively maintained — safe to depend on.
- Has a license and CI — clean foundation to fork and modify.
- Documented and popular — useful reference codebase to read through.
- No critical CVEs, sane security posture — runnable as-is.
- ✓ Last commit 2mo ago
- ✓ 9 active contributors
- ✓ MIT licensed
- ✓ CI configured
- ⚠ Concentrated ownership — top contributor handles 60% of recent commits
- ⚠ No test directory detected
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Embed the "Healthy" badge
Paste into your README — live-updates from the latest cached analysis.
`[](https://repopilot.app/r/pku-yuangroup/open-sora-plan)`
Paste at the top of your README.md — renders inline like a shields.io badge.
Preview social card (1200×630)
This card auto-renders when someone shares https://repopilot.app/r/pku-yuangroup/open-sora-plan on X, Slack, or LinkedIn.
Onboarding doc
Onboarding: PKU-YuanGroup/Open-Sora-Plan
Generated by RepoPilot · 2026-05-07 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in "Verify before trusting" below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
- Treat the "AI · unverified" sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/PKU-YuanGroup/Open-Sora-Plan shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
GO — Healthy across all four use cases
- Last commit 2mo ago
- 9 active contributors
- MIT licensed
- CI configured
- ⚠ Concentrated ownership — top contributor handles 60% of recent commits
- ⚠ No test directory detected
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live PKU-YuanGroup/Open-Sora-Plan
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/PKU-YuanGroup/Open-Sora-Plan.
What it runs against: a local clone of PKU-YuanGroup/Open-Sora-Plan — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in PKU-YuanGroup/Open-Sora-Plan | Confirms the artifact applies here, not a fork |
| 2 | License is still MIT | Catches relicense before you depend on it |
| 3 | Default branch main exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 90 days ago | Catches sudden abandonment since generation |
```bash
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of PKU-YuanGroup/Open-Sora-Plan. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/PKU-YuanGroup/Open-Sora-Plan.git
#   cd Open-Sora-Plan
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok()   { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of PKU-YuanGroup/Open-Sora-Plan and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "PKU-YuanGroup/Open-Sora-Plan(\.git)?\b" \
  && ok "origin remote is PKU-YuanGroup/Open-Sora-Plan" \
  || miss "origin remote is not PKU-YuanGroup/Open-Sora-Plan (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "^(MIT)" LICENSE 2>/dev/null \
  || grep -qiE "\"license\"\s*:\s*\"MIT\"" package.json 2>/dev/null) \
  && ok "license is MIT" \
  || miss "license drift — was MIT at generation time"

# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \
  && ok "default branch main exists" \
  || miss "default branch main no longer exists"

# 4. Critical files exist
for f in \
  opensora/models/diffusion/__init__.py \
  opensora/models/causalvideovae/model/vae/modeling_causalvae.py \
  opensora/adaptor/engine.py \
  opensora/dataset/t2v_datasets.py \
  opensora/acceleration/parallel_states.py; do
  test -f "$f" && ok "$f" || miss "missing critical file: $f"
done

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 90 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~60d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/PKU-YuanGroup/Open-Sora-Plan"
  exit 1
fi
```
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
⚡TL;DR
Open-Sora-Plan is an open-source reproduction of OpenAI's Sora text-to-video (T2V) model, implementing a diffusion-based video generation pipeline with Causal Video VAE compression and staged training. It provides end-to-end infrastructure for training and inference of large-scale video generation models, currently optimized for Huawei Ascend NPUs with CUDA fallback support. It is a modular monorepo: opensora/models/ contains the core VAE and diffusion architectures; opensora/dataset/ handles T2V dataset loading (t2v_datasets.py, inpaint_dataset.py); opensora/adaptor/ wraps distributed training with stage-based pipelines; opensora/acceleration/ provides communication and parallel-state abstractions. examples/ and docs/ contain runnable inference scripts (rec_video.py, rec_image.py) and versioned architecture reports.
👥Who it's for
ML researchers and engineers building video generation systems, particularly those at research institutions (PKU-backed) or working with Huawei Ascend hardware; contributors aiming to reproduce or improve upon Sora's architecture; teams needing scalable T2V model training infrastructure.
🌱Maturity & risk
Actively developed with v1.5 released and multiple versioned reports (v1.0–v1.5.0) indicating ongoing iteration; strong community engagement (Discord, WeChat, Twitter presence) and published ArXiv papers backing the work. However, this is a research reproduction project, not an official implementation—expect API changes and Ascend-specific optimizations that may not transfer cleanly to CUDA environments.
Heavy Ascend NPU specialization (v1.5 explicitly 'pure Ascend') may limit portability; distributed training code (opensora/acceleration/) requires careful parallel state management; VAE and diffusion model dependencies are custom-built (opensora/models/causalvideovae/), increasing surface area for bugs. Single maintainer pattern common in academic repos increases dependency on key contributors.
Active areas of work
V1.5 released with pure Ascend backend implementation; active recruitment for algorithm engineers; rapid iteration indicated by 6+ versioned reports since v1.0.0; migration toward Ascend-centric development while maintaining CUDA support. Recent work appears focused on VAE improvements (docs/VAE.md) and prompt refinement (docs/Prompt_Refiner.md).
🚀Get running
Clone and install: `git clone https://github.com/PKU-YuanGroup/Open-Sora-Plan && cd Open-Sora-Plan && pip install -e .`. For inference, run `python examples/rec_video.py` with a prompt from examples/sora.txt. For training, configure hardware (Ascend or CUDA) and stage selection in opensora/adaptor/ before launching distributed training.
Daily commands:
Inference: `python examples/rec_video.py --prompt 'your prompt here'` (requires a pretrained checkpoint). Training: `torchrun --nproc_per_node=8 train.py --config <config.yaml>` (exact command in CI: see .github/workflows/docker_build.yml). Environment setup: `pip install torch torchvision torchaudio` plus the Ascend or CUDA toolkit.
🗺️Map of the codebase
- opensora/models/diffusion/__init__.py — Entry point for the diffusion model pipeline that forms the core of the Sora reproduction; all text-to-video inference flows through here
- opensora/models/causalvideovae/model/vae/modeling_causalvae.py — Core VAE encoder/decoder implementation; essential for understanding video compression and reconstruction, which is foundational to the entire pipeline
- opensora/adaptor/engine.py — Training and inference engine orchestrator; coordinates distributed training, acceleration, and model execution across the framework
- opensora/dataset/t2v_datasets.py — Dataset loading and preprocessing for text-to-video training; defines how training data flows through the pipeline
- opensora/acceleration/parallel_states.py — Manages distributed training state and parallelization strategies; critical for multi-GPU/multi-node scaling on Huawei Ascend
- opensora/models/causalvideovae/model/trainer_videobase.py — VAE training loop and optimization logic; demonstrates the training patterns used throughout the codebase
- opensora/adaptor/modules.py — Custom module wrappers and neural network building blocks; defines reusable components for the model architecture
🛠️How to make changes
Add a new video diffusion model architecture
- Create a new model class in opensora/models/diffusion/ inheriting from the base diffusion model (opensora/models/diffusion/__init__.py)
- Register the model in the model registry so it can be loaded by name (opensora/models/causalvideovae/model/registry.py) — the pattern is sketched below
- Add model-specific configuration in a config file; see the existing configs referenced in docs/ (docs/Report-v1.5.0.md)
- Update adaptor/engine.py to handle the new model's initialization if special training logic is needed (opensora/adaptor/engine.py)
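A minimal sketch of that register-by-name pattern, assuming a decorator-based registry. `MODEL_REGISTRY`, `register_model`, and `MyVideoDiT` are illustrative names, not the repo's actual API — check opensora/models/causalvideovae/model/registry.py for the real interface.

```python
import torch
import torch.nn as nn

MODEL_REGISTRY: dict[str, type] = {}

def register_model(name: str):
    """Class decorator mapping a config string to a model class."""
    def wrap(cls):
        MODEL_REGISTRY[name] = cls
        return cls
    return wrap

@register_model("my_video_dit")
class MyVideoDiT(nn.Module):
    """Stand-in for a new diffusion backbone (illustrative, not repo code)."""
    def __init__(self, in_channels: int = 4, hidden: int = 256):
        super().__init__()
        self.proj = nn.Linear(in_channels, hidden)

    def forward(self, latents: torch.Tensor) -> torch.Tensor:
        # latents: (batch, frames, tokens, channels) in VAE latent space
        return self.proj(latents)

def build_model(name: str, **kwargs) -> nn.Module:
    """What the engine would call to instantiate a model from config."""
    return MODEL_REGISTRY[name](**kwargs)
```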
Add support for a new dataset format
- Create a new dataset class in opensora/dataset/ inheriting from an existing dataset base (opensora/dataset/t2v_datasets.py) — a skeleton is sketched below
- Implement video loading and text/metadata parsing for your format (opensora/models/causalvideovae/dataset/video_dataset.py)
- Add a transform pipeline if custom augmentation is needed (opensora/dataset/transform.py)
- Register the dataset in the config and test it with the evaluation pipeline (opensora/models/causalvideovae/eval/eval.py)
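A skeleton for step one, assuming the standard torch.utils.data.Dataset protocol the existing loaders follow. The manifest format, field names, and class name are hypothetical — mirror opensora/dataset/t2v_datasets.py for the real conventions.

```python
import json
from pathlib import Path

import torch
from torch.utils.data import Dataset

class MyFormatDataset(Dataset):
    """Hypothetical loader for a JSON-lines manifest:
    one {"video": "clip.mp4", "caption": "..."} object per line."""

    def __init__(self, manifest_path: str, transform=None):
        lines = Path(manifest_path).read_text().splitlines()
        self.items = [json.loads(line) for line in lines if line.strip()]
        self.transform = transform

    def __len__(self) -> int:
        return len(self.items)

    def __getitem__(self, idx: int) -> dict:
        item = self.items[idx]
        video = self._load_video(item["video"])  # (frames, C, H, W) float tensor
        if self.transform is not None:
            video = self.transform(video)
        return {"video": video, "text": item["caption"]}

    def _load_video(self, path: str) -> torch.Tensor:
        # Placeholder decode — the repo has its own video readers to reuse here.
        return torch.zeros(16, 3, 256, 256)
```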
Add a new neural network module or attention variant
- Create the module class in opensora/models/causalvideovae/model/modules/ (opensora/models/causalvideovae/model/modules/attention.py)
- Implement the forward pass with proper shape handling for (batch, frames, height, width, channels) (opensora/models/causalvideovae/model/modules/block.py) — see the shape-handling sketch below
- Add the module to __init__.py for easy importing (opensora/models/causalvideovae/model/modules/__init__.py)
- Integrate it into the model architecture by modifying modeling_videobase.py or modeling_causalvae.py (opensora/models/causalvideovae/model/modeling_videobase.py)
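A sketch of the shape discipline step two describes: fold the spatial axes of a (batch, frames, height, width, channels) tensor into the batch so attention runs over frames, then restore the layout. The class name and its placement are illustrative, not the repo's module API.

```python
import torch
import torch.nn as nn

class TemporalSelfAttention(nn.Module):
    """Attends across frames independently at each spatial location."""

    def __init__(self, channels: int, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, h, w, c = x.shape
        # Fold space into batch so attention runs over the frame axis only.
        x = x.permute(0, 2, 3, 1, 4).reshape(b * h * w, t, c)
        x, _ = self.attn(x, x, x)
        # Restore (batch, frames, height, width, channels).
        return x.reshape(b, h, w, t, c).permute(0, 3, 1, 2, 4)
```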
Integrate a new evaluation metric
- Create the metric calculation function in opensora/models/causalvideovae/eval/ (opensora/models/causalvideovae/eval/cal_fvd.py) — a PSNR-style example is sketched below
- Add the metric to the main evaluation pipeline (opensora/models/causalvideovae/eval/eval.py)
- Create a shell script wrapper in eval/script/ for easy CLI access (opensora/models/causalvideovae/eval/script/cal_fvd.sh)
- Update the trainer to log the metric during validation (opensora/models/causalvideovae/model/trainer_videobase.py)
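For illustration, a self-contained metric in the style of the existing cal_*.py scripts. The function name and batch layout are assumptions, not the repo's actual signatures.

```python
import torch

def calculate_psnr(pred: torch.Tensor, target: torch.Tensor, max_val: float = 1.0) -> float:
    """PSNR in dB over videos shaped (B, T, C, H, W) with values in [0, max_val]."""
    mse = torch.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")
    return (10 * torch.log10(max_val ** 2 / mse)).item()
```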
🪤Traps & gotchas
- Ascend-specific code paths in v1.5 (opensora/adaptor/) may fail silently on pure CUDA setups — check hardware detection before training.
- The BFloat16 optimizer (opensora/adaptor/bf16_optimizer.py) requires a matching dtype throughout the pipeline, or gradients explode; a defensive dtype check is sketched below.
- The virtual disk abstraction (opensora/dataset/virtual_disk.py) may bottleneck I/O on non-local storage.
- Multi-stage training checkpoint loading assumes exact stage ordering (stage_1_and_2.py); reordering stages breaks resumption.
- No unit tests are present; integration tests rely on full training runs.
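A small defensive check for the BF16 trap — a sketch under the assumption that every trainable parameter should share one dtype; it is not code from the repo.

```python
import torch

def assert_uniform_dtype(model: torch.nn.Module,
                         expected: torch.dtype = torch.bfloat16) -> None:
    """Raise before training if any parameter drifted from the expected dtype."""
    bad = {name: p.dtype for name, p in model.named_parameters() if p.dtype != expected}
    if bad:
        raise TypeError(f"dtype drift — expected {expected}, found: {bad}")
```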
🏗️Architecture
💡Concepts to learn
- Causal Video VAE — Open-Sora-Plan's core compression mechanism; reduces video frames to latent tokens for efficient diffusion training, as explained in docs/VAE.md
- Latent Diffusion Models — Sora's generative mechanism; Open-Sora-Plan trains diffusion in the VAE latent space rather than pixel space for scalability (a minimal training-step sketch follows this list)
- Multi-stage Training — Open-Sora-Plan trains VAE then diffusion sequentially (opensora/adaptor/stage_1_and_2.py); understanding stage dependencies is critical for checkpoint management
- Tensor Parallelism & Pipeline Parallelism — opensora/acceleration/parallel_states.py implements these to scale across Ascend NPUs; required reading for distributed training contributions
- BFloat16 Mixed Precision — opensora/adaptor/bf16_optimizer.py uses reduced-precision gradients for memory efficiency; understanding overflow/underflow risks is essential for training stability
- Text-Video Dataset Sampling — opensora/dataset/t2v_datasets.py handles frame dropping, resolution bucketing, and text-video alignment—key to reproducible training
- Inpainting & Conditional Generation — opensora/dataset/inpaint_dataset.py enables masked region generation; required for Sora's variable-size and edit-capable outputs
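To make the encode-then-denoise flow concrete, here is a minimal latent-diffusion training step assuming DDPM-style noising. `vae`, `diffusion`, and the tensor layout are stand-ins, not Open-Sora-Plan's actual interfaces.

```python
import torch
import torch.nn.functional as F

def latent_diffusion_step(vae, diffusion, video, text_emb, alphas_cumprod):
    """One training step: compress with the VAE, add noise in latent space,
    and train the diffusion model to predict that noise.
    alphas_cumprod: 1-D tensor of cumulative noise-schedule values."""
    with torch.no_grad():
        latents = vae.encode(video)                      # compressed video tokens
    t = torch.randint(0, len(alphas_cumprod), (latents.shape[0],), device=latents.device)
    a = alphas_cumprod[t].view(-1, *([1] * (latents.dim() - 1)))
    noise = torch.randn_like(latents)
    noisy = a.sqrt() * latents + (1 - a).sqrt() * noise  # DDPM forward process
    pred = diffusion(noisy, t, text_emb)                 # predict the injected noise
    return F.mse_loss(pred, noise)
```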
🔗Related repos
- openai/shap-e — OpenAI's 3D generative model; shares diffusion and VAE training patterns with the Sora reproduction
- lucidrains/imagen-pytorch — PyTorch implementation of Imagen (Google's T2I model); demonstrates a diffusion-based generation pipeline similar to Sora's architecture
- CompVis/stable-diffusion — Foundational latent diffusion model; Open-Sora-Plan reuses VAE and diffusion-scheduler concepts from this work
- PKU-YuanGroup/Helios — Companion project from the same lab (linked in badges); extends Open-Sora-Plan with additional improvements and ablations
- AlibabaResearch/Open-Sora — Alternative open-source Sora reproduction with different architecture choices; useful for comparing training approaches
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add comprehensive unit tests for VAE model components
The repo has extensive VAE implementation in opensora/models/causalvideovae/model/ but no visible test files. Given the complexity of video VAE (encoder/decoder, losses including discriminator and LPIPS), adding unit tests would catch regressions early and help contributors validate their changes. This is critical for a reproduction project where model correctness is paramount.
- [ ] Create tests/models/causalvideovae/ directory structure
- [ ] Add unit tests for opensora/models/causalvideovae/model/losses/discriminator.py covering forward pass and loss computation
- [ ] Add unit tests for opensora/models/causalvideovae/model/losses/lpips.py and perceptual_loss.py
- [ ] Add integration tests for encoder/decoder in opensora/models/causalvideovae/model/ validating shape transformations
- [ ] Create tests/models/causalvideovae/test_ema_model.py for EMA model validation
- [ ] Add tests/conftest.py with fixtures for dummy video tensors and model initialization (a starter fixture is sketched below)
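A possible starting point for that conftest.py plus one shape round-trip test. The import path and constructor are guesses and will likely need adjusting to the real class.

```python
import pytest
import torch

@pytest.fixture
def dummy_video():
    # (batch, channels, frames, height, width) — kept tiny so tests run on CPU
    return torch.rand(1, 3, 9, 32, 32)

def test_causalvae_roundtrip_preserves_shape(dummy_video):
    # Hypothetical import — verify the real module/class name before use.
    mod = pytest.importorskip(
        "opensora.models.causalvideovae.model.vae.modeling_causalvae"
    )
    vae = mod.CausalVAEModel()  # default config assumed; may need arguments
    recon = vae.decode(vae.encode(dummy_video))
    assert recon.shape == dummy_video.shape
```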
Add GitHub Actions workflow for dataset validation and data loading tests
The repo has multiple dataset classes (opensora/dataset/t2v_datasets.py, opensora/models/causalvideovae/dataset/video_dataset.py, opensora/dataset/inpaint_dataset.py) but no CI validation that these can load data correctly. A workflow checking dataset initialization, transform pipelines, and batch loading would prevent contributors from breaking dataset functionality.
- [ ] Create .github/workflows/dataset_validation.yml (extending the existing docker_build.yml pattern)
- [ ] Add test_dataset_initialization.py validating opensora/dataset/transform.py transformations work correctly
- [ ] Add integration tests for opensora/dataset/virtual_disk.py to ensure virtual disk mounting works
- [ ] Add tests for opensora/models/causalvideovae/dataset/transform.py with mock video data (see the sketch below)
- [ ] Configure workflow to run on PR with pytest and coverage reporting for dataset modules only
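One way such a mock-data test could look. The transform module path and callable name are assumptions guarded by skips, so the sketch degrades gracefully if the names differ.

```python
import pytest
import torch

def test_transform_pipeline_on_mock_video():
    tf = pytest.importorskip("opensora.dataset.transform")
    # Callable name is a guess — point it at a real transform in the module.
    pipeline_cls = getattr(tf, "CenterCropResizeVideo", None)
    if pipeline_cls is None:
        pytest.skip("transform name differs — update to the real callable")
    mock_clip = torch.rand(16, 3, 128, 128)  # (frames, C, H, W) synthetic video
    out = pipeline_cls(64)(mock_clip)
    assert out.shape[0] == 16                # frame count preserved
    assert out.shape[-2:] == (64, 64)        # spatial size after crop/resize
```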
Add missing evaluation utility tests and refactor eval scripts into reusable modules
The opensora/models/causalvideovae/eval/ directory has multiple evaluation scripts (cal_fvd.py, cal_lpips.py, cal_psnr.py, cal_ssim.py) and shell scripts (eval/script/*.sh) but these appear as standalone scripts without tests. Refactoring these into importable modules with unit tests would allow reproducible evaluation and prevent metric computation bugs.
- [ ] Create opensora/models/causalvideovae/eval/metrics.py that consolidates metric calculation logic from cal_*.py files
- [ ] Refactor each eval/cal_*.py to import from metrics.py and add main entry point
- [ ] Create tests/models/causalvideovae/eval/test_metrics.py with unit tests for PSNR, SSIM, LPIPS against known ground-truth values (example below)
- [ ] Add tests for FVD calculation with mock I3D features in tests/models/causalvideovae/eval/test_fvd.py
- [ ] Update eval/script/*.sh to call the refactored Python modules instead of standalone scripts
- [ ] Document metric usage in docs/ with expected input shapes and output ranges
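A hedged example of those ground-truth tests, assuming the consolidated metrics.py exposes a `calculate_psnr(pred, target, max_val=1.0)` like the one sketched earlier. Both the module path and signature follow this PR idea, not existing repo code.

```python
import torch
from opensora.models.causalvideovae.eval.metrics import calculate_psnr  # created by this PR

def test_psnr_identical_videos_is_infinite():
    video = torch.rand(2, 8, 3, 64, 64)
    assert calculate_psnr(video, video) == float("inf")

def test_psnr_known_value():
    a = torch.zeros(1, 1, 1, 2, 2)
    b = torch.full((1, 1, 1, 2, 2), 0.5)  # MSE = 0.25 → PSNR = 10·log10(1/0.25) ≈ 6.02 dB
    assert abs(calculate_psnr(a, b) - 6.0206) < 1e-3
```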
🌿Good first issues
- Add pytest test suite for opensora/dataset/transform.py covering edge cases (frame rate mismatch, resolution scaling, audio sync)—currently zero test coverage for data pipeline transforms
- Implement a CUDA-specific fast path for opensora/acceleration/communications.py to reduce the Ascend-only performance gap and improve cross-platform support — currently hardcoded for Ascend collectives (a dispatch sketch follows this list)
- Document the exact checkpoint format and migration path in docs/ when upgrading from v1.4 to v1.5 (Ascend refactor)—currently missing, breaks user reproducibility
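A sketch of the hardware-dispatch idea in the second item, assuming the standard torch.distributed backends (NCCL on CUDA, HCCL via the torch_npu plugin on Ascend); this is not the repo's current initialization code.

```python
import torch
import torch.distributed as dist

def init_process_group_auto(**kwargs) -> None:
    """Pick the collective backend at runtime instead of hardcoding one."""
    backend = "nccl" if torch.cuda.is_available() else "hccl"  # hccl is registered by torch_npu
    dist.init_process_group(backend=backend, **kwargs)
```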
⭐Top contributors
- @LinB203 — 60 commits
- @yunyangge — 30 commits
- @qqingzheng — 3 commits
- @Purshow — 2 commits
- @SHYuanBest — 1 commit
📝Recent commits
- f7fa604 — Introducing Helios (SHYuanBest)
- 1c8209e — Update Report-v1.5.0.md (yunyangge)
- bbfa51c — Update Report-v1.5.0_cn.md (yunyangge)
- afb9416 — Update LICENSE (LinB203)
- a66a636 — Update LICENSE (LinB203)
- 278a1c4 — fix vbench results (yunyangge)
- d2973c2 — Update Report-v1.5.0.md (LinB203)
- 617c4bf — Update Report-v1.5.0_cn.md (yunyangge)
- bad1e21 — Update Report-v1.5.0.md (yunyangge)
- ff9e079 — Update Report-v1.5.0_cn.md (yunyangge)
🔒Security observations
Open-Sora-Plan shows a moderate security posture with notable gaps. Primary concerns: (1) no dependency manifest is provided for vulnerability scanning — a critical gap for a Python project with numerous ML dependencies; (2) file I/O in the virtual_disk and dataset modules lacks visible input validation; (3) external data is loaded without apparent sanitization; (4) the Docker build workflow needs hardening. Strengths include proper licensing (MIT), use of GitHub Actions, and structured codebase organization. Immediate actions: provide and audit a dependency manifest, implement strict input validation in data-loading pipelines, and document security practices for model and data sourcing.
- Medium · Missing dependencies file for vulnerability audit — root directory. No package requirements file (requirements.txt, setup.py, pyproject.toml, poetry.lock, or Pipfile) was provided for analysis, preventing identification of known vulnerable dependencies. Fix: provide and maintain dependency files (requirements.txt or pyproject.toml) and scan them regularly with tools like pip-audit, safety, or Dependabot.
- Medium · Potential unsafe file operations in virtual disk module — opensora/dataset/virtual_disk.py. The file suggests I/O operations that may lack input validation, path-traversal protections, or permission checks; without code review, this could be vulnerable to directory traversal or unauthorized file access. Fix: validate file paths strictly, resolve them to absolute paths, and verify they remain within allowed directories (see the sketch after this list); apply least privilege for file operations.
- Medium · Dataset loading from external sources — opensora/dataset/ (t2v_datasets.py, video_dataset.py, inpaint_dataset.py). These modules load data that could originate from user input or external sources; without validation, this could enable malicious data injection or deserialization vulnerabilities. Fix: enforce schema validation for all loaded datasets, use safe deserialization, validate image/video file formats, and sanitize metadata before processing.
- Low · Docker build workflow present — .github/workflows/docker_build.yml. The workflow exists but its content was not provided; Docker builds can introduce supply-chain risk if base images are unpinned or secrets leak into layers. Fix: pin base image versions (avoid "latest"), use multi-stage builds, scan images with Trivy or Snyk, and never commit secrets in Dockerfiles or docker-compose files.
- Low · Model weights and training data security — opensora/models/. The project loads and trains large model files (Causal Video VAE and diffusion models) that could be vulnerable to model poisoning or supply-chain attacks if downloaded from untrusted sources. Fix: verify checksums or signatures for downloaded weights, document model sources, download over HTTPS, and consider signed releases.
- Low · No visible input sanitization in prompt processing — docs/Prompt_Refiner.md and examples/cond_prompt.txt. These files suggest prompt/text input handling; without visible sanitization code, there is potential for prompt injection if user inputs are not validated. Fix: sanitize and validate all user-provided prompts; use allowlists for special characters where applicable, and prefer parameterized or template-based approaches.
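A possible shape for the path-validation fix suggested for virtual_disk.py — a sketch, not existing repo code; the function name and data-root convention are assumptions.

```python
from pathlib import Path

def safe_resolve(data_root: str, user_path: str) -> Path:
    """Resolve user_path under data_root, refusing traversal outside it."""
    base = Path(data_root).resolve()
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):  # Path.is_relative_to: Python 3.9+
        raise PermissionError(f"path escapes data root: {user_path!r}")
    return candidate
```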
LLM-derived; treat as a starting point, not a security audit.
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.