Get Started

Installation with Nvidia GPU
Installation with Ascend NPU (x86)
Installation with Ascend NPU (ARM)

Usage

Arguments API Reference
Basic Modules
Multimodal Data Processing
Usage of dyn_bsz
Support New Models — Guide and Checklist
Qwen3-VL MoE Integration Example
Qwen3-Omni-MoE Integration Example
Support New DiT Models — Guide and Reference
Checkpoint Conversion
Trainer
Agent Workflow Guide

Hardware Support

Get Started with Ascend NPU
Typical Usage: Qwen3-VL 8B Training on Ascend NPU
Ascend Environment Variables
Precision Analysis and Troubleshooting Guide
Model Optimization - Profiling Collection, Analysis and Optimization Ideas
Ascend A2 Docker Image Build and Usage Guide
Ascend A3 Docker Image Build and Usage Guide
FAQ: Common Issues and Solutions for Ascend NPU

Examples

Qwen3 training guide
Qwen3.5 training guide
Qwen3 MoE training guide
Qwen3 VL training guide
Qwen3 Omni MoE training guide
Wan2.1-I2V training guide
Qwen3 DPO training guide

Key Features

Adding a New Model to VeOmni
Custom Preprocessor Registry
EP+FSDP2 for Large-scale MoE Model Training
Long-Sequence Training Using Ulysses
LoRA Fine-Tuning

Design

Kernel Selection in VeOmni

Transformers v5 Updates

Transformers v5 Notes

Index

By Qianli Ma, Yaowei Zheng, Zhongkai Zhao, Bin Jia, Ziyue Huang, Zhelun Shi, Zhi Zhang

© Copyright 2025 ByteDance Seed Foundation MLSys Team.