Qwen3 Omni MoE training guide#
Download multisource dataset#
tulu-3-sft-mixture#
Download the tulu-3-sft-mixture dataset.
The directory structure should be like this:
VeOmni
├── tulu-3-sft-mixture/
│   └── data/
│       ├── train-00000-of-00006.parquet
│       ├── train-00001-of-00006.parquet
│       ├── train-00002-of-00006.parquet
│       ├── train-00003-of-00006.parquet
│       ├── train-00004-of-00006.parquet
│       └── train-00005-of-00006.parquet
└── ...(code files)
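Before moving on, it can help to confirm that all six shards arrived. The following is a minimal sketch, not part of VeOmni: the helper name `missing_shards` is hypothetical, and it assumes only the shard naming pattern shown in the tree above.

```python
from pathlib import Path


def missing_shards(data_dir, total=6):
    """Return the expected shard file names that are absent from data_dir.

    Hypothetical helper: assumes shards are named
    train-XXXXX-of-YYYYY.parquet, as in the directory listing.
    """
    data_dir = Path(data_dir)
    expected = [f"train-{i:05d}-of-{total:05d}.parquet" for i in range(total)]
    return [name for name in expected if not (data_dir / name).is_file()]
```

Calling `missing_shards("tulu-3-sft-mixture/data")` from the VeOmni root should return an empty list once the download is complete.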
LLaVA-Video-178K#
Download the LLaVA-Video-178K dataset. Extract all tar.gz files to the root directory of VeOmni.
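The extraction step can be scripted. This is a rough sketch using only the standard library; the function name `extract_archives` and the idea that the archives sit together under one download directory are assumptions, not something the dataset or VeOmni prescribes.

```python
import tarfile
from pathlib import Path


def extract_archives(archive_dir, dest_dir):
    """Extract every *.tar.gz found under archive_dir into dest_dir.

    Hypothetical helper: archive_dir is wherever the downloaded
    archives were saved; dest_dir would be the VeOmni root.
    """
    archive_dir, dest_dir = Path(archive_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(archive_dir.rglob("*.tar.gz")):
        with tarfile.open(archive, "r:gz") as tar:
            if hasattr(tarfile, "data_filter"):
                # Newer Python versions support safer extraction filters.
                tar.extractall(dest_dir, filter="data")
            else:
                tar.extractall(dest_dir)
        extracted.append(archive.name)
    return extracted
```

For example, `extract_archives("downloads/LLaVA-Video-178K", ".")` run from the VeOmni root would unpack everything in place, where `downloads/LLaVA-Video-178K` is a placeholder for your own download location.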
Convert 0_30_s_academic_mc_v0_1_qa_processed.json into video.json by moving each entry's video field into a one-element videos list:
import json

# Load the original LLaVA-Video-178K annotation file.
with open('0_30_s_academic_mc_v0_1_qa_processed.json', 'r', encoding='utf-8') as f:
    data = json.load(f)

# Move each entry's single 'video' path into a one-element 'videos' list.
new_data = []
for item in data:
    new_item = item.copy()
    video_path = new_item.pop('video')
    new_item['videos'] = [video_path]
    new_data.append(new_item)

# Write the converted annotations to video.json.
with open('video.json', 'w', encoding='utf-8') as f:
    json.dump(new_data, f, ensure_ascii=False, indent=4)
The directory structure should be like this:
VeOmni
├── 0_30_s_academic_mc_v0_1_qa_processed.json
├── video.json
├── academic_source/
│   ├── activitynet/
│   │   └── ...
│   ├── Charades/
│   │   └── ...
│   ├── ego4d/
│   │   └── ...
│   ├── NextQA/
│   │   └── ...
│   └── youcook2/
│       └── ...
└── ...(code files)
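A quick consistency check between video.json and the extracted files can catch a misplaced archive early. The sketch below is illustrative only: `missing_videos` is a hypothetical helper, and it assumes the paths stored under `videos` are relative to the VeOmni root, which the layout above suggests but does not guarantee.

```python
import json
from pathlib import Path


def missing_videos(json_path, root="."):
    """Return the video paths listed in json_path that do not exist under root.

    Assumption: each entry has a 'videos' list of paths relative to root.
    """
    root = Path(root)
    with open(json_path, "r", encoding="utf-8") as f:
        entries = json.load(f)
    missing = []
    for entry in entries:
        for rel_path in entry.get("videos", []):
            if not (root / rel_path).is_file():
                missing.append(rel_path)
    return missing
```

Running `missing_videos("video.json")` from the VeOmni root returns the paths that still need attention; an empty list means every referenced file was found.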
Modify multisource YAML#
Modify configs/multimodal/data/tulu_sharegpt4v_llavavideo.yaml:
sources:
  - sharegpt4v_instruct_gpt4-vision_cap100k_coco.json
  - tulu-3-sft-mixture/data
  - video.json
names:
  - sharegpt4v_captioner_sft
  - tulu-3-sft-mixture
  - LLaVA-Video-178K
schedule:
  - schedule_type: const
    weights: [0.4, 0.2, 0.4]
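The three lists in this file are positional: the i-th source, name, and weight belong together. The sketch below checks that alignment on an already-parsed config. It is not VeOmni code: `check_mixture` is a hypothetical helper, and reading the weights as relative sampling shares is an assumption about how the `const` schedule is interpreted.

```python
def check_mixture(config):
    """Validate list lengths and return each source's normalized share.

    Assumptions: config is the parsed YAML as a dict, and the first
    schedule entry holds one non-negative weight per source.
    """
    sources = config["sources"]
    names = config["names"]
    if len(sources) != len(names):
        raise ValueError("sources and names must have the same length")
    weights = config["schedule"][0]["weights"]
    if len(weights) != len(sources):
        raise ValueError("weights must have one entry per source")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    total = sum(weights)
    # Pair each name with its share of the total weight.
    return {name: w / total for name, w in zip(names, weights)}
```

With the values above, the result assigns roughly 40% to `sharegpt4v_captioner_sft`, 20% to `tulu-3-sft-mixture`, and 40% to `LLaVA-Video-178K`.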
Download Qwen3-Omni-MoE model#
python3 scripts/download_hf_model.py \
    --repo_id Qwen/Qwen3-Omni-30B-A3B-Instruct \
    --local_dir .
Merge MoE#
python3 scripts/moe_ckpt_merge/moe_merge.py \
    --raw_hf_path Qwen3-Omni-30B-A3B-Instruct \
    --merge_hf_path Qwen3-Omni-30B-A3B-Instruct-merge
Start training on GPU#
bash train.sh tasks/train_vlm.py configs/multimodal/qwen3_omni/qwen3_omni.yaml \
    --model.model_path Qwen3-Omni-30B-A3B-Instruct-merge