Transformers v5 Compatibility Updates#

This section documents VeOmni’s compatibility work for HuggingFace transformers>=5.0.0.

Included Updates#

Flash Attention custom-name handling: explains why _lazy_imports failed for VeOmni custom attention names and how the local hub-kernel loader adapter resolves it.
Qwen3 patchgen workflow: explains the modeling code generation workflow used for Qwen3 GPU patches and regeneration.
Transformers v5 MoE weight loading: explains how VeOmni expects MoE expert weights for v5 and documents qwen3_moe handling.
Testing a new model: SOP for adding test cases in test_models_patch.py and test_e2e_parallel.py when onboarding a new v5 model.