Transformers v5 Notes
This section documents VeOmni’s integration with HuggingFace
transformers==5.2.0 (the only supported transformers version).
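Because only one transformers version is supported, it can be useful to check the installed version before debugging anything else. The following is a minimal sketch using only the Python standard library; the constant and function names are hypothetical and are not part of VeOmni.

```python
from importlib import metadata

# Assumption: the pin is taken from the sentence above; this constant name is
# hypothetical and does not come from the VeOmni codebase.
SUPPORTED_TRANSFORMERS_VERSION = "5.2.0"


def is_supported(installed_version: str) -> bool:
    """Return True only when the installed version matches the pin exactly."""
    return installed_version.strip() == SUPPORTED_TRANSFORMERS_VERSION


def check_transformers_version() -> str:
    """Return a short status message about the installed transformers package."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return "transformers is not installed"
    if is_supported(installed):
        return f"transformers=={installed} is supported"
    return (
        f"transformers=={installed} found, "
        f"expected =={SUPPORTED_TRANSFORMERS_VERSION}"
    )


if __name__ == "__main__":
    print(check_transformers_version())
```

The comparison is an exact string match rather than a range check, which mirrors the single-version pin stated above.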
Included Notes
Flash Attention custom-name handling: explains why _lazy_imports fails for VeOmni custom attention names and how the local hub-kernel loader adapter resolves it.
Patchgen workflow: explains the modeling code generation workflow used for every supported model and how to regenerate that code.
MoE weight loading: explains how VeOmni expects MoE expert weights to be laid out and documents qwen3_moe handling.
Testing a new model: SOP for adding test cases in test_models_patch.py and test_e2e_parallel.py when onboarding a new model.