Testing a New Model#
When adding a new model under veomni/models/transformers/<model>/, two test
suites need updating:
- tests/models/test_models_patch.py: single-GPU forward/backward correctness across attention and MoE backends.
- tests/e2e/test_e2e_parallel.py: multi-GPU e2e training with FSDP2, sequence parallelism (SP), and expert parallelism (EP).
For VLM models, there is also a lightweight trainer-level smoke test for
freeze_vit:
- tests/models/test_vlm_trainer.py: builds a real toy VLM model on CPU and checks that vision parameters stay trainable when freeze_vit=False and are frozen when freeze_vit=True.
For VLM / Omni models, an additional multi-GPU test covers the FSDP2 asymmetric-forward path:
- tests/distributed/test_dummy_forward.py: verifies that rank-asymmetric multimodal batches (rank 0 has images/video/audio, others are text-only) do not hang NCCL under FSDP2, exercising the model's dummy_forward() hook.
1. tests/models/test_models_patch.py#
What it tests#
Runs one forward + backward step on dummy data for every combination of:
- HF attention backends (eager, flash_attention_2, flash_attention_3)
- VeOmni attention backends (veomni_flash_attention_2_with_sp, veomni_flash_attention_3_with_sp)
- MoE backends (for MoE models: eager, fused)
Then asserts that loss and grad norm match across all combinations within
(rtol, atol).
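The cross-backend check can be pictured with the sketch below. It is an illustration under stated assumptions, not the suite's actual code: the `results` layout and the helper `assert_all_close` are hypothetical, the numbers are made up, and `math.isclose` stands in for whatever tolerance comparison the test really uses (its formula differs slightly from `torch.allclose`).

```python
import math
from itertools import combinations


def assert_all_close(results, rtol, atol):
    """Check that every pair of modes agrees on loss and grad norm.

    `results` maps a mode name to a (loss, grad_norm) tuple; this layout is
    an assumption made for the example.
    """
    for (name_a, a), (name_b, b) in combinations(results.items(), 2):
        for label, x, y in zip(("loss", "grad_norm"), a, b):
            if not math.isclose(x, y, rel_tol=rtol, abs_tol=atol):
                raise AssertionError(f"{label} mismatch: {name_a}={x} vs {name_b}={y}")


# Made-up numbers for illustration only.
results = {
    "eager": (2.3101, 0.8712),
    "flash_attention_2": (2.3102, 0.8713),
    "veomni_flash_attention_2_with_sp": (2.3100, 0.8711),
}
assert_all_close(results, rtol=1e-2, atol=1e-2)
```

Comparing every pair, rather than each mode against a single reference, means a failure message names the two backends that actually disagree.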
How to add a case#
Append a pytest.param(...) to the test-cases list:
pytest.param(
    "./tests/toy_config/<new_model>_toy/config.json",
    False,  # is_moe: set True for MoE models
    _DEFAULT_RTOL,
    _DEFAULT_ATOL,
    id="<new_model>",
),
The id= string is used as a key for:
- Test node naming (pytest -k <id>)
- Looking up custom weight sync functions in weight_sync_adapters.py (only needed if the model has non-standard state dict keys)
Filtering unsupported modes#
If the model doesn’t support certain attention backends yet, add a filter
block in test_models_patch_fwd_bwd keyed on case_id:
if case_id == "<new_model>":
    hf_model_modes = [mode for mode in hf_model_modes if mode.attn_implementation != "flash_attention_3"]
    veomni_model_modes = [
        mode for mode in veomni_model_modes if mode.attn_implementation != "veomni_flash_attention_3_with_sp"
    ]
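To try the filtering pattern in isolation, here is a small runnable sketch. The `ModelMode` dataclass and the `drop_backends` helper are hypothetical stand-ins; only the `attn_implementation` attribute and the backend names come from the snippet above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelMode:
    # Hypothetical stand-in for the suite's mode object; only the
    # attn_implementation attribute is taken from the snippet above.
    attn_implementation: str


def drop_backends(modes, unsupported):
    """Return the modes whose attention backend is not in `unsupported`."""
    return [m for m in modes if m.attn_implementation not in unsupported]


hf_model_modes = [
    ModelMode("eager"),
    ModelMode("flash_attention_2"),
    ModelMode("flash_attention_3"),
]
hf_model_modes = drop_backends(hf_model_modes, {"flash_attention_3"})
print([m.attn_implementation for m in hf_model_modes])  # ['eager', 'flash_attention_2']
```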
Toy config#
Create a minimal config under tests/toy_config/<new_model>_toy/config.json
with few layers. Add a README.md under the same folder to indicate:
- Where the original config is from
- What changes are made from the original config
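One way to keep the toy config and its README in step is to derive both from the same set of overrides. The sketch below is only an illustration: `make_toy_config` is a hypothetical helper, and the key names and values are examples that depend on the model's actual config class.

```python
import json


def make_toy_config(original, overrides):
    """Return (toy_config, changes) where `changes` lists each edited key."""
    toy = dict(original)
    changes = []
    for key, new_value in overrides.items():
        old_value = original.get(key)
        if old_value != new_value:
            changes.append(f"{key}: {old_value} -> {new_value}")
        toy[key] = new_value
    return toy, changes


# Example values only; real key names depend on the model's config class.
original = {"num_hidden_layers": 32, "hidden_size": 4096, "vocab_size": 151936}
toy, changes = make_toy_config(original, {"num_hidden_layers": 2, "hidden_size": 128})

print(json.dumps(toy, indent=2))  # contents for config.json
print("\n".join(changes))         # lines to paste into README.md
```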
2. tests/e2e/test_e2e_parallel.py#
What it tests#
Launches full torchrun training runs (2 epochs, 2 steps) across parallel
configurations (FSDP2 always enabled):
| Parameter | Values |
|---|---|
| SP size (sp_size) | 1, 2 |
| EP size | 1 (base models), 1×2 (MoE models) |
Each run produces a log_dict.json. The test asserts that loss and grad norm
match across all SP/EP configurations within (rtol, atol).
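The comparison across runs can be sketched as below. The structure of log_dict.json is not documented here, so the `loss` and `grad_norm` keys (each a per-step list), the file names, and the helpers `load_metrics` and `runs_match` are all assumptions made for the example; the numbers are made up.

```python
import json
import math
import tempfile
from pathlib import Path


def load_metrics(path):
    """Read one run's log file. The keys used here are assumed, not confirmed."""
    data = json.loads(Path(path).read_text())
    return data["loss"], data["grad_norm"]


def runs_match(paths, rtol, atol):
    """Compare every run against the first one, step by step."""
    ref_loss, ref_gnorm = load_metrics(paths[0])
    for path in paths[1:]:
        loss, gnorm = load_metrics(path)
        if len(loss) != len(ref_loss) or len(gnorm) != len(ref_gnorm):
            return False
        pairs = list(zip(ref_loss, loss)) + list(zip(ref_gnorm, gnorm))
        if not all(math.isclose(a, b, rel_tol=rtol, abs_tol=atol) for a, b in pairs):
            return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    # Two fake runs standing in for sp_size=1 and sp_size=2 (made-up numbers).
    runs = {
        "sp1": {"loss": [2.31, 2.05], "grad_norm": [0.87, 0.80]},
        "sp2": {"loss": [2.31, 2.06], "grad_norm": [0.87, 0.80]},
    }
    paths = []
    for name, metrics in runs.items():
        path = Path(tmp) / f"{name}_log_dict.json"
        path.write_text(json.dumps(metrics))
        paths.append(path)
    ok = runs_match(paths, rtol=1e-2, atol=1e-2)

print(ok)
```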
How to add a case#
Add an entry to text_test_cases (for text-only models):
pytest.param(
    "<new_model>",
    "./tests/toy_config/<new_model>_toy/config.json",
    False,  # is_moe
    _DEFAULT_RTOL,
    _DEFAULT_ATOL,
    None,  # max_sp_size
),
Parametrize fields#
The text_test_cases parametrize string is:
"model_name, config_path, is_moe, rtol, atol, max_sp_size"
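| Field | Type | Description |
|---|---|---|
| model_name | str | Used for directory naming and log output |
| config_path | str | Path to toy config directory or config file |
| is_moe | bool | If True, the model is treated as MoE and the EP configurations are also run |
| rtol, atol | float | Tolerances for cross-config comparison |
| max_sp_size | int or None | Upper limit on sp_size; use 1 if SP is not supported, None otherwise |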
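Because the values in each pytest.param are positional, it can help to see how they line up with the parametrize string. In the sketch below, `TextCase` is a hypothetical container used only to label the positions, and the `1e-2` tolerances are placeholders for `_DEFAULT_RTOL` and `_DEFAULT_ATOL`, whose real values are not given here.

```python
from typing import NamedTuple, Optional

PARAM_STRING = "model_name, config_path, is_moe, rtol, atol, max_sp_size"


class TextCase(NamedTuple):
    # Hypothetical container used only to label the positional values.
    model_name: str
    config_path: str
    is_moe: bool
    rtol: float
    atol: float
    max_sp_size: Optional[int]


field_names = [name.strip() for name in PARAM_STRING.split(",")]
assert field_names == list(TextCase._fields)

# Placeholder tolerances; the suite uses _DEFAULT_RTOL / _DEFAULT_ATOL.
case = TextCase(
    "<new_model>",
    "./tests/toy_config/<new_model>_toy/config.json",
    False,
    1e-2,
    1e-2,
    None,
)
print(dict(zip(field_names, case)))
```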
Limiting sequence parallelism#
If the model does not support SP yet, set max_sp_size=1 to only run with
sp_size=1:
pytest.param(
    "qwen3_5",
    "./tests/toy_config/qwen3_5_toy/config.json",
    False,  # is_moe
    _DEFAULT_RTOL,
    _DEFAULT_ATOL,
    1,  # max_sp_size: remove once SP is supported
),
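The effect of the cap can be sketched as a simple filter over the candidate SP sizes. The helper `sp_sizes_to_run` is hypothetical and the real test may select sizes differently; the candidate values 1 and 2 are the ones listed for this suite.

```python
def sp_sizes_to_run(candidates, max_sp_size):
    """Keep only the SP sizes allowed by max_sp_size (None means no cap)."""
    if max_sp_size is None:
        return list(candidates)
    return [size for size in candidates if size <= max_sp_size]


print(sp_sizes_to_run((1, 2), None))  # [1, 2]
print(sp_sizes_to_run((1, 2), 1))     # [1]
```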
VLM / multimodal models#
For vision-language or multimodal models, add to the appropriate test case
list (qwen2vl_test_cases, qwen3vl_test_cases, etc.) and pair with the
matching fixture and test function. The same max_sp_size field is available.
3. tests/models/test_vlm_trainer.py#
What it tests#
Builds a real toy VLM model and calls VLMTrainer._freeze_model_module()
directly. The test only checks one behavior:
- freeze_vit=False -> the vision tower parameters remain trainable
- freeze_vit=True -> the vision tower parameters are frozen
This is intentionally simpler than an e2e training test. It is meant to
catch model wrapper path changes such as model.visual vs model.model.visual.
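The kind of wrapper-path problem this test guards against can be illustrated without any ML dependencies. Everything in the sketch below is a hypothetical stand-in: `Param` mimics a parameter's `requires_grad` flag, the toy objects mimic the two layouts, and `find_vision_tower` / `set_vit_frozen` are not the trainer's real methods.

```python
from types import SimpleNamespace


class Param:
    """Minimal stand-in for a parameter with a requires_grad flag."""

    def __init__(self):
        self.requires_grad = True


def find_vision_tower(model):
    """Return the vision tower from either wrapper layout, or raise."""
    for path in ("visual", "model.visual"):
        obj = model
        for attr in path.split("."):
            obj = getattr(obj, attr, None)
            if obj is None:
                break
        if obj is not None:
            return obj
    raise AttributeError("no vision tower found at model.visual or model.model.visual")


def set_vit_frozen(model, freeze_vit):
    """Freeze or unfreeze every vision parameter of the toy model."""
    for p in find_vision_tower(model).params:
        p.requires_grad = not freeze_vit


flat = SimpleNamespace(visual=SimpleNamespace(params=[Param(), Param()]))
nested = SimpleNamespace(model=SimpleNamespace(visual=SimpleNamespace(params=[Param()])))

for toy in (flat, nested):
    set_vit_frozen(toy, freeze_vit=True)
    assert not any(p.requires_grad for p in find_vision_tower(toy).params)
    set_vit_frozen(toy, freeze_vit=False)
    assert all(p.requires_grad for p in find_vision_tower(toy).params)
```

If a refactor moved the vision tower to a path the freezing code does not know about, the lookup would fail or the flags would stay unchanged, which is the regression a check of this shape surfaces.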
How to add a case#
Add your toy config to the freeze-ViT VLM cases list:
pytest.param("./tests/toy_config/<new_vlm_model>_toy/config.json", id="<new_vlm_model>"),
4. tests/distributed/test_dummy_forward.py (VLM / Omni only)#
What it tests#
Launches a 2-GPU torchrun job under FSDP2. Rank 0 receives a multimodal
batch (images + video, and audio for Omni); other ranks receive text-only.
Both ranks must complete forward + backward without an NCCL timeout, which
proves that the model’s dummy_forward() hook (or equivalent) fires on
text-only ranks so every rank participates in FSDP collectives.
Any model whose patchgen-generated modeling code overrides
<M>VisionTransformerPretrainedModel.dummy_forward (or similar) must add a
case here, because this suite is the only multi-GPU coverage for that override.
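Why the hook matters can be shown with a toy model of the problem. The sketch below is a conceptual simulation only, with no FSDP or NCCL involved: it assumes each sharded module a rank executes implies a collective, so all ranks must execute the same sequence of modules. The module names and both helper functions are illustrative.

```python
def modules_run(has_images, use_dummy_forward):
    """List the sharded modules a rank executes in one forward pass.

    Under FSDP each listed module implies a collective, so all ranks must
    produce the same list. Module names are illustrative.
    """
    ran = []
    if has_images or use_dummy_forward:
        # Real inputs, or a dummy input on text-only ranks.
        ran.append("vision_tower")
    ran.append("language_model")
    return ran


def ranks_agree(batches_have_images, use_dummy_forward):
    """True if every rank would issue the same sequence of collectives."""
    sequences = [modules_run(flag, use_dummy_forward) for flag in batches_have_images]
    return all(seq == sequences[0] for seq in sequences)


# Rank 0 has images, rank 1 is text-only.
print(ranks_agree([True, False], use_dummy_forward=False))  # False: would hang
print(ranks_agree([True, False], use_dummy_forward=True))   # True: sequences align
```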
How to add a case#
Append an entry to _vlm_cases (VLM) or _omni_cases (Omni):
pytest.param(
    "<new_vlm_model>",
    "./tests/toy_config/<new_vlm_model>_toy",
    partial(_vlm_batch, patch_size=<P>),
    id="<new_vlm_model>",
),
patch_size must match the model’s vision patch size (e.g. 14 for
Qwen2.5-VL, 16 for Qwen3-VL). Run with
pytest tests/distributed/test_dummy_forward.py -k <new_vlm_model> on a
2-GPU host.
Checklist#
When adding a new model, verify:
- [ ] Toy config created under tests/toy_config/<model>_toy/
- [ ] Entry added to the test-cases list in test_models_patch.py
- [ ] Unsupported attention/MoE modes filtered in test_models_patch_fwd_bwd if needed
- [ ] Entry added to text_test_cases (or VLM equivalent) in test_e2e_parallel.py
- [ ] For VLM models, toy config added to the freeze-ViT VLM cases list in tests/models/test_vlm_trainer.py
- [ ] max_sp_size set appropriately (use 1 if SP not supported, None otherwise)
- [ ] pytest --collect-only -k <model> shows expected test cases
- [ ] Tests pass: pytest tests/models/test_models_patch.py -k <model> and pytest tests/e2e/test_e2e_parallel.py -k <model>
- [ ] For VLM models, pytest tests/models/test_vlm_trainer.py -k <model> passes
- [ ] For VLM / Omni models, an entry added to _vlm_cases / _omni_cases in tests/distributed/test_dummy_forward.py (required on any model that overrides dummy_forward)
- [ ] pytest tests/distributed/test_dummy_forward.py -k <model> passes on 2 GPUs