Commit 995a15d
refactor(engine): auto-derive padded-seq layout from model type
The padded (BSHD) vs. packed (THD) forward layout is a hard architectural
property of the model, not a user tunable: GDN/SSM kernels (the Qwen3.5
family) reject packed sequences. Exposing it as the `use_padded_seq`
config field let it be mis-set, risking silent correctness bugs or
crashes. Derive it from `model_type` instead so the layout can never
disagree with the architecture.
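As a rough illustration of the derivation described above, the helper could be a simple membership check. This is only a sketch: the `model_type` strings and the `_PADDED_SEQ_MODEL_TYPES` name are assumptions made for the example, not the values used in engine/core/model.py.

```python
# Hypothetical set of model types whose kernels reject packed (THD) input.
# The real list in engine/core/model.py may differ.
_PADDED_SEQ_MODEL_TYPES = frozenset({"qwen3_5", "qwen3_5_moe"})


def requires_padded_seq(model_type: str) -> bool:
    """Return True when the architecture must use the padded (BSHD) layout."""
    return model_type.lower() in _PADDED_SEQ_MODEL_TYPES
```

An engine would then set its layout flag from the model type at initialization, for example `use_padded_seq = requires_padded_seq(model_type)`, rather than reading it from config.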
Also surface a startup warning when `use_bridge_for_update_weights=True`
but a fallback condition (non-megatron-bridge, FP8/quantized, or LoRA)
silently routes live weight sync through the registry path, so the
effective behavior is visible in logs.
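The warn-once behavior can be sketched as follows. Every name here (`bridge_fallback_reasons`, `warn_if_bridge_falls_back`, the boolean parameters, the logger name) is hypothetical and chosen for illustration; only the three fallback conditions come from the message above.

```python
import logging

logger = logging.getLogger("weight_sync_sketch")  # hypothetical logger name
_warned = False  # module-level guard so the warning is emitted only once


def bridge_fallback_reasons(use_bridge, is_megatron_bridge, is_quantized, uses_lora):
    """List the conditions that force the registry path despite the bridge flag."""
    if not use_bridge:
        return []  # bridge not requested, so nothing is being overridden
    reasons = []
    if not is_megatron_bridge:
        reasons.append("non-megatron-bridge")
    if is_quantized:
        reasons.append("FP8/quantized")
    if uses_lora:
        reasons.append("LoRA")
    return reasons


def warn_if_bridge_falls_back(use_bridge, is_megatron_bridge, is_quantized, uses_lora):
    """Log one startup warning when weight sync will use the registry path."""
    global _warned
    reasons = bridge_fallback_reasons(
        use_bridge, is_megatron_bridge, is_quantized, uses_lora
    )
    if reasons and not _warned:
        logger.warning(
            "use_bridge_for_update_weights=True, but weight sync falls back "
            "to the registry path (%s)",
            ", ".join(reasons),
        )
        _warned = True
    return bool(reasons)
```

Keeping the reason list separate from the logging call makes the fallback decision easy to test without capturing log output.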
Key changes:
- Add `requires_padded_seq(model_type)` helper in engine/core/model.py
- Derive `self.use_padded_seq` from `model_type` in `MegatronEngine.initialize`
- Remove `use_padded_seq` from `MegatronEngineConfig` and regenerate CLI docs
- Warn once when bridge weight-sync falls back to the registry path
- Drop the test-runner override map and example YAML flag
Refs: areal-project#13841
parent 6356c8e
8 files changed
Lines changed: 103 additions & 94 deletions
File tree
- areal
  - api
  - engine
    - core
- docs
  - en
  - zh
- examples/vlm
- tests
  - torchrun