Pull requests: vllm-project/vllm-ascend
Fix the bug that causes Eagle3 inference failures under high concurrency and improve the acceptance rate of draft models
#2794
opened Sep 7, 2025 by
liumail680
replace npu_incre_flash_attention with npu_fused_infer_attention_score
#2792
opened Sep 6, 2025 by
panchao-hub
refactor fused_moe.py
module:core
module:ops
module:quantization
module:tests
#2791
opened Sep 6, 2025 by
Pr0Wh1teGivee
Deepseek Mtp model uses the lm_head and embedding from the main model
module:tests
#2790
opened Sep 5, 2025 by
zzhx1
[2/N][Refactor][Quantization] clean quantization patch
module:ops
module:quantization
module:tests
#2785
opened Sep 5, 2025 by
22dimensions
[Perf][V1] Fully overlap model execution
merge-conflicts
#2783
opened Sep 5, 2025 by
jiangpeng36
Remove chunked_prefill_for_mla and fix ring_mla bug
documentation
module:core
#2781
opened Sep 5, 2025 by
SunnyLee151064
support qwen25 vl w8a8 quantization
module:quantization
module:tests
#2778
opened Sep 5, 2025 by
wenba0
support qwen25 vl w8a8 quantization
module:quantization
module:tests
#2777
opened Sep 5, 2025 by
wenba0
Refactor the Spec decode module to merge MTP non-torchair and eagle modes into one file, separating torchair and non-torchair modes #2773
#2776
opened Sep 5, 2025 by
weisirui-eng
[main] addrmsnorm + quant fusion optim in Qwen Models
module:core
module:ops
#2772
opened Sep 5, 2025 by
rjg-lyh
Install vllm from source to make doctest passed
documentation
module:tests
[Fix] Ensure metadata sync across DP ranks in eager mode
#2766
opened Sep 5, 2025 by
yiz-liu
[main] mlp weight prefetch in Qwen Dense Models
merge-conflicts
module:core
module:ops
module:tests
#2762
opened Sep 4, 2025 by
rjg-lyh
[main] add pd transfer for ascend scheduler
module:ops
module:tests
#2753
opened Sep 4, 2025 by
Liccol
[Feat] communication optimization for mc2 ops on A2
module:ops
module:tests
#2752
opened Sep 4, 2025 by
realliujiaxu
[Feat]support dynamic quantization in allgather
module:tests
#2747
opened Sep 4, 2025 by
WithHades
[bugfix] fix deepseek rope sincoscache re-generation
module:ops
module:tests
#2744
opened Sep 4, 2025 by
zzzzwwjj