Pull requests: mlc-ai/mlc-llm

Pull requests list

Bump flashinfer-python CUDA 13
#3327 opened Sep 7, 2025 by johnnynunez
NUMA-aware tensor parallelism for CPU inference
#3320 opened Aug 30, 2025 by MagellaX
Add sequence padding to BeginForward
#3314 opened Aug 25, 2025 by joshua-j-hong
[Model] Updated model preset with more models
#3313 opened Aug 25, 2025 by harrywhoo
Fix supported platforms
#3298 opened Aug 3, 2025 by zxcat
Add API Key Authentication For openai_entrypoints
#3297 opened Aug 2, 2025 by rankaiyx
Add ArceeForCausalLM support
#3294 opened Jul 27, 2025 by bartowski1182
Add Comprehensive QAT Training Framework for MLC-LLM
#3258 opened Jun 23, 2025 by alohachen
7 of 9 tasks
[Refactor] PagedKVCache spec for MLC-LLM
#3203 opened Apr 14, 2025 by annanyapr
Refactored random.h to have PhiloxRandomGenerator
#3181 opened Mar 18, 2025 by annanyapr
[Model] Qwen-2-VL Support
#3125 opened Feb 10, 2025 by nihalgeorge01 Draft
[Bench] Add support for multiple backend
#3037 opened Nov 20, 2024 by cyx-6 Draft
[Model] Add use_qk_norm option for Cohere model
#2877 opened Sep 2, 2024 by tlopex
[Serving] PagedKVCache Quantization
#2663 opened Jul 16, 2024 by davidpissarra
[Bench] Add bench for GSM8K eval
#2585 opened Jun 16, 2024 by Hzfengsy
[Bench] Add bench for MMLU eval
#2584 opened Jun 16, 2024 by Hzfengsy
Add docker container support
#1271 opened Nov 15, 2023 by Sing-Li