mlc-ai / mlc-llm Public

Fork 1.8k
Star 21.3k

Pull requests: mlc-ai/mlc-llm

23 Open 1,704 Closed

Pull requests list

Bump flashinfer-python CUDA 13

#3327 opened Sep 7, 2025 by johnnynunez

NUMA-aware tensor parallelism for CPU inference

#3320 opened Aug 30, 2025 by MagellaX

Add sequence padding to BeginForward

#3314 opened Aug 25, 2025 by joshua-j-hong

[Model] Updated model preset with more models

#3313 opened Aug 25, 2025 by harrywhoo

Fix supported platforms

#3298 opened Aug 3, 2025 by zxcat

Add API Key Authentication For openai_entrypoints

#3297 opened Aug 2, 2025 by rankaiyx

Add ArceeForCausalLM support

#3294 opened Jul 27, 2025 by bartowski1182

LoRA Adapter Integration for MLC-LLM: Complete Runtime Support and Compilation Pipeline

#3281 opened Jul 11, 2025 by MagellaX

Fix: Resolve pylint import errors and other warnings

#3265 opened Jun 27, 2025 by Mirza-Samad-Ahmed-Baig

Add Comprehensive QAT Training Framework for MLC-LLM

#3258 opened Jun 23, 2025 by alohachen

7 of 9 tasks

Perf: load weights, create KV cache, initialize tokenizer in parallel

#3215 opened Apr 27, 2025 by Bekaboo

[Refactor] PagedKVCache spec for MLC-LLM

#3203 opened Apr 14, 2025 by annanyapr

[Serving] Support tool function calls under strict format constraints

#3190 opened Mar 26, 2025 by Irfnfnkemed

Refactored random.h to have PhiloxRandomGenerator

#3181 opened Mar 18, 2025 by annanyapr

[Model] Qwen-2-VL Support

#3125 opened Feb 10, 2025 by nihalgeorge01 • Draft

[CPP_CLI] MLC Cli App over JSONEngine interface

#3114 opened Jan 30, 2025 by srkreddy1238

[Bench] Add support for multiple backend

#3037 opened Nov 20, 2024 by cyx-6 • Draft

[SERVE][CPP][Android] add native executable program to benchmark models

#2987 opened Oct 18, 2024 by pfk-beta

[Model] Add use_qk_norm option for Cohere model

#2877 opened Sep 2, 2024 by tlopex

[Serving] PagedKVCache Quantization

#2663 opened Jul 16, 2024 by davidpissarra

[Bench] Add bench for GSM8K eval

#2585 opened Jun 16, 2024 by Hzfengsy

[Bench] Add bench for MMLU eval

#2584 opened Jun 16, 2024 by Hzfengsy

Add docker container support

#1271 opened Nov 15, 2023 by Sing-Li
