The nvidia/GLM-5-NVFP4 model with NVFP4 quantization cannot be launched using the SGLang engine. #21129
BigCousin-z
started this conversation in
General
@BigCousin-z Hi, this is now supported on main. You can try any nightly CU13 image from the tag list at https://hub.docker.com/r/lmsysorg/sglang/tags
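If it helps, here is a rough sketch for listing candidate tags without scrolling the web page. It assumes Docker Hub's public v2 tag-listing endpoint for the `lmsysorg/sglang` repository and that the nightly CU13 tags contain both `nightly` and `cu13` in their names, as the tag quoted in the model card does. The helper names `pick_nightly_cu13` and `fetch_tag_names` are made up for illustration.

```python
import json
import urllib.request

# Assumption: Docker Hub's public v2 endpoint for listing a repository's tags.
TAGS_URL = "https://hub.docker.com/v2/repositories/lmsysorg/sglang/tags?page_size=100"


def pick_nightly_cu13(tag_names):
    """Keep tags whose name contains both 'nightly' and 'cu13'.

    The result is in reverse lexicographic order, which puts later dates
    first only if the tags share the same date-stamped naming scheme.
    """
    matches = [t for t in tag_names if "nightly" in t and "cu13" in t]
    return sorted(matches, reverse=True)


def fetch_tag_names(url=TAGS_URL, max_pages=5):
    """Collect tag names page by page, following each response's 'next' link."""
    names = []
    while url and max_pages > 0:
        with urllib.request.urlopen(url, timeout=30) as resp:
            page = json.load(resp)
        names.extend(item["name"] for item in page.get("results", []))
        url = page.get("next")  # None on the last page
        max_pages -= 1
    return names
```

Calling `pick_nightly_cu13(fetch_tag_names())` should return the matching tag names from the first few pages; any of them can then be pulled as `lmsysorg/sglang:<tag>`. Whether a given nightly actually loads the NVFP4 checkpoint still has to be checked by running it.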
The model card at https://huggingface.co/nvidia/GLM-5-NVFP4 provides this reference example: “To serve this checkpoint with SGLang, you can start the docker lmsysorg/sglang:nightly-dev-cu13-20260305-33c92732 and run the sample command below:”, but I cannot find this image.
Where can I find an SGLang image that works with GLM-5 NVFP4, such as lmsysorg/sglang:nightly-dev-cu13-20260305-33c92732?