The nvidia/GLM-5-NVFP4 model with NVFP4 quantization cannot be launched using the SGLang engine. #21129

BigCousin-z · 2026-03-22T10:20:14Z


The model card at https://huggingface.co/nvidia/GLM-5-NVFP4 provides this reference example: "To serve this checkpoint with SGLang, you can start the docker lmsysorg/sglang:nightly-dev-cu13-20260305-33c92732 and run the sample command below:", but I cannot find this image.
Where can I find an SGLang image that works with GLM-5 NVFP4, such as lmsysorg/sglang:nightly-dev-cu13-20260305-33c92732?

b8zhong · 2026-03-23T03:07:52Z

Collaborator

@BigCousin-z Hi, this is now supported on main. You can try any nightly CU13 image from https://hub.docker.com/r/lmsysorg/sglang/tags.
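
For anyone who would rather not scroll the tags page by hand, here is a rough sketch of how the nightly CU13 tags could be listed programmatically. It assumes Docker Hub's public tag-listing endpoint for the lmsysorg/sglang repository, and it assumes that nightly CU13 tags contain both "nightly" and "cu13" in their names, as the tag quoted in the question does. The function names are made up for illustration and are not part of SGLang.

```python
import json
import urllib.request

# Docker Hub tag-listing endpoint for the repository linked above.
# The page_size value is an arbitrary choice for this sketch.
TAGS_URL = "https://hub.docker.com/v2/repositories/lmsysorg/sglang/tags?page_size=100"


def is_nightly_cu13(tag: str) -> bool:
    """Assumed convention: nightly CU13 tags contain 'nightly' and 'cu13'."""
    name = tag.lower()
    return "nightly" in name and "cu13" in name


def filter_nightly_cu13(tags):
    """Keep only the tag names that look like nightly CU13 builds."""
    return [t for t in tags if is_nightly_cu13(t)]


def fetch_tags(url: str = TAGS_URL, timeout: float = 10.0):
    """Fetch one page of tag names from Docker Hub (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    # Each entry in "results" describes one tag; only the name is needed here.
    return [item["name"] for item in payload.get("results", [])]
```

Calling `filter_nightly_cu13(fetch_tags())` should return candidate tag names from the first page of results; any of them can then be pulled as `lmsysorg/sglang:<tag>`. Older tags may have been removed from the registry, which would explain why the image named on the model card can no longer be found.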
