feat: support zero-point decompression for asymmetric quantization (packed) #463
base: main
Conversation
- Fix the decompress_weight method in PackedQuantizationCompressor to support unpacking zero-points
- Add comprehensive tests for zero-point packing/unpacking with GROUP and CHANNEL strategies
- Add end-to-end integration tests for the asymmetric quantization workflow
- Ensure packed tensors are contiguous for safetensors compatibility

Resolves issue referenced in vllm-project/llm-compressor#1704
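For context, here is a minimal sketch of the kind of nibble unpacking that zero-point decompression relies on. The function name, the eight-values-per-int32 layout, and the lowest-bits-first ordering are illustrative assumptions, not the compressor's actual implementation:

```python
import torch


def unpack_int4_zero_points(packed: torch.Tensor, num_cols: int) -> torch.Tensor:
    """Unpack int32 words, each holding eight 4-bit values, into one value per column.

    Hypothetical layout: value i of a word occupies bits [4*i, 4*i + 4).
    """
    shifts = torch.arange(0, 32, 4, device=packed.device)  # bit offsets of the 8 nibbles
    nibbles = (packed.unsqueeze(-1) >> shifts) & 0xF        # extract each 4-bit field
    nibbles = nibbles.reshape(packed.shape[0], -1)[:, :num_cols]
    # Any sign/offset correction (e.g. shifting back to a signed range) depends on
    # the quantization scheme and is omitted here.
    return nibbles.to(torch.int8)
```

With the zero-points unpacked, asymmetric dequantization proceeds roughly as `(q - zero_point) * scale` per group or channel.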
…move manual creation
…v similarity; cleanup temp usage
Force-pushed from 2cf7124 to 126fc89.
I've addressed all your concerns.
Overall looks good!
Some questions about the e2e test, but otherwise it looks great!
Review comments on tests/test_compressors/quantized_compressors/test_packed_asym_decompression.py were resolved.
This looks good pending @dsikka's comments, thanks for updating!
LGTM pending @dsikka's comments! Thank you for this contribution.
Hi @Etelis - do you think you'll get a chance to address the remaining comments to get this PR over the line?

Yes, yes, sorry.
Addressed review comments:
- Switched to in-memory methods: replaced compressor.compress() / compressor.decompress() (disk-based) with ModelCompressor.compress_model() / decompress_model() (in-memory).
- Removed manual parameter registration: the manual register_parameter() logic is no longer needed since compress_model() / decompress_model() handle parameter management internally.

All 4 test cases pass.
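A rough sketch of the in-memory round-trip these tests now perform is below; the ModelCompressor constructor arguments, import path, and helper name are assumptions for illustration, not a verbatim excerpt of the test:

```python
import torch
from compressed_tensors.compressors import ModelCompressor


def roundtrip_in_memory(model: torch.nn.Module, quantization_config) -> torch.nn.Module:
    """Compress and decompress a quantized model entirely in memory (hypothetical helper)."""
    compressor = ModelCompressor(quantization_config=quantization_config)  # assumed constructor signature
    compressor.compress_model(model)    # packs weights (and zero-points) on the modules in place
    compressor.decompress_model(model)  # unpacks them back into dense parameters
    return model
```

A test built on this would then check that the decompressed weights match the fake-quantized originals within quantization tolerance.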
Thanks for the contribution! I think these changes make sense. Do checkpointed models load in vllm?
Please run make style and make quality.
Thank you!
Do you mind running make style and make quality to address the quality issues?
We can land after those are addressed.
Signed-off-by: Brian Dellabetta <[email protected]>
3ffb213
I ran the style fixes.
resolved merge conflict
Signed-off-by: Brian Dellabetta <[email protected]>
Follow-up to PR #459 (which was closed due to an accidental history rewrite). The branch has been restored to the pre-rewrite tip (2cf7124).
This PR refs vllm-project/llm-compressor#1704.