Conversation


@Etelis commented on Sep 11, 2025

Follow-up to PR #459 (which was closed due to an accidental history rewrite). The branch has been restored to the pre-rewrite tip (2cf7124).

This PR:

  • Implements zero-point decompression for asymmetric quantization in the packed compressor
  • Adds tests (in-memory decompress_model, calibration fixtures)
  • Uses a standard-deviation-based similarity threshold for reconstruction

Refs vllm-project/llm-compressor#1704.

- Fix the decompress_weight method in PackedQuantizationCompressor to support unpacking zero-points
- Add comprehensive tests for zero-point packing/unpacking with GROUP and CHANNEL strategies
- Add end-to-end integration tests for the asymmetric quantization workflow
- Ensure packed tensors are contiguous for safetensors compatibility (see the sketch below)

Resolves the issue referenced in vllm-project/llm-compressor#1704.
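
For context, here's a minimal sketch of the 4-bit pack/unpack round trip the fix relies on. The helper names (pack_rows, unpack_rows) are hypothetical, not the compressed-tensors API; the point is that for asymmetric schemes the zero-points are packed the same way as the weights, so decompress_weight has to unpack them before dequantizing:

```python
# Hypothetical sketch: pack_rows / unpack_rows are illustrative names, not
# the compressed-tensors API. Values are assumed to be unsigned 4-bit
# integers held in a wider dtype, packed eight per int32.
import torch

def pack_rows(x: torch.Tensor) -> torch.Tensor:
    assert x.shape[-1] % 8 == 0, "pad columns to a multiple of 8 first"
    x = x.to(torch.int32) & 0xF
    packed = torch.zeros(*x.shape[:-1], x.shape[-1] // 8, dtype=torch.int32)
    for i in range(8):
        packed |= x[..., i::8] << (4 * i)  # place each value in nibble i
    return packed.contiguous()  # safetensors requires contiguous tensors

def unpack_rows(packed: torch.Tensor, num_cols: int) -> torch.Tensor:
    out = torch.empty(*packed.shape[:-1], num_cols, dtype=torch.int32)
    for i in range(8):
        out[..., i::8] = (packed >> (4 * i)) & 0xF  # read nibble i back out
    return out

# Asymmetric dequantization then shifts by the unpacked zero-point:
#   weight ≈ (q - zero_point) * scale
# and the tests accept the reconstruction when the error is small relative
# to the weight's standard deviation.
```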
@Etelis force-pushed the fix/zero-point-decompression branch from 2cf7124 to 126fc89 on September 11, 2025 14:57
@Etelis (Author) commented on Sep 11, 2025

@dsikka, @brian-dellabetta,

I've addressed all your concerns.
Sorry about that.

@dsikka (Collaborator) left a comment

Overall looks good!

Some questions about the e2e test, but otherwise it looks great!

@brian-dellabetta (Collaborator) left a comment

This looks good pending @dsikka's comments, thanks for updating!

@rahul-tuli (Collaborator) left a comment

LGTM pending @dsikka's comments! Thank you for this contribution.

@dsikka (Collaborator) commented on Oct 14, 2025

Hi @Etelis - do you think you'll get a chance to address the remaining comments to get this PR over the line?

@Etelis (Author) commented on Oct 15, 2025

Yes, sorry.
I'm on vacation until the 19th; I'll finish by the 20th.
Promise!

@Etelis (Author) commented on Oct 20, 2025

@dsikka

Addressed review comments:

Switched to in-memory methods: Replaced compressor.compress() / compressor.decompress() (disk-based) with ModelCompressor.compress_model() / decompress_model() (in-memory).

Removed manual parameter registration: The manual register_parameter() logic is no longer needed since compress_model() / decompress_model() handle parameter management internally.

All four test cases pass.
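
For reference, here's a minimal sketch of that in-memory round trip. compress_model / decompress_model are the methods named above; the ModelCompressor construction, import path, and quantization config are assumptions standing in for the actual test fixtures:

```python
import torch
from compressed_tensors.compressors import ModelCompressor  # import path assumed

model = torch.nn.Sequential(torch.nn.Linear(64, 64))

# `quant_config` stands in for an asymmetric int4 QuantizationConfig like the
# one the calibration fixtures build; its construction is omitted here.
compressor = ModelCompressor(quantization_config=quant_config)

compressor.compress_model(model)    # quantize + pack parameters in place
compressor.decompress_model(model)  # unpack + dequantize in place

# No manual module.register_parameter(...) calls are needed: the in-memory
# methods handle parameter (re)registration internally.
```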

@brian-dellabetta (Collaborator) left a comment

Thanks for the contribution! I think these changes make sense. Do checkpointed models load in vLLM?

@brian-dellabetta (Collaborator) commented

Please run make style / make quality.

@dsikka previously approved these changes on Oct 23, 2025
@dsikka (Collaborator) left a comment

Thank you!
Do you mind running make style and make quality to address the quality issues?
We can land once those are addressed.

@brian-dellabetta dismissed stale reviews from dsikka and themself via 3ffb213 on October 23, 2025 17:36
@brian-dellabetta (Collaborator) left a comment

I ran the style fixes.

@brian-dellabetta (Collaborator) left a comment

Resolved the merge conflict.
