@brian-dellabetta (Contributor) commented Aug 21, 2025

To support multi-modifier recipes (e.g. AWQ+W4A16 on self_attn layers and FP8_DYNAMIC on mlp layers), quantization config and status must be applied only to the modules scoped to each modifier, not to the whole model at once. This PR updates apply_quantization_config so that quantization_config and quantization_status are set only on the targeted modules, rather than changed globally across all modules.

To ensure proper target prioritization, apply_quantization_status now runs regardless of the model's current status. Without these changes, test_target_prioritization fails.
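
A minimal sketch of the resulting behavior (the tiny model, scheme fields, and config values below are illustrative assumptions, not verbatim library usage):

```python
import torch
from compressed_tensors.quantization import (
    QuantizationConfig,
    apply_quantization_config,
)

class Block(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.self_attn = torch.nn.Linear(64, 64)
        self.mlp = torch.nn.Linear(64, 64)

model = Block()

# int4 weight-only on attention projections (W4A16-style)
attn_config = QuantizationConfig(
    config_groups={
        "group_0": {
            "targets": ["re:.*self_attn.*"],
            "weights": {"num_bits": 4, "type": "int", "symmetric": True},
        }
    }
)

# fp8 weights + dynamic fp8 activations on MLP layers (FP8_DYNAMIC-style)
mlp_config = QuantizationConfig(
    config_groups={
        "group_0": {
            "targets": ["re:.*mlp.*"],
            "weights": {"num_bits": 8, "type": "float"},
            "input_activations": {
                "num_bits": 8,
                "type": "float",
                "strategy": "token",
                "dynamic": True,
            },
        }
    }
)

# With this PR, each call initializes quantization config/status only on
# the modules matched by its targets; the second call no longer overwrites
# the scheme the first call attached to self_attn.
apply_quantization_config(model, attn_config)
apply_quantization_config(model, mlp_config)
```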

Other small changes:

  • Added a test_multi_apply_quantization_config test to confirm that applying multiple quantization configs in series works correctly -- shapes are correct and unused parameters are removed.
  • Dropped override_quantization_status in favor of the more general patch_attr (see the sketch after this list).
  • Removed infer_quantization_status, which is no longer meaningful at the model level and is no longer needed because a module's current status isn't checked.
  • Added an ALL_QPARAM_NAMES constant so that quantization-related parameters can be cleared from modules during initialization.
  • Removed all references to "quant_method": "sparseml" in favor of "compressed-tensors".
  • Dropped usage of compress_quantized_weights and apply_quantization_status; compress_quantized_weights and its references in examples/notebooks can be removed in a follow-up PR.
  • Also updated tests to get rid of warnings such as the one below (a warning-free construction is sketched after this list):
tests/test_compressors/quantized_compressors/test_fp8_quant.py::test_quant_format[channel-None-sc2-zp2]
  /home/runner/work/compressed-tensors/compressed-tensors/tests/test_compressors/quantized_compressors/test_fp8_quant.py:78: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.detach().clone() or sourceTensor.detach().clone().requires_grad_(True), rather than torch.tensor(sourceTensor).
    "dummy.weight_scale": torch.tensor(sc, dtype=torch.float32),

Merge in conjunction with

@brian-dellabetta changed the title from "[Mulit-Modifier] Scoped apply quantization config" to "[Multi-Modifier] Scoped apply quantization config" on Aug 21, 2025
@brian-dellabetta force-pushed the bdellabe/scoped-quant-status branch 2 times, most recently from 03fb664 to 550c0ad on August 21, 2025 19:18
@brian-dellabetta force-pushed the bdellabe/scoped-quant-status branch from f70aedb to 606f177 on August 21, 2025 19:38
@kylesayrs (Contributor) commented Aug 25, 2025

FYI #428, which also touches some of the apply logic and adds more scheme merging.

@brian-dellabetta force-pushed the bdellabe/scoped-quant-status branch from 24af65a to 8259cbb on August 28, 2025 17:00
@brian-dellabetta force-pushed the bdellabe/scoped-quant-status branch from 8259cbb to b515c1b on August 28, 2025 17:00
kylesayrs previously approved these changes Sep 12, 2025
@kylesayrs left a comment:
I don't think a warning is necessary if the schemes overwrite. Looks good to me.

@brian-dellabetta force-pushed the bdellabe/scoped-quant-status branch from 92f8757 to d2903a1 on September 15, 2025 17:20
kylesayrs previously approved these changes Sep 15, 2025
@kylesayrs left a comment:
LGTM!

rahul-tuli (Member) previously approved these changes Sep 16, 2025
@rahul-tuli left a comment:
Good job! LGTM! 🚀

@brian-dellabetta force-pushed the bdellabe/scoped-quant-status branch from f7239b1 to 01af659 on September 18, 2025 16:48
kylesayrs previously approved these changes Sep 18, 2025
@dsikka dsikka merged commit dfd069b into main Sep 18, 2025
2 checks passed
@dsikka dsikka deleted the bdellabe/scoped-quant-status branch September 18, 2025 19:10