feat(quantization): add ActivationRestrictedAsymmetric option #28237
Open
Rishi-Dave wants to merge 1 commit into microsoft:main from
…t8 zero-point snapping
When extra_options={"ActivationRestrictedAsymmetric": True} is passed to
quantize_static (or a QDQ config), uint8 activation zero-points are snapped
to 0 when rmin >= 0 (e.g. post-ReLU tensors) or 128 when rmin < 0. Scale
is recomputed so the dequantized range still covers [rmin, rmax] without
clipping.
- quant_utils: add snap_zero_point_to_uint8() helper (~28 LOC)
- base_quantizer: parse ActivationRestrictedAsymmetric extra-option flag
- onnx_quantizer: apply snap after compute_scale_zp in calc_quant_params
(uint8, non-symmetric activations only)
- qdq_quantizer: same snap in QDQ calc_quant_params path
- quantize: document new option in all four extra_options docstrings
- test_symmetric_flag: add TestRestrictedAsymmetricFlag (3 test methods)
Refs microsoft#21398
Description
Adds a new `ActivationRestrictedAsymmetric` extra-option to the Python quantization tools. When enabled, uint8 activation zero-points are snapped to either 0 (when `rmin >= 0`, e.g. post-ReLU/Sigmoid tensors) or 128 (when `rmin < 0`). The scale is recomputed so the dequantized range still covers `[rmin, rmax]` without clipping.

This restricted asymmetric mode is required by some hardware accelerators that only support these two zero-point values for uint8 quantization, without requiring the full restriction to symmetric (zero-point = 128 for all tensors).
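The snapping rule described above can be sketched as follows. This is a minimal illustration derived from the description, not the PR's actual `snap_zero_point_to_uint8` implementation: the names `sketch_snap_zero_point` and `dequantized_range`, the `(scale, zero_point)` return order, and the fallback for an all-zero range are assumptions made for the example.

```python
from typing import Tuple

UINT8_QMIN, UINT8_QMAX = 0, 255


def sketch_snap_zero_point(rmin: float, rmax: float) -> Tuple[float, int]:
    """Hypothetical sketch: return (scale, zero_point) with zero_point in {0, 128}."""
    if rmin > rmax:
        raise ValueError("rmin must not exceed rmax")

    if rmin >= 0:
        # Non-negative range (e.g. post-ReLU): zero_point = 0, so the
        # dequantized range is [0, 255 * scale] and must reach rmax.
        zero_point = 0
        scale = rmax / (UINT8_QMAX - zero_point)
    else:
        # Signed range: zero_point = 128, so the dequantized range is
        # [-128 * scale, 127 * scale]. Take the smallest scale that
        # covers both rmin and rmax.
        zero_point = 128
        scale = max(
            -rmin / (zero_point - UINT8_QMIN),
            rmax / (UINT8_QMAX - zero_point),
        )

    if scale == 0.0:
        # Assumption: avoid a zero scale when rmin == rmax == 0.
        scale = 1.0
    return scale, zero_point


def dequantized_range(scale: float, zero_point: int) -> Tuple[float, float]:
    """Real-valued interval representable with the given uint8 parameters."""
    low = (UINT8_QMIN - zero_point) * scale
    high = (UINT8_QMAX - zero_point) * scale
    return low, high
```

With zero-point 128 the grid is slightly asymmetric (128 steps below zero, 127 above), which is why the sketch takes the larger of the two candidate scales: choosing either one alone could clip the opposite end of `[rmin, rmax]`.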
Motivation and Context
Fixes #21398.
Existing options cover only fully symmetric (`ActivationSymmetric` → zero-point fixed at 128) or unrestricted asymmetric quantization. There was no mode that picks 0 or 128 per tensor based on its observed range.
Changes
- `quant_utils.py`: new `snap_zero_point_to_uint8(rmin, rmax)` helper.
- `base_quantizer.py`: parse new `ActivationRestrictedAsymmetric` extra-option.
- `onnx_quantizer.py` and `qdq_quantizer.py`: apply snap after `compute_scale_zp` in the activation path. Guarded on `quant_type == UINT8 and not symmetric`. Weight and int8 paths are untouched.
- `quantize.py`: document the new option in the four `extra_options` docstrings.
- `test_symmetric_flag.py`: new `TestRestrictedAsymmetricFlag` covering three cases (positive range → zp=0, signed range → zp=128, and option-disabled regression).
Testing
```
python -m pytest onnxruntime/test/python/quantization/test_symmetric_flag.py -v
```
All 7 tests pass (4 existing + 3 new). `lintrunner` is clean.