
Conversation

@andrea-fasoli
Collaborator

Contributes the first unit test for INT8, plus cleanup.
Quantization configuration tested: per-tensor weights, per-tensor activations, no SmoothQuant.

Related issue number

n/a

Was the PR tested

  • I have added >=1 unit test(s) for every new method I have added.
  • I have ensured all unit tests pass.

Signed-off-by: Andrea Fasoli <[email protected]>
@chichun-charlie-liu chichun-charlie-liu changed the title Unit test int8 test: Unit test int8 Feb 14, 2025
@github-actions github-actions bot added the test label Feb 14, 2025
  scale_x = 127 / a_cv
- x_int = torch.round(x / sq * scale_x).clamp(-127, 127)
+ x_int = torch.round(x / sq * scale_x).clamp(-127, 127).to(torch.int8)
  return x_int / scale_x * sq
Collaborator


Is this type cast really necessary? The next line divides by a float, which should automatically upcast the result again.

Collaborator Author


It is not needed. I added the cast during debugging, while chasing numerical discrepancies between this function's output and a reference output. I will remove it.
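As a quick sanity check of that point, here is a minimal PyTorch sketch (the tensor values are made up; only the dtypes matter) showing that true division of an int8 tensor by a Python float promotes the result back to float32, so the explicit cast is indeed undone on the next line:

```python
import torch

# Hypothetical int8 values standing in for the quantized activations.
x_int = torch.tensor([-5, 0, 7], dtype=torch.int8)
scale_x = 0.5  # a Python float scale

# True division of an integer tensor by a float scalar promotes the
# result to the default floating dtype (float32), discarding the
# effect of any earlier .to(torch.int8) cast.
y = x_int / scale_x
print(y.dtype)  # torch.float32
```

So the cast only matters if the integer representation itself needs to be materialized (e.g. handed to an integer kernel), not for the dequantize-on-the-next-line pattern here.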

@andrea-fasoli
Collaborator Author

The purpose of this unit test is to compare the output of the custom INT8 op for AIU against a reference, to ensure the operation remains correct if it is altered in the future.
The operation consists of:

  1. unpacking the qdata tensor containing all quantization metadata
  2. dequantizing the integer weights
  3. quantizing and dequantizing the input activations
  4. a matmul between the dequantized weights and the dequantized activations

However, I found the output of this operation to be very sensitive to the quantization process: even changing the order of divisions and multiplications (nominally equivalent, but different in practice due to the precision used) caused the test to fail. I am not sure yet how to set a meaningful threshold for passing this test.
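For reference, steps 2-4 could be sketched roughly as below. This is an assumption-laden sketch, not the actual op: the function and variable names are made up, the qdata unpacking of step 1 is elided, and per-tensor symmetric quantization is assumed throughout.

```python
import torch

def fake_quant_matmul_ref(x, w_int, w_scale, a_cv, sq):
    """Hypothetical reference for the fake-quantized matmul (names assumed).

    x      -- float activations, shape (batch, in_features)
    w_int  -- int8 weights, shape (out_features, in_features)
    w_scale -- per-tensor weight scale (float)
    a_cv   -- activation clip value (per-tensor)
    sq     -- smoothquant-style per-channel scaling (ones if disabled)
    """
    # Step 2: dequantize integer weights.
    w_dq = w_int.to(torch.float32) * w_scale

    # Step 3: quantize-dequantize activations, per tensor.
    scale_x = 127 / a_cv
    x_int = torch.round(x / sq * scale_x).clamp(-127, 127)
    x_dq = x_int / scale_x * sq

    # Step 4: matmul of dequantized activations and weights.
    return x_dq @ w_dq.t()
```

On the tolerance question: since reordering nominally equivalent float ops perturbs the result, one option is to compare against the custom op with `torch.testing.assert_close(out, ref, rtol=..., atol=...)` and pick tolerances empirically from the observed spread across reorderings, rather than requiring exact equality.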

Signed-off-by: chichun-charlie-liu <[email protected]>
@chichun-charlie-liu chichun-charlie-liu merged commit 02f5ff3 into foundation-model-stack:main Feb 19, 2025
11 checks passed
