Support AWQ quantization with bias
When the AWQ quantizer was used with a layer that has a bias, the
bias tensor was not passed through correctly. Instead, the boolean
`bias is not None` (i.e. `True`, which broadcasts as `1.0`) was added
to the output of the linear transformation.

Correctly pass through the bias when it is not `None`.

Fixes #2106.
danieldk committed Jun 25, 2024
1 parent fc9c315 commit f50c401
Showing 2 changed files with 2 additions and 2 deletions.
@@ -35,7 +35,7 @@ def __init__(self, w_bit, group_size, qweight, qzeros, scales, bias):
         self.qweight = qweight
         self.qzeros = qzeros
         self.scales = scales
-        if bias:
+        if bias is not None:
             self.bias = bias
         else:
             self.bias = None
server/text_generation_server/layers/linear.py (1 addition, 1 deletion)

@@ -217,7 +217,7 @@ def get_linear(weight, bias, quantize):
                 qweight=weight.qweight,
                 qzeros=weight.qzeros,
                 scales=weight.scales,
-                bias=bias is not None,
+                bias=bias,
             )
         except ImportError:
             raise NotImplementedError(
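The effect of the bug can be sketched in plain Python (no framework; the `linear` helper below is a simplified stand-in for the quantized layer, not the actual implementation): because Python's `True` behaves as the integer `1`, passing the flag `bias is not None` instead of the tensor silently shifts every output by `1.0`.

```python
def linear(x, w, bias=None):
    # y = x . w (+ bias), for a 1-D input x and 2-D weights w
    # (one row of w per output feature).
    y = [sum(xi * wi for xi, wi in zip(x, row)) for row in w]
    if bias is not None:
        return [yi + b for yi, b in zip(y, bias)]
    return y

x = [1.0, 2.0]
w = [[1.0, 1.0], [2.0, 0.0]]
bias = [0.5, 0.5]

# Buggy behaviour: the flag `bias is not None` evaluates to True,
# which Python treats as 1, so 1.0 is added to every output.
buggy = [yi + (bias is not None) for yi in linear(x, w)]

# Fixed behaviour: the bias tensor itself is passed through.
correct = linear(x, w, bias)

print(buggy)    # [4.0, 3.0] -- every output shifted by 1.0
print(correct)  # [3.5, 2.5]
```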
