
Commit d51694a

[2/N][Refactor][Quantization] clean quantization patch (#2785)
### What this PR does / why we need it?
The quantization patch is unused code, so this PR removes it.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Tested by CI.

- vLLM version: v0.10.1.1
- vLLM main: vllm-project/vllm@f4962a6

Signed-off-by: 22dimensions <[email protected]>
1 parent: cd88f89 · commit: d51694a

File tree

4 files changed: +2 −456 lines


tests/ut/quantization/test_func_wrapper.py

Lines changed: 0 additions & 134 deletions
This file was deleted.

vllm_ascend/ops/vocab_parallel_embedding.py

Lines changed: 1 addition & 0 deletions
```diff
@@ -97,6 +97,7 @@ def __init__(self,
 
         if params_dtype is None:
             params_dtype = torch.get_default_dtype()
+        self.params_dtype = params_dtype
         # Divide the weight matrix along the vocaburaly dimension.
         self.num_added_embeddings = self.num_embeddings - self.org_vocab_size
         self.num_embeddings_per_partition = divide(self.num_embeddings_padded,
```
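
For context, here is a minimal, self-contained sketch of why the embedding module should carry a `params_dtype` attribute itself: quantization-style weight creation typically reads the dtype back off the layer instance. All names below (`FakeQuantMethod`, `TinyEmbedding`) are illustrative assumptions, not vllm-ascend or vLLM APIs.

```python
# Minimal sketch (all names here are hypothetical, not vllm-ascend APIs):
# shows why the embedding module needs a `params_dtype` attribute once the
# old quantization patch no longer sets it.
from typing import Optional

import torch
import torch.nn as nn


class FakeQuantMethod:
    """Hypothetical quantization method that reads the dtype off the layer."""

    def create_weights(self, layer: nn.Module, num_embeddings: int,
                       embedding_dim: int) -> None:
        # Relies on `layer.params_dtype`, which the one-line diff above
        # now sets directly in the Ascend VocabParallelEmbedding override.
        weight = nn.Parameter(
            torch.empty(num_embeddings, embedding_dim,
                        dtype=layer.params_dtype),
            requires_grad=False,
        )
        layer.register_parameter("weight", weight)


class TinyEmbedding(nn.Module):
    """Stand-in for the patched embedding class."""

    def __init__(self, num_embeddings: int, embedding_dim: int,
                 params_dtype: Optional[torch.dtype] = None) -> None:
        super().__init__()
        if params_dtype is None:
            params_dtype = torch.get_default_dtype()
        self.params_dtype = params_dtype  # mirrors the added line
        FakeQuantMethod().create_weights(self, num_embeddings, embedding_dim)


emb = TinyEmbedding(8, 4, params_dtype=torch.float16)
print(emb.weight.dtype)  # torch.float16
```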

vllm_ascend/quantization/func_wrapper.py

Lines changed: 0 additions & 184 deletions
This file was deleted.
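
For readers unfamiliar with the deleted module, the sketch below shows the generic wrapper/monkey-patch pattern that a file like `func_wrapper.py` typically implements, and which this PR retires in favour of the direct assignment shown above. The names (`wrapper_embedding_init`, `ThirdPartyEmbedding`) are illustrative assumptions, not the deleted vllm-ascend code.

```python
# Generic sketch of the wrapper-patch pattern (illustrative only; the names
# below are assumptions and not taken from the deleted func_wrapper.py).
import functools
from typing import Callable

import torch


def wrapper_embedding_init(init_fn: Callable) -> Callable:
    """Wrap a third-party __init__ so it also stores params_dtype."""

    @functools.wraps(init_fn)
    def wrapper(self, *args, **kwargs):
        init_fn(self, *args, **kwargs)
        # Patch in the attribute the quantization code expects.
        if not hasattr(self, "params_dtype"):
            self.params_dtype = torch.get_default_dtype()

    return wrapper


class ThirdPartyEmbedding:
    """Stand-in for a class defined in an upstream library."""

    def __init__(self, num_embeddings: int) -> None:
        self.num_embeddings = num_embeddings


# Monkey-patch at import time, as such patch modules usually do.
ThirdPartyEmbedding.__init__ = wrapper_embedding_init(ThirdPartyEmbedding.__init__)

emb = ThirdPartyEmbedding(8)
print(emb.params_dtype)  # torch.float32 by default
```

Setting the attribute directly in the subclass, as the diff above does, avoids this kind of import-order-sensitive patching, which is presumably why the wrapper module could be dropped once it was no longer used.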

0 commit comments
