[2/N][Refactor][Quantization] clean quantization patch #2785
Conversation
Code Review
This pull request provides a good cleanup by removing the unused quantization patching mechanism. The refactoring simplifies the codebase by deleting the func_wrapper.py file, its corresponding tests, and the complex monkey-patching logic in utils.py. Moving the necessary functionality from the patch directly into the AscendVocabParallelEmbedding class is a solid improvement for maintainability. I have one suggestion to complete the cleanup.
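For context on why this kind of refactor helps, here is a minimal sketch of the before/after pattern the review describes. It is not the actual vllm-ascend code: the class bodies and method signatures are simplified stand-ins, and only the names VocabParallelEmbedding, AscendVocabParallelEmbedding, and func_wrapper.py come from this PR.

```python
# Minimal sketch of the refactor pattern (hypothetical, simplified code;
# the real vLLM / vllm-ascend classes and signatures differ).

class VocabParallelEmbedding:
    """Stand-in for vLLM's base embedding layer."""

    def forward(self, input_ids):
        return f"base lookup of {input_ids}"


# Before: a patch module rebinds the method at import time. The override
# is invisible at the call site and easy to leave stale as upstream changes.
def _patched_forward(self, input_ids):
    return f"patched lookup of {input_ids}"

# VocabParallelEmbedding.forward = _patched_forward  # the old monkey patch


# After: the behavior lives on an explicit subclass, so it is visible,
# independently testable, and scoped to the Ascend backend.
class AscendVocabParallelEmbedding(VocabParallelEmbedding):
    def forward(self, input_ids):
        # logic moved here from the deleted func_wrapper.py-style patch
        return f"ascend lookup of {input_ids}"


layer = AscendVocabParallelEmbedding()
print(layer.forward([1, 2, 3]))  # -> ascend lookup of [1, 2, 3]
```

The subclass version also lets the dead patch module and its tests be deleted outright, which is where the -241 lines in the coverage report below come from.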
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge: if CI fails, you can run the linting and testing checks locally according to Contributing and Testing.
Codecov Report

✅ All modified and coverable lines are covered by tests.

```diff
@@            Coverage Diff             @@
##             main    #2785      +/-   ##
==========================================
+ Coverage   72.61%   72.82%   +0.20%
==========================================
  Files         154      152       -2
  Lines       21318    21077     -241
==========================================
- Hits        15481    15349     -132
+ Misses       5837     5728     -109
==========================================
```
Signed-off-by: 22dimensions <[email protected]>

Force-pushed from 397fcb5 to b052105.
### What this PR does / why we need it?

The quantization patch is unused code, so this PR removes it.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Tested by CI.

- vLLM version: v0.10.1.1
- vLLM main: vllm-project/vllm@f4962a6