
Questions on deploying Quantized models ... #8213

Answered by kimishpatel
rvijayc asked this question in Q&A

So, when I use ExecuTorch partitioning, is the expectation that we pattern-match dequant -> opX -> quant for lowering into some fixed-point primitive supported on the backend?

That is correct. However, there is some work in progress to represent quantized ops via integer compute instead of the "dq -> op -> q" pattern. See here: https://pytorch.org/tutorials/prototype/pt2e_quant_ptq.html#convert-the-calibrated-model-to-a-quantized-model
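
For reference, a minimal sketch of the PT2E flow from that tutorial. The capture entry point has moved between PyTorch releases (older versions used torch._export.capture_pre_autograd_graph), and the XNNPACKQuantizer and use_reference_representation choices below are illustrative assumptions based on the linked page, not code from this thread:

```python
import torch
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

class Toy(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 8)

    def forward(self, x):
        return self.linear(x)

example_inputs = (torch.randn(1, 8),)
# Capture the model for quantization; requires a recent PyTorch release.
m = torch.export.export_for_training(Toy(), example_inputs).module()

quantizer = XNNPACKQuantizer().set_global(get_symmetric_quantization_config())
m = prepare_pt2e(m, quantizer)
m(*example_inputs)  # calibrate with representative inputs

# Default output: float ops wrapped in the "dq -> op -> q" pattern, which a
# backend partitioner can match and lower to a fixed-point kernel.
m = convert_pt2e(m)

# The WIP integer-compute ("reference") representation mentioned above:
# m = convert_pt2e(m, use_reference_representation=True)
```

With the default convert_pt2e, the graph keeps explicit quantize/dequantize nodes around float ops, which is exactly the pattern a backend partitioner matches; the reference representation instead expresses the op as integer arithmetic directly.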

Suppose I have a Python model of each fixed-point op, is there any straightforward way I can run the executorch program directly in Python by substituting the Python model for the corresponding lowered module? Since the graph schema is known, it should be possible to d…
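
For illustration only, a minimal sketch of loading and running a compiled .pte program from Python through the ExecuTorch pybindings; the "model.pte" path is a placeholder, and substituting a Python model for a lowered module would additionally require intercepting the delegate call, which this does not show:

```python
import torch
from executorch.extension.pybindings.portable_lib import _load_for_executorch

# Load a compiled ExecuTorch program; "model.pte" is a placeholder path.
program = _load_for_executorch("model.pte")
# forward() takes a sequence of input tensors and returns the outputs.
outputs = program.forward((torch.randn(1, 8),))
print(outputs[0])
```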

Category: Q&A
Labels: help wanted (Extra attention is needed); triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module); module: quantization (Issues related to quantization)
4 participants

This discussion was converted from issue #1141 on February 05, 2025 17:03.