Fix KeyError in InsertIOQDQ pass for LLM quantization #17194
```diff
@@ -31,11 +31,16 @@ class InsertIOQDQ(ExportPass):
     """

     q_dq_map = {
-        # per tensor
+        # per tensor (quantize -> dequantize)
         exir_ops.edge.quantized_decomposed.quantize_per_tensor.default: exir_ops.edge.quantized_decomposed.dequantize_per_tensor.tensor,
         exir_ops.edge.quantized_decomposed.quantize_per_tensor.tensor: exir_ops.edge.quantized_decomposed.dequantize_per_tensor.tensor,
-        # per channel
+        # per tensor (dequantize -> dequantize, for nodes with dequantize encoding)
+        exir_ops.edge.quantized_decomposed.dequantize_per_tensor.default: exir_ops.edge.quantized_decomposed.dequantize_per_tensor.default,
+        exir_ops.edge.quantized_decomposed.dequantize_per_tensor.tensor: exir_ops.edge.quantized_decomposed.dequantize_per_tensor.tensor,
+        # per channel (quantize -> dequantize)
         exir_ops.edge.quantized_decomposed.quantize_per_channel.default: exir_ops.edge.quantized_decomposed.dequantize_per_channel.default,
+        # per channel (dequantize -> dequantize, for nodes with dequantize encoding)
+        exir_ops.edge.quantized_decomposed.dequantize_per_channel.default: exir_ops.edge.quantized_decomposed.dequantize_per_channel.default,
     }

     def __init__(self, edge_program: torch.export.ExportedProgram):
```
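For context, the failure mode named in the title can be sketched with a toy: before this change, `q_dq_map` had no dequantize keys, so looking up an I/O node that carries a dequantize encoding presumably raised `KeyError`. The snippet below uses plain strings in place of the real `exir_ops` objects, and the lookup site is an assumption, not the actual pass code:

```python
# Toy reproduction (hypothetical): string targets stand in for exir_ops objects.
# Only the map contents mirror the diff above; everything else is illustrative.
old_map = {
    "quantize_per_tensor.default": "dequantize_per_tensor.tensor",
    "quantize_per_tensor.tensor": "dequantize_per_tensor.tensor",
    "quantize_per_channel.default": "dequantize_per_channel.default",
}
new_map = {
    **old_map,
    # dequantize -> dequantize entries added by this PR
    "dequantize_per_tensor.default": "dequantize_per_tensor.default",
    "dequantize_per_tensor.tensor": "dequantize_per_tensor.tensor",
    "dequantize_per_channel.default": "dequantize_per_channel.default",
}

target = "dequantize_per_tensor.default"  # e.g. an LLM graph I/O node's target
try:
    old_map[target]
except KeyError:
    print(f"old map: KeyError for {target!r}")        # the reported failure
print(f"new map: {target!r} -> {new_map[target]!r}")  # resolves after the fix
```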
Review comment on lines +37 to +43:
Adding dequantize ops as keys in `q_dq_map` changes `_create_node()` behavior: it checks `if target in self.q_dq_map` to decide when to pop `QCOM_QUANT_ATTRS` and cast `meta['val']` to the quantized dtype. After this change, inserted dequantize nodes (e.g. `dequantize_per_tensor.tensor` / `dequantize_per_channel.default`) will also satisfy that condition, causing their `meta['val']` dtype to be incorrectly cast to the quantized dtype and moving `QCOM_QUANT_ATTRS` off the original node. The special case should apply only to quantize ops; consider switching the check to `if target in q_ops` (or an explicit quantize-op set) so output dequant nodes keep a float `meta['val']` and don't steal the original node's quant metadata.
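A minimal sketch of that suggestion, assuming a toy shape for `_create_node()` (string targets and a plain-dict `meta` stand in for the real FX node; only the membership check mirrors the comment above):

```python
# Sketch of the suggested fix: gate the special case on an explicit
# quantize-op set (q_ops) instead of q_dq_map, which now also holds
# dequantize keys. All names here are illustrative stand-ins.
QCOM_QUANT_ATTRS = "qcom_quant_attrs"  # stand-in for the real metadata key

q_dq_map = {
    "quantize_per_tensor.tensor": "dequantize_per_tensor.tensor",
    "dequantize_per_tensor.tensor": "dequantize_per_tensor.tensor",
}
q_ops = {"quantize_per_tensor.tensor"}  # explicit quantize-only set

def create_node(node_meta, target):
    """Toy _create_node(): returns the inserted node's meta."""
    inserted_meta = {"val_dtype": node_meta["val_dtype"]}
    # Fixed check: `target in q_ops`, not `target in q_dq_map`, so inserted
    # dequantize nodes keep a float meta['val'] and QCOM_QUANT_ATTRS stays
    # on the original node.
    if target in q_ops:
        quant_attrs = node_meta.pop(QCOM_QUANT_ATTRS)
        inserted_meta["val_dtype"] = quant_attrs["dtype"]
    return inserted_meta

node_meta = {"val_dtype": "float32", QCOM_QUANT_ATTRS: {"dtype": "uint16"}}
dq_meta = create_node(node_meta, "dequantize_per_tensor.tensor")
assert dq_meta["val_dtype"] == "float32"  # output dequant stays float
assert QCOM_QUANT_ATTRS in node_meta      # original node keeps its attrs
q_meta = create_node(node_meta, "quantize_per_tensor.tensor")
assert q_meta["val_dtype"] == "uint16"    # quantize path casts the dtype
```

With the buggy `target in q_dq_map` check, the first call would pop `QCOM_QUANT_ATTRS` and return `"uint16"`, which is exactly the metadata corruption the comment describes.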
q_dq_mapchanges_create_node()behavior: it checksif target in self.q_dq_mapto decide when to popQCOM_QUANT_ATTRSand castmeta['val']to the quantized dtype. After this change, inserted dequantize nodes (e.g.dequantize_per_tensor.tensor/dequantize_per_channel.default) will now satisfy that condition, causing theirmeta['val']dtype to be incorrectly cast to the quantized dtype and movingQCOM_QUANT_ATTRSoff the original node. The special-case should apply only to quantize ops; consider switching the check toif target in q_ops(or an explicit quantize-op set) so output dequant nodes keep floatmeta['val']and don’t steal the original node’s quant metadata.