@mergennachin mergennachin commented Feb 3, 2026

model.pte is 719.2 MB

Runtime output:

(executorch_dev) mnachin@mnachin-mbp executorch % ./cmake-out/examples/models/parakeet/parakeet_runner \
  --model_path examples/models/parakeet/parakeet_quantized_xnnpack/model.pte \
  --audio_path output.wav \
  --tokenizer_path examples/models/parakeet/parakeet_quantized_xnnpack/tokenizer.model
I tokenizers:regex.cpp:27] Registering override fallback regex
E tokenizers:hf_tokenizer.cpp:82] Error parsing json file: [json.exception.parse_error.101] parse error at line 2, column 1: syntax error while parsing value - invalid literal; last read: '<U+000A><U+000E>'
E tokenizers:tiktoken.cpp:59] invalid tiktoken line:
Transcribed text: mister Quilter is the apostle of the middle classes, and we are glad to welcome his gospel. Nor is Mr. Quilter's manner less interesting than his matter. He tells us that at this festive season of the year, with Christmas and roast beef looming before us, similes drawn from eating and its results occur most readily to the mind. He has grave doubts whether Sir Frederick Leighton's work is really Greek after all, and can discover

Segment timestamps:
0.24s - 5.6s : mister Quilter is the apostle of the middle classes, and we are glad to welcome his gospel.
6.24s - 6.96s : Nor is Mr.
7.12s - 10.24s : Quilter's manner less interesting than his matter.
11.04s - 22.96s : He tells us that at this festive season of the year, with Christmas and roast beef looming before us, similes drawn from eating and its results occur most readily to the mind.
23.44s - 29.76s : He has grave doubts whether Sir Frederick Leighton's work is really Greek after all, and can discover

@mergennachin mergennachin requested a review from lucylq as a code owner February 3, 2026 22:58
Copilot AI review requested due to automatic review settings February 3, 2026 22:58

pytorch-bot bot commented Feb 3, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17175

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 22e744b with merge base eee5d96:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Feb 3, 2026

github-actions bot commented Feb 3, 2026

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Copilot AI left a comment

Pull request overview

This PR adds support for dynamic quantization (8da4w: 8-bit dynamically quantized activations with 4-bit weights) on the XNNPACK backend for the Parakeet TDT speech recognition model.

Changes:

  • Adds the HQQ (Half-Quadratic Quantization) scale-only algorithm to the 8da4w quantization configuration
  • Enables XNNPACK backend to handle both dynamically quantized operations and remaining floating-point operations using dual partitioners
  • Adds documentation and examples for using dynamic quantization with XNNPACK
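
To make the "scale-only" grouped weight quantization mentioned above more concrete, here is a minimal sketch in plain Python. It is not the HQQ algorithm and not code from this PR: it uses simple absmax scaling, whereas an HQQ-style method would further refine each group's scale. The function names and the group size are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

QMIN, QMAX = -8, 7  # signed 4-bit integer range


def quantize_group_scale_only(group: List[float]) -> Tuple[List[int], float]:
    """Symmetric 4-bit quantization of one weight group using a single scale."""
    absmax = max(abs(w) for w in group)
    # Assumption: plain absmax scaling, no zero-point ("scale-only").
    scale = absmax / QMAX if absmax > 0 else 1.0
    q = [max(QMIN, min(QMAX, round(w / scale))) for w in group]
    return q, scale


def quantize_row(row: List[float], group_size: int) -> List[Tuple[List[int], float]]:
    """Split a weight row into fixed-size groups and quantize each independently."""
    assert len(row) % group_size == 0, "row length must be a multiple of group_size"
    return [
        quantize_group_scale_only(row[start:start + group_size])
        for start in range(0, len(row), group_size)
    ]


def dequantize_row(groups: List[Tuple[List[int], float]]) -> List[float]:
    """Reconstruct approximate float weights from (codes, scale) pairs."""
    return [code * scale for codes, scale in groups for code in codes]


# Two groups with very different magnitudes each get their own scale.
row = [0.7, -0.3, 0.1, 0.0, 14.0, -6.0, 2.0, 4.0]
groups = quantize_row(row, group_size=4)
restored = dequantize_row(groups)
```

Because each group carries its own scale, the small-magnitude weights in the first group are not crushed to zero by the large values in the second group, which is the usual motivation for grouped quantization.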

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| examples/models/parakeet/quantize.py | Adds intx_choose_qparams_algorithm="hqq_scale_only" parameter to 8da4w quantization config for improved quantization quality with grouped quantization |
| examples/models/parakeet/export_parakeet_tdt.py | Imports and uses XnnpackDynamicallyQuantizedPartitioner alongside XnnpackPartitioner for handling dynamic quantization ops; enables quantization fusion and constant propagation |
| examples/models/parakeet/README.md | Adds documentation and example command for using 8da4w dynamic quantization with XNNPACK backend |
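
The dual-partitioner arrangement described above can be pictured with a toy model. This is not ExecuTorch code and does not use its API: `Node`, `assign_partitions`, and the set of supported ops are hypothetical. The sketch assumes partitioners are tried in order and that a node claimed by an earlier partitioner is left alone by later ones.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Node:
    name: str
    op: str
    dynamically_quantized: bool = False


Claim = Callable[[Node], bool]


def claims_dynamic_quant(node: Node) -> bool:
    """Stand-in for a partitioner that only takes dynamically quantized linears."""
    return node.op == "linear" and node.dynamically_quantized


def claims_floating_point(node: Node) -> bool:
    """Stand-in for a general partitioner; the supported-op set is made up."""
    return node.op in {"linear", "conv", "add", "mul"}


def assign_partitions(
    nodes: List[Node], partitioners: List[Tuple[str, Claim]]
) -> Dict[str, str]:
    """Give each node to the first partitioner that claims it, else a fallback."""
    owner: Dict[str, str] = {}
    for node in nodes:
        owner[node.name] = "unpartitioned"  # no partitioner wanted this node
        for label, claim in partitioners:
            if claim(node):
                owner[node.name] = label
                break
    return owner


graph = [
    Node("encoder_proj", "linear", dynamically_quantized=True),
    Node("residual_add", "add"),
    Node("decoder_fp_linear", "linear"),
    Node("custom_lookup", "embedding_bag"),
]
owners = assign_partitions(
    graph,
    [("dynamic_quant", claims_dynamic_quant), ("floating_point", claims_floating_point)],
)
```

Listing the dynamic-quantization stand-in first matters in this toy model: if the general one ran first, it would also claim the quantized linear node, since it accepts every `linear` op.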
