[magpietts] added multiple validation dataloaders and log metrics per val data #15348
…VIDIA-NeMo#15189) * added multiple validation dataloaders and log metrics per val data. * Apply suggestion from @XuesongYang Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> * Apply suggestion from @Copilot Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> * Apply suggestion from @Copilot Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> * Apply suggestion from @Copilot Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> --------- Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ation to on_validation_epoch_end. Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Pull request overview
Adds support for validating MagpieTTS on multiple datasets (multiple validation dataloaders) while improving how media artifacts (audio + attention visualizations) are prepared and logged to W&B/TensorBoard, and updates the example Lhotse config to the new dataset configuration structure.
Changes:
- Refactors validation media logging by separating data preparation (numpy arrays) from logger-specific emission (W&B/TB objects).
- Adds multi-dataloader validation support, including per-dataloader metric aggregation and an averaged validation loss for checkpointing.
- Updates the MagpieTTS Lhotse example config to remove the `dataset:` nesting and introduce a `validation_ds.datasets` list format.
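As a rough illustration of the per-dataloader aggregation described above, the sketch below groups validation losses by dataset and derives one averaged scalar for checkpoint monitoring. It is not the PR's implementation: the function name `aggregate_validation`, the metric keys (`val/<name>/loss`, `val_loss`), and the choice of an unweighted mean across dataloaders are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def aggregate_validation(step_outputs):
    """Aggregate (dataloader_name, loss) pairs collected over a validation epoch.

    Hypothetical helper: names and metric keys are illustrative only.
    """
    per_loader = defaultdict(list)
    for name, loss in step_outputs:
        per_loader[name].append(loss)

    # One metric per validation dataset.
    metrics = {f"val/{name}/loss": mean(losses) for name, losses in per_loader.items()}

    # Assumption: unweighted mean across dataloaders, so every dataset counts
    # equally regardless of how many batches it produced.
    averaged = mean(list(metrics.values()))
    metrics["val_loss"] = averaged
    return metrics


if __name__ == "__main__":
    outputs = [("en", 1.0), ("en", 3.0), ("es", 4.0)]
    print(aggregate_validation(outputs))
```

An unweighted mean keeps a small dataset from being drowned out by a large one; a batch-weighted mean would be the alternative if overall sample-level loss matters more.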
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| nemo/collections/tts/models/magpietts.py | Implements multi-validation-dataloader handling, refactors media logging, and adjusts Lhotse dataloader config expectations. |
| examples/tts/conf/magpietts/magpietts_lhotse.yaml | Updates example configuration to match the new train/validation dataset config structure and multi-val datasets list format. |
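For anyone migrating an existing config, removing the `dataset:` nesting amounts to lifting the nested keys one level up. The sketch below shows that idea on a plain Python dict. The helper `flatten_dataset_nesting` and the keys in the usage example (`use_lhotse`, `batch_size`, `num_workers`) are placeholders, not taken from the PR, and the sketch does not attempt the `validation_ds.datasets` list conversion.

```python
import copy


def flatten_dataset_nesting(ds_cfg):
    """Return a copy of ds_cfg with keys under 'dataset' moved up one level.

    Hypothetical migration helper; not part of NeMo.
    """
    cfg = copy.deepcopy(ds_cfg)
    nested = cfg.pop("dataset", {})
    for key, value in nested.items():
        # Keep an existing top-level value rather than silently overwriting it.
        cfg.setdefault(key, value)
    return cfg


if __name__ == "__main__":
    # Placeholder keys for illustration only.
    old_train_ds = {"use_lhotse": True, "dataset": {"batch_size": 8, "num_workers": 2}}
    print(flatten_dataset_nesting(old_train_ds))
```

The input dict is left unmodified, so the old and new layouts can be compared side by side before committing to the change.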
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
…exists in val ds config Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Summary
Details
The YAML config for the validation datasets looks like the following, which generalizes readily to multi-language datasets.
For the W&B log, see https://wandb.ai/aiapps/debug_magpieTTS_EN_2509/runs/bqerks4y?nw=nwuserxuesong_yang
The model YAML config entries that were previously under `train_ds.dataset` are now directly under `train_ds`. The configuration structure for `validation_ds` changes: