Release v2.5.1: Important bug fix and slight feature enhancement · openvpi/DiffSinger

Bug fixes

We are sorry that some features of multi-dictionary support have never worked correctly since its release. The previous preprocessing code had a bug in collecting cross-lingual phonemes: it unexpectedly marked almost all phonemes as "not merged", so every language ID was zero. As a result, the language embedding, which was designed to distinguish phonemes from different languages within each merged group, was never trained. Worse, the code inside the ONNX model handled language IDs correctly, so what it actually embedded at inference were vectors that had never been updated since their random initialization. We cannot determine what negative impact this long-standing bug had on the models, but luckily they "seemed" to work well.

This bug has now been fixed. New models with consistent training and inference showed no problems in internal tests, along with some (unconfirmed) improvements in cross-lingual pitch prediction.
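The failure mode described above can be illustrated with a minimal sketch. This is not the actual DiffSinger preprocessing code: the `LANG_IDS` table, the `lang/phoneme` naming, and both helper functions are hypothetical and exist only to show how an incomplete set of cross-lingual phonemes collapses every language ID to zero.

```python
# Hypothetical illustration only; names and data are not taken from DiffSinger.
LANG_IDS = {"zh": 1, "ja": 2}  # 0 is assumed to mean "no language distinction"

# Assumed merged groups: phonemes from different languages sharing one token.
merged_groups = [{"zh/a", "ja/a"}, {"zh/i", "ja/i"}]


def collect_cross_lingual(groups):
    """Return every phoneme that belongs to some merged group."""
    return set().union(*groups)


def language_id(phoneme, cross_lingual):
    """Non-zero language ID only for phonemes that sit in a merged group."""
    lang = phoneme.split("/")[0]
    return LANG_IDS[lang] if phoneme in cross_lingual else 0


sequence = ["zh/a", "zh/sh", "ja/i"]

# Intended behaviour: merged phonemes carry their language ID.
correct = [language_id(p, collect_cross_lingual(merged_groups)) for p in sequence]

# Behaviour resembling the bug: nothing is recognised as merged,
# so the language embedding only ever sees index 0 during training.
buggy = [language_id(p, set()) for p in sequence]

print(correct, buggy)
```

If training only ever feeds index 0 while inference feeds the real IDs, the embedding rows looked up at inference have received no gradient updates, which matches the mismatch described in this release.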

Other small bug fixes:

pitch_r2 metric was not working in 2.5.0
RoPE cache issue about find_unused_parameters in DDP training (#244)
Some variables unexpectedly got float64 dtype with NumPy 2.x
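The release does not say which variables were affected by the NumPy 2.x issue. As a generic sketch of how this kind of change can arise: under the NumPy 2.x promotion rules, combining a float32 array with a float64 NumPy scalar yields float64, whereas NumPy 1.x kept float32. The array contents and the `to_float32` helper below are hypothetical; an explicit cast is one way to keep the dtype stable on both versions.

```python
import numpy as np


def to_float32(x):
    """Return x as a float32 array regardless of how NumPy promoted it."""
    return np.asarray(x, dtype=np.float32)


f0 = np.array([220.0, 440.0], dtype=np.float32)  # hypothetical float32 feature
scale = np.float64(0.5)                          # a float64 NumPy scalar

scaled = f0 * scale        # float64 on NumPy 2.x, float32 on NumPy 1.x
fixed = to_float32(scaled)  # float32 on both versions

print(scaled.dtype, fixed.dtype)
```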

Other changes and improvements

The binarizers now accept FLAC format (WAV is still preferred)
Default vocoder package in generated dsconfig.yaml is switched to pc_nsf_hifigan_44.1k_hop512_128bin_2025.02

See full change log: v2.5.0...v2.5.1
