Releases: intunist/nnsvs-english-support
v0.4.0: Cleanup, phoneme changes, and a (good) Dictionary
v0.4.0 overhauls a few things and cleans up a lot. This update breaks compatibility with older versions, but updating to the new version is fairly easy!
- Our custom [j] and [h] phonemes have been reverted to the standard Arpabet [jh] and [hh] phonemes. This is the one breaking change and should be the last phonetic change unless we add phonemes.
- The extra/unused phonemes have been removed from the hed files to reduce the amount of VRAM needed for training.
- There is now a (GOOD) dictionary included! This dictionary is based on amepd, a modified CMU dict, by Reece H. Dunn.
The dictionary will be a constant work in progress to make it more accurate for singing. We found a lot of weird quirks in it but it's a massive first step in getting a good reference for new users.
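For anyone updating older labels, the [j] to [jh] and [h] to [hh] rename can be sketched roughly as below. This is only a sketch: it assumes mono labels with one `start end phoneme` entry per line, and the names `RENAMES`, `migrate_label_line`, and `migrate_labels` are made up for illustration and are not part of this repo. Full-context labels or other formats would need different handling.

```python
# Old custom phonemes mapped to the standard Arpabet symbols used from v0.4.0.
RENAMES = {"j": "jh", "h": "hh"}


def migrate_label_line(line: str) -> str:
    """Rename the phoneme in one 'start end phoneme' label line.

    Assumption: labels are whitespace-separated with exactly three fields.
    Lines that do not match that shape are returned unchanged.
    """
    parts = line.split()
    if len(parts) != 3:
        return line
    start, end, phoneme = parts
    return f"{start} {end} {RENAMES.get(phoneme, phoneme)}"


def migrate_labels(text: str) -> str:
    """Apply the rename to every line of a label file's contents."""
    return "\n".join(migrate_label_line(line) for line in text.splitlines())


if __name__ == "__main__":
    old = "0 1000000 h\n1000000 2000000 ae\n2000000 3000000 j"
    print(migrate_labels(old))
```

Back up your labels before running anything like this, and remember that the dictionary and config may also need updating.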
v0.3.1: removed [eng] phoneme
v0.3.1 removes the [eng] phoneme from the list of supported phonemes.
It ended up being useless in all cases as [ih][ng] could (and should) be used instead.
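If you have phoneme sequences (for example, dictionary entries) that still use [eng], the replacement could look something like this. The function name `expand_eng` is hypothetical, and this only covers plain phoneme lists. Timed labels need a new boundary between [ih] and [ng], so those are better fixed by hand.

```python
def expand_eng(phonemes):
    """Return a copy of the phoneme list with each 'eng' replaced by 'ih', 'ng'."""
    out = []
    for p in phonemes:
        if p == "eng":
            # One removed phoneme becomes two supported ones.
            out.extend(["ih", "ng"])
        else:
            out.append(p)
    return out


if __name__ == "__main__":
    print(expand_eng(["s", "eng"]))
```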
v0.3.0: rrrrrrrr and disposable phonemes
v0.3.0 adds three new phonemes, [rr], [rx], and [ol], adding more flexibility when labeling datasets.
- [rr] is the trilled r, like in Spanish.
- [rx] is a fricative r, like in German or French.
- [ol] is a particularly useful phoneme. It's intended for labeling out-of-language or "junk" sounds. This allows you to label around sounds you don't want affecting your models.
We are also now working on a shiro model for auto-labeling datasets, but that's a ways off!
Be sure to update the in_dim values in your config!
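As a rough sketch of what that edit amounts to, the snippet below rewrites every `in_dim: <number>` line in a config's text. It assumes the config is YAML with `in_dim` on its own line, and `set_in_dim` is a made-up helper name. The number in the usage example is a placeholder only; take the real values from the README, and note that different model configs may need different values.

```python
import re


def set_in_dim(config_text: str, new_in_dim: int) -> str:
    """Replace the value of every 'in_dim: <int>' line, keeping indentation."""
    pattern = re.compile(r"^([ \t]*in_dim:[ \t]*)\d+", flags=re.MULTILINE)
    return pattern.sub(lambda m: f"{m.group(1)}{new_in_dim}", config_text)


if __name__ == "__main__":
    example = "netG:\n  in_dim: 100\n  out_dim: 5\n"
    # 123 is a placeholder, not a real value for any version.
    print(set_in_dim(example, 123))
```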
v0.2.4: voiced release, documentation, and dictionary improvements
v0.2.4 adds a new phoneme, [axh]. This is similar to the [exh] phoneme but is for labeling voiced releases and exhales.
Along with this update:
- All the documentation was updated for improved clarity.
- The dictionary has been improved further, almost to the point of being useful.
- Square brackets in the dictionary have been replaced with angle brackets (<>) for compatibility reasons.
Be sure to change the in_dim value in your config files.
v0.2.2: Updated Dictionary
v0.2.2 updates the tables to be more useful. You will still need to train with blank.table, but english/english1.table can be used afterwards on the finished model if desired. Note that ENUNU doesn't support multi-syllable words at the time of writing.
Some documentation was also updated to make more sense.
v0.2.1: suffix support
v0.2.1 adds support for suffixed phonemes! ...with some caveats.
Note that the additional phonemes in the suffixed hed file increase the amount of VRAM/RAM required for training, so you may wish to omit suffixes you don't need.
Additional changes:
- Added the [cr] phoneme for labeling vocal cracks, if desired.
- in_dim values are different from previous versions. Refer to README.md.
- The [cl] phoneme is now treated as a "toggle" for when the closure state of a consonant in a cluster isn't the vocalist's natural default. You are still unlikely to need it.