Commit a017147

Add descriptions for augmentation scaling algorithm
1 parent 2b7ebd2 commit a017147

1 file changed

+6
-1
lines changed

pipelines/no_midi_preparation.ipynb

Lines changed: 6 additions & 1 deletion
@@ -1052,7 +1052,7 @@
 "\n",
 "##### `random_time_stretching`\n",
 "\n",
-"Once enabled, the speed of your data will be randomly changed when preprocessing. The ratio of the speed change will be emebedded into the networks, which allows you to control the frame-level speed or velocity (similar to but much more flexible than the VEL parameter in VOCALOID) at inference time. In other words, by applying global time stretching at training time, you gain the ability to apply local time stretching at inference time. This can be used to adjust the texture of consonants and the ratio of different parts of vowels. **Some audio segments will be longer after this augmentation is applied. Please be careful of your batch size and your GPU memory usage.**\n",
+"Once enabled, the speed of your data will be randomly changed when preprocessing. The ratio of the speed change will be embedded into the networks, which allows you to control the frame-level speed or velocity (similar to but much more flexible than the VEL parameter in VOCALOID) at inference time. In other words, by applying global time stretching at training time, you gain the ability to apply local time stretching at inference time. This can be used to adjust the texture of consonants and the ratio of different parts of vowels. **Some audio segments will be longer after this augmentation is applied. Please be careful of your batch size and your GPU memory usage.**\n",
 "\n",
 "This type of augmentation accepts the following arguments:\n",
 "- `range` controls the range of the speed changing ratio.\n",
@@ -1061,6 +1061,11 @@
 "\n",
 "$ D_{augmentation} \\approx (1 + scale \\cdot \\frac{1}{b - a} \\cdot \\int_{a}^{b} f(x) dx) \\cdot D_{original} $ , where $ a, b $ represent the range of the speed ratio and $ f(x) $ represents the PDF of the speed ratio.\n",
 "\n",
+"---\n",
+"> When there are more than two types of augmentation enabled, a cascade and joint augmentation scaling algorithm is applied. In brief, the following rules hold after multiple types of augmentation are applied and combined:\n",
+"> 1. The number of data pieces with the $ k $th augmentation applied will be $ scale_{k} $ times the number of those without the $ k $th augmentation.\n",
+"> 2. The number of data pieces with at least one type of augmentation applied will be $ \\sum_{i = 1}^{n} scale_{i} $ times the number of those without any augmentation (purely raw data).\n",
+"\n",
 "#### 4.2.3 Training and validating\n",
 "\n",
 "##### `test_prefixes`\n",

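The two rules in the added note can be checked with simple arithmetic. The sketch below is not the project's scaling algorithm. It only computes the piece counts that the two rules imply, assuming that "those not applied with the $ k $th augmentation" in rule 1 means every piece in the final dataset without augmentation k, including pieces that carry other augmentations. The function name `implied_counts` and its parameters are hypothetical.

```python
def implied_counts(n_raw, scales):
    """Piece counts implied by the two stated rules.

    Hypothetical helper for illustration only, not the project's code.
    n_raw  -- number of raw (unaugmented) data pieces
    scales -- one `scale` value per enabled augmentation type
    """
    # Rule 2: pieces with at least one augmentation = sum(scales) * raw pieces.
    n_augmented = sum(scales) * n_raw
    n_total = n_raw + n_augmented
    # Rule 1 (under the assumption stated above):
    #   with_k = scale_k * (n_total - with_k)
    #   =>  with_k = scale_k / (1 + scale_k) * n_total
    with_k = [s / (1 + s) * n_total for s in scales]
    return n_total, n_augmented, with_k


# Two augmentation types, each with scale 1.0, on 100 raw pieces.
n_total, n_augmented, with_k = implied_counts(100, [1.0, 1.0])
print(n_total, n_augmented, with_k)
```

With two augmentations the per-augmentation counts sum to more than the augmented total, so under this reading some pieces must carry both augmentations, which is consistent with the note describing the scheme as cascaded.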
0 commit comments
