|
1076 | 1076 | "\n", |
1077 | 1077 | "These two parameters jointly determine the batch size at training time: the former is the maximum number of frames in one batch, and the latter limits the maximum batch size. Larger batches consume more GPU memory at training time, so adjust these values according to your GPU memory. Do not set them too low, because the model may not converge with small batches.\n", |
1078 | 1078 | "\n", |
1079 | | - "##### `lr` and `decay_steps`\n", |
| 1079 | + "##### `lr`, `decay_steps`, `gamma`\n", |
1080 | 1080 | "\n", |
1081 | | - "These two values refer to the learning rate and number of steps everytime the learning rate decays. If you decreased your batch size, you may consider using a smaller learning rate and more decay steps.\n", |
| 1081 | + "The learning rate starts at `lr` and decays by a factor of `gamma` every `decay_steps` steps during training. If you decrease your batch size, consider using a smaller learning rate and more decay steps, or a larger `gamma`.\n", |
1082 | 1082 | "\n", |
1083 | 1083 | "##### `val_check_interval`, `num_ckpt_keep` and `max_updates`\n", |
1084 | 1084 | "\n", |
|
1137 | 1137 | "\n", |
1138 | 1138 | "lr = 0.0004\n", |
1139 | 1139 | "decay_steps = 50000\n", |
| 1140 | + "gamma = 0.5\n", |
1140 | 1141 | "\n", |
1141 | 1142 | "val_check_interval = 2000\n", |
1142 | 1143 | "num_ckpt_keep = 5\n", |
|
1185 | 1186 | " 'max_sentences': max_sentences,\n", |
1186 | 1187 | " 'lr': lr,\n", |
1187 | 1188 | " 'decay_steps': decay_steps,\n", |
| 1189 | + " 'gamma': gamma,\n", |
1188 | 1190 | " 'val_check_interval': val_check_interval,\n", |
1189 | 1191 | " 'num_valid_plots': min(10, len(test_prefixes)),\n", |
1190 | 1192 | " 'num_ckpt_keep': num_ckpt_keep,\n", |
|
1411 | 1413 | "\n", |
1412 | 1414 | "lr = 0.0004\n", |
1413 | 1415 | "decay_steps = 50000\n", |
| 1416 | + "gamma = 0.5\n", |
1414 | 1417 | "\n", |
1415 | 1418 | "val_check_interval = 2000\n", |
1416 | 1419 | "num_ckpt_keep = 5\n", |
|
1485 | 1488 | " 'max_sentences': max_sentences,\n", |
1486 | 1489 | " 'lr': lr,\n", |
1487 | 1490 | " 'decay_steps': decay_steps,\n", |
| 1491 | + " 'gamma': gamma,\n", |
1488 | 1492 | " 'val_check_interval': val_check_interval,\n", |
1489 | 1493 | " 'num_valid_plots': min(20, len(test_prefixes)),\n", |
1490 | 1494 | " 'num_ckpt_keep': num_ckpt_keep,\n", |
|
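To make the interaction of `lr`, `decay_steps` and `gamma` concrete, here is a minimal sketch of the schedule the notebook text describes. It assumes a plain step decay, where the learning rate is multiplied by `gamma` once every `decay_steps` updates; the scheduler actually used by the training code may differ in detail. The helper name `stepped_lr` is hypothetical and exists only for this illustration.

```python
def stepped_lr(step: int, lr: float = 0.0004,
               decay_steps: int = 50000, gamma: float = 0.5) -> float:
    """Learning rate after `step` updates under an ASSUMED step-decay schedule.

    Defaults mirror the values set in the notebook cells of this diff.
    """
    # Integer division counts how many full decay intervals have elapsed.
    num_decays = step // decay_steps
    return lr * gamma ** num_decays


if __name__ == "__main__":
    # Print the learning rate just before and at each decay boundary.
    for step in (0, 49_999, 50_000, 100_000, 150_000):
        print(f"step {step:>7}: lr = {stepped_lr(step):.6g}")
```

Under this assumption, raising `gamma` toward 1 or raising `decay_steps` both slow the decay, which matches the advice given for smaller batch sizes.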