Batch size and training steps #3

@jhao6

Description

Hi,

I have two questions regarding your code:

  1. What is the batch size for pretraining? It is set to 512 in Table 5 of the original paper, but in the code it is 64, with a micro batch size of 16.
  2. How many pretraining steps are taken for Table 1: 10k or 50k? If more pretraining steps are used, e.g. 50k, how many of them are spent in the first pretraining stage on the randomly selected data, and how many in the final pretraining stage on the selected data?

Hope you can help me figure these out. Thank you very much.
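One possible reconciliation of the 512 vs. 64 discrepancy (this is an assumption, not confirmed by the repository): the paper may report the *global* batch size across data-parallel workers, while the code's 64 is the per-process batch built from micro-batches via gradient accumulation. A minimal sketch of that arithmetic, with hypothetical variable names and a hypothetical GPU count:

```python
# Hypothetical sketch: how a global batch size is typically assembled from
# micro-batches via gradient accumulation and data parallelism.
# All names and the GPU count are illustrative, not taken from this repo.

micro_batch_size = 16   # samples per forward/backward pass (value from the code)
grad_accum_steps = 4    # 64 / 16: accumulation steps implied by the config
batch_size = micro_batch_size * grad_accum_steps  # per-process batch: 64

num_gpus = 8            # hypothetical; chosen so 64 * 8 matches Table 5
global_batch_size = batch_size * num_gpus         # 512, the paper's figure

print(batch_size, global_batch_size)
```

If the training script uses data parallelism with 8 workers, the two numbers would be consistent; otherwise the configs genuinely differ.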
