What's the learning rate of FLatten-DeiT? #35

@atti0127

Is it 5e-4 or 1e-3?

The paper states, "The basic learning rate for a batch size of 1024 is set to 1 × 10^(-3)". If training follows the DeiT recipe, is the true learning rate at batch size 1024 then 2 × 10^(-3)?
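
To make the two readings concrete, here is a minimal sketch of the arithmetic. It assumes the DeiT-style linear scaling rule, where the configured learning rate is multiplied by the total batch size and divided by a reference batch size of 512. The helper name `scaled_lr` is hypothetical and is not taken from the FLatten-Transformer code; whether that repository actually applies this scaling is exactly what I am asking.

```python
def scaled_lr(base_lr: float, total_batch_size: int, reference_batch_size: int = 512) -> float:
    """Linear scaling rule (assumed, DeiT-style): lr grows proportionally with batch size."""
    return base_lr * total_batch_size / reference_batch_size


# Reading 1: the configured value is 5e-4, so batch size 1024 gives about 1e-3.
lr_reading_1 = scaled_lr(5e-4, 1024)

# Reading 2: the configured value is 1e-3, so batch size 1024 gives about 2e-3.
lr_reading_2 = scaled_lr(1e-3, 1024)

print(f"configured 5e-4 -> effective {lr_reading_1:.1e}")
print(f"configured 1e-3 -> effective {lr_reading_2:.1e}")
```

Under reading 1 the paper's 1 × 10^(-3) would already be the scaled value; under reading 2 it would be the unscaled value and the effective rate would be twice as large.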
