
Using AIMV2 as the encoder, unfreezing it and setting the learning rate to 2e-6 results in the LLaVA-NEXT model reaching a loss of 0 #21

Description

@1359347500cwc

When using AIMV2 as the encoder, unfreezing it and setting the learning rate to 2e-6 leads to the LLaVA-NEXT model reaching a loss of 0 after 3000-4000 steps. The original paper kept the encoder frozen. Why is it not recommended to unfreeze it for training? If I decide to unfreeze it, what learning rate should I set?
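For context, here is a minimal sketch of the setup I mean: unfreezing the vision encoder and giving it its own, smaller learning rate via a separate optimizer parameter group. The attribute names (`model.vision_tower`, etc.) are illustrative assumptions, not necessarily this repo's actual API:

```python
import torch

# Hypothetical sketch: `model.vision_tower` is an assumed attribute name for
# the AIMV2 encoder inside a LLaVA-style model, not this repo's actual API.
def build_optimizer(model, base_lr=2e-5, encoder_lr=2e-6):
    # Unfreeze the vision encoder.
    for p in model.vision_tower.parameters():
        p.requires_grad = True

    encoder_params = list(model.vision_tower.parameters())
    encoder_ids = {id(p) for p in encoder_params}
    other_params = [
        p for p in model.parameters()
        if p.requires_grad and id(p) not in encoder_ids
    ]

    # Two parameter groups: a much smaller learning rate for the encoder
    # (2e-6 here) than for the projector/LLM parameters.
    return torch.optim.AdamW(
        [
            {"params": other_params, "lr": base_lr},
            {"params": encoder_params, "lr": encoder_lr},
        ],
        weight_decay=0.0,
    )
```

Separate parameter groups are the standard PyTorch mechanism for applying different learning rates to different submodules, which is why I set the encoder's rate this way.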
