
Possibly low F1 when finetuning BERT base #44

@dpfried

Hi Mandar,

When I finetune BERT base, I get an OntoNotes dev F1 of 73.69. I was wondering if this is within the variance that you saw for BERT base, or could there be some problem with my setup?

I'm using the package versions from requirements.txt (except with MarkupSafe changed to 1.1.1, per #40, and psycopg2 changed to psycopg2-binary), and am training on a V100 32GB with these commands:

python train.py train_bert_base
python evaluate.py train_bert_base
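For reference, a minimal sketch of the dependency tweaks described above, assuming requirements.txt pins these packages with exact "pkg==version" entries (adjust the patterns to the actual file contents):

# Hypothetical setup helper: patch the two pins mentioned above, then install.
sed -i 's/^MarkupSafe==.*/MarkupSafe==1.1.1/' requirements.txt
sed -i 's/^psycopg2==/psycopg2-binary==/' requirements.txt
pip install -r requirements.txt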

When evaluating your finetuned BERT base model on dev (python evaluate.py bert_base), I get an F1 of 74.05. This is closer to the 74.3 dev F1 number from Table 4, but should it match exactly? I'm wondering if there could be some difference in my setup that affects evaluation slightly but gets magnified during training.

Thanks,
Daniel
