Pretraining with FlexBert but can't export to HF #234

@asafam

Hi @warner-benjamin and team,
Thank you for your fantastic work on ModernBERT and FlexBert!

I’ve trained a FlexBert model using your codebase and I’m now trying to convert it to Hugging Face Transformers format for easier sharing and downstream use. I found the convert_to_hf.py script referenced in some previous issues and used it, along with the model definitions from this YAML.

The model loads correctly with AutoModel and AutoTokenizer, but when I run a forward pass, the outputs are all NaN.

import torch
from transformers import AutoModel, AutoTokenizer

model_path = "</path/to/model/from/convert_to_hf>"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModel.from_pretrained(model_path)
inputs = tokenizer("Hello world", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)  # <- returns all NaN
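
One way to narrow this down is to check whether the converted weights themselves already contain NaN or Inf values, as opposed to the NaNs appearing only during the forward pass. The following is a minimal sketch, not part of the ModernBERT codebase: the helper name `find_nonfinite` is hypothetical, and a small `nn.Linear` stands in for the converted model so the snippet runs on its own.

```python
import torch
from torch import nn


def find_nonfinite(named_tensors):
    """Return the names of floating-point tensors that contain NaN or Inf."""
    bad = []
    for name, tensor in named_tensors:
        # Integer buffers (e.g. position ids) cannot hold NaN, so skip them.
        if tensor.is_floating_point() and not torch.isfinite(tensor).all():
            bad.append(name)
    return bad


# Stand-in module for illustration only; replace with the converted model.
demo = nn.Linear(4, 2)
with torch.no_grad():
    demo.weight[0, 0] = float("nan")  # simulate a corrupted weight

print(find_nonfinite(demo.named_parameters()))  # prints ['weight']
```

On the real model, the call would be `find_nonfinite(model.named_parameters())`. If that list comes back empty, a possible next step is loading with `AutoModel.from_pretrained(model_path, output_loading_info=True)`, which returns the model together with a dictionary that includes `missing_keys` and `unexpected_keys`, to see whether some weights were never mapped by the conversion and were left at their initial values.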

Do you have any advice or best practices for converting models to HF format, or tips on what might be causing this issue? Any guidance would be greatly appreciated!

Thanks in advance!
