Hi @warner-benjamin and team,
Thank you for your fantastic work on ModernBERT and FlexBert!
I’ve trained a FlexBert model using your codebase and I’m now trying to convert it to Hugging Face Transformers format for easier sharing and downstream use. I found the convert_to_hf.py script referenced in some previous issues and used it, along with the model definitions from this YAML.
The model loads correctly with AutoModel and AutoTokenizer, but when I run a forward pass, the outputs are all NaN.
import torch
from transformers import AutoModel, AutoTokenizer

model_path = "</path/to/model/from/convert_to_hf>"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModel.from_pretrained(model_path)
inputs = tokenizer("Hello world", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)  # <- returns all NaN
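One check that might help narrow this down is to see whether the converted weights themselves already contain NaN or Inf before any forward pass runs. The following is only a minimal sketch: `find_nonfinite_tensors` is a hypothetical helper, not part of the ModernBERT codebase or Transformers, and the small `nn.Linear` is a stand-in so the snippet runs on its own.

```python
import torch
from torch import nn


def find_nonfinite_tensors(module: nn.Module) -> list[str]:
    """Return names of parameters and buffers that contain NaN or Inf."""
    bad = []
    for name, tensor in module.state_dict().items():
        # Only floating-point tensors can hold NaN/Inf values.
        if tensor.is_floating_point() and not torch.isfinite(tensor).all():
            bad.append(name)
    return bad


# Stand-in model (assumption); replace `demo` with the converted model.
demo = nn.Linear(4, 2)
with torch.no_grad():
    demo.weight[0, 0] = float("nan")  # deliberately corrupt one weight

print(find_nonfinite_tensors(demo))
```

If this returns an empty list for the converted model, the stored weights are finite and the NaNs are being produced during the forward pass rather than loaded from the checkpoint.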
Do you have any advice or best practices for converting models to HF format, or tips on what might be causing this issue? Any guidance would be greatly appreciated!
Thanks in advance!