Hi, thank you for your great work.
Could you please release the setup for benchmarking HuggingFace models? We tested Qwen and Llama (deployed locally) but got poor results: the locally deployed model performed poorly, while the same model called through the API performed well.
Thank you!