WIP add json-schema grammar test from llama.cpp #1
marcnnn wants to merge 3 commits into seanmor5:main
This commit adds tests for all grammars from https://github.com/ggerganov/llama.cpp/blob/master/tests/test-json-schema-to-grammar.cpp
Many of them fail because the regex repetition operator {from,to} does not seem to be implemented. There are some other problems as well.
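For reference, a bounded repetition such as `item{from,to}` can be rewritten using only sequence, grouping, `?` and `*`, which is one possible way to support it in a parser that lacks the operator. The Python sketch below shows that rewrite; the function name `expand_repeat` and the output syntax are illustrative assumptions, not this library's API or the way llama.cpp implements it.

```python
from typing import Optional


def expand_repeat(item: str, min_count: int, max_count: Optional[int]) -> str:
    """Rewrite item{min_count,max_count} without the {m,n} operator.

    Hypothetical helper: `item` is assumed to be a single grammar atom
    (a rule name, a literal, or a character class).
    """
    if min_count < 0 or (max_count is not None and max_count < min_count):
        raise ValueError("invalid repetition bounds")

    # The mandatory part: `item` repeated min_count times.
    parts = [item] * min_count

    if max_count is None:
        # Open upper bound, i.e. item{min_count,}
        parts.append(f"{item}*")
    else:
        # Nest the optional part so each extra item requires the previous one.
        optional = ""
        for _ in range(max_count - min_count):
            optional = f"({item} {optional})?" if optional else f"({item})?"
        if optional:
            parts.append(optional)

    return " ".join(parts)


print(expand_repeat("[0-9]", 1, 3))
```

With these inputs, `[0-9]{1,3}` becomes `[0-9] ([0-9] ([0-9])?)?`. The nested form avoids the ambiguity of three independent optionals, at the cost of output that grows linearly with the upper bound.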
@marcnnn Thank you for including these! I can look at adding
This library is unfinished, but the idea was to parse EBNF grammars into a form that could be used as a state machine during sampling. Nx poses some challenges that make this more difficult, in particular that each input to a generation algorithm needs to be a tensor with a static shape. I haven't had time to look at this problem in a while, but I can assist if it's something you're interested in working on!
I originally took inspiration from a PR in the Hugging Face repo; however, I think there are other approaches worth exploring from vLLM and elsewhere. I am in the EEF ML Slack if you want to ping me there to chat about how we can achieve this.
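To make the state-machine idea concrete, here is a minimal Python sketch of grammar-constrained decoding with a dense transition table. The table has a fixed `[num_states][vocab_size]` shape that is known before generation starts, which is the kind of layout a static-shape requirement suggests. The toy grammar, the vocabulary, and the names `TRANSITIONS`, `mask_logits` and `greedy_decode` are all assumptions for illustration; this is not how this library, Bumblebee, or vLLM implements it.

```python
import math

VOCAB = ["a", "b", "c"]

# Hypothetical grammar: root ::= "a" "b"* "c"
# Rows are states, columns are token ids; -1 marks a disallowed token.
TRANSITIONS = [
    [1, -1, -1],   # state 0: only "a" is allowed
    [-1, 1, 2],    # state 1: "b" loops, "c" moves to the final state
    [-1, -1, -1],  # state 2: accepting, nothing more is allowed
]


def mask_logits(logits, state):
    """Set the logit of every token the grammar forbids in `state` to -inf."""
    row = TRANSITIONS[state]
    return [l if nxt >= 0 else -math.inf for l, nxt in zip(logits, row)]


def greedy_decode(step_logits, state=0):
    """Pick the best allowed token at each step and advance the state."""
    out = []
    for logits in step_logits:
        masked = mask_logits(logits, state)
        token = max(range(len(masked)), key=masked.__getitem__)
        if masked[token] == -math.inf:
            break  # no token is allowed in this state
        out.append(VOCAB[token])
        state = TRANSITIONS[state][token]
    return out, state


# Made-up logits standing in for model output at three steps.
steps = [[0.1, 2.0, 0.5], [0.0, 1.0, 0.5], [3.0, 0.2, 0.9]]
print(greedy_decode(steps))
```

In a tensor setting, the row lookup and the masking would become a gather and a select over arrays of fixed size, with the current state carried as an integer between steps. The dense table trades memory for that fixed shape, which can be a real cost with large vocabularies.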
@seanmor5 Thanks for all your work!
I would love to help with elixir-nx/bumblebee#354, but to understand the difficulties I started here.
The goal is constrained sampling with rich JSON Schema support for tool calling in Elixir.
Since the project has no license, is it okay with you if I add the grammars from the llama.cpp code (MIT License)?