Description
Hey!
Been using your software for some days now and am very happy. It's very thoughtful in many aspects, and every time I think "ah, that won't work, it would need to be implemented", I check your docs and realize "oh, it's there. Nice!". So thanks for your work.
My current setup mixes multiple machines (macOS, Windows and Linux), and therefore I'm mixing multiple backend solutions like llamacpp, ollama and lmstudio - depending on compatibility (Strix Halo is quite picky) and performance (macOS loves MLX, and I'm using LMStudio instead of ollama because there are native MLX model files). The problem I have with this setup is that the model names - even though it's basically the same model - are quite different. llamacpp likes to use the filename of the model, ollama likes to separate the parameter size with a : (e.g. gpt-oss:120b), and lmstudio seems to always use dashes (gpt-oss-120b).
It would be a very nice feature if I could define model name aliases in the olla config.
Like
```yaml
model_alias:
  only_alias: true
  models:
    gpt-oss-120b:
      - gpt-oss:120b
      - gpt-oss-120b
      - ggml-org_gpt-oss-120b-GGUF_gpt-oss-120b-mxfp4-00001-of-00003.gguf
```

so that when I send a /v1/chat/completions request with the gpt-oss-120b model, every node with one of the defined models can be used (depending on priority etc.). In addition, the only_alias: true setting could keep things clean in /v1/models and show only model aliases instead of every model that can be found on any device.
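For illustration, a client request under this proposal could look like the sketch below (Python with the requests library; the Olla host, port and the exact routing behaviour are assumptions on my side, not existing features). The point is that the client only ever sees the alias gpt-oss-120b, regardless of which backend-specific name the chosen node actually uses:

```python
import requests

# Hypothetical example: send a chat completion to Olla using the alias.
# Under the proposed model_alias config, Olla would route this request to
# any healthy node that serves one of the names mapped to gpt-oss-120b
# (the ollama, lmstudio or llamacpp variant), depending on priority etc.
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",  # assumed Olla address
    json={
        "model": "gpt-oss-120b",  # the alias, not the backend-specific name
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```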
Tell me what you think :)