Cannot get execute_services to work no matter which model I've tried, only conversation is possible #326

@zachfeldman

Description

I cannot seem to get this extension to work with any local LLM model I've tried, with "Use Tools" on or off, and with many different prompt variations (though mostly the default prompt).

The models I've tried include:
Mistral-7B-Instruct-v0.2-GGUF
Qwen2.5-7B-Instruct-Q6_K_L
Qwen3-30B-A3B-Q8_0
gemma-2-27B-it-function-calling-Q6_K
Hermes-3-Llama-3.1-8B.Q8_0

Pretty much all of them as .gguf files.

This is the llama-server invocation I am using for the last model:

LLAMA_KV_OVERALLOC=2.0 LLAMA_CHAT_TEMPLATE=qwen:tool_use ./bin/llama-server \
  --host 0.0.0.0 --port 8000 --jinja \
  --chat-template-file <(python ../scripts/get_chat_template.py NousResearch/Hermes-3-Llama-3.1-8B tool_use) \
  -m ~/dev/hermes/Hermes-3-Llama-3.1-8B.Q8_0.gguf \
  -v --log-timestamps
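
One way I've tried to narrow this down is to check whether llama-server itself returns structured tool calls, independent of the extension, via its OpenAI-compatible endpoint. A minimal sketch; the execute_services schema below is a hypothetical stand-in, not the integration's exact tool definition:

# hypothetical stand-in schema; the real integration supplies its own tool definition
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Turn off the kitchen lights"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "execute_services",
        "description": "Call a Home Assistant service",
        "parameters": {
          "type": "object",
          "properties": {
            "domain": {"type": "string"},
            "service": {"type": "string"},
            "entity_id": {"type": "string"}
          },
          "required": ["domain", "service"]
        }
      }
    }]
  }'

If the reply carries the call in choices[0].message.tool_calls, the server side seems fine and the problem is presumably in how the integration parses or dispatches the call.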

I always get a response, sometimes in an XML-like format, sometimes as plain conversation. I've even confirmed with ChatGPT that some of the responses include tool calls, but no matter what, my lights do not turn off, my scenes do not trigger, and so on.
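
For reference, my understanding is that a response the client can act on should carry the call in the structured tool_calls field, roughly like this (a sketch of the OpenAI-style shape, with assumed argument names):

{"choices": [{"message": {
  "content": null,
  "tool_calls": [{
    "type": "function",
    "function": {
      "name": "execute_services",
      "arguments": "{\"domain\": \"light\", \"service\": \"turn_off\", \"entity_id\": \"light.kitchen\"}"
    }
  }]
}}]}

whereas an XML-like <tool_call>...</tool_call> block inside the plain content field is just text to the client; if the chat template or parser doesn't convert it into tool_calls, nothing downstream will ever execute it.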

Appreciate any help or advice anyone has!
