Cannot get execute_services to work no matter which model I've tried, only conversation is possible #326

@zachfeldman

Description

I cannot seem to get this extension to work with any local LLM model I've tried, with "Use Tools" on or off, and with many different prompt variations (though mostly the default prompt).

The models I've tried include:
Mistral-7B-Instruct-v0.2-GGUF
Qwen2.5-7B-Instruct-Q6_K_L
Qwen3-30B-A3B-Q8_0
gemma-2-27B-it-function-calling-Q6_K
Hermes-3-Llama-3.1-8B.Q8_0

Pretty much all of them as .gguf files.

This is the llama-server invocation I am using for the last model:

LLAMA_KV_OVERALLOC=2.0 LLAMA_CHAT_TEMPLATE=qwen:tool_use ./bin/llama-server \
  --host 0.0.0.0 --port 8000 --jinja \
  --chat-template-file <(python ../scripts/get_chat_template.py NousResearch/Hermes-3-Llama-3.1-8B tool_use) \
  -m ~/dev/hermes/Hermes-3-Llama-3.1-8B.Q8_0.gguf \
  -v --log-timestamps
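
One way I've tried to narrow this down is to check whether llama-server itself returns structured tool calls, independent of the extension, via its OpenAI-compatible endpoint. A minimal sketch; the execute_services schema below is a hypothetical stand-in, not the integration's exact tool definition:

# hypothetical stand-in schema; the real integration supplies its own tool definition
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Turn off the kitchen lights"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "execute_services",
        "description": "Call a Home Assistant service",
        "parameters": {
          "type": "object",
          "properties": {
            "domain": {"type": "string"},
            "service": {"type": "string"},
            "entity_id": {"type": "string"}
          },
          "required": ["domain", "service"]
        }
      }
    }]
  }'

If the reply carries the call in choices[0].message.tool_calls, the server side seems fine and the problem is presumably in how the integration parses or dispatches the call.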

I always get a response, sometimes in an XML-like format, sometimes as plain conversation. I've even confirmed with ChatGPT that some of the responses include tool calls, but no matter what, my lights do not turn off, my scenes do not trigger, and so on.
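
For reference, my understanding is that a response the client can act on should carry the call in the structured tool_calls field, roughly like this (a sketch of the OpenAI-style shape, with assumed argument names):

{"choices": [{"message": {
  "content": null,
  "tool_calls": [{
    "type": "function",
    "function": {
      "name": "execute_services",
      "arguments": "{\"domain\": \"light\", \"service\": \"turn_off\", \"entity_id\": \"light.kitchen\"}"
    }
  }]
}}]}

whereas an XML-like <tool_call>...</tool_call> block inside the plain content field is just text to the client; if the chat template or parser doesn't convert it into tool_calls, nothing downstream will ever execute it.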

Appreciate any help or advice anyone has!
