
Trying to use Foundry Local models in agent mode always results in an error 500 on the inference endpoint #333


Description

@mswolters

Any time I try to use a Foundry Local model in agent mode, I get the following response (the backslash missing between <username> and .vscode is not an editing mistake; the output is reproduced verbatim):

Sorry, your request failed. Please try again.
Copilot Request id: <id>
Reason: Unable to call the qwen2.5-coder-7b-instruct-generic-gpu:4inference endpoint due to 500.
Please check if the input or configuration is correct.: Error: Unable to call the qwen2.5-coder-7b-instruct-generic-gpu:4 inference endpoint due to 500. Please check if the input or configuration is correct. 
at t.InferenceError (c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:5331866) 
at v.handleOpenAIError (c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:3378712) 
at v.chatStream (c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:3369931) 
at process.processTicksAndRejections (node:internal/process/task_queues:105:5) 
at async v.chatStream (c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:2420198) 
at async c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:3963627 
at async c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:4571996 
at async e.runWithTelemetry (c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:4571824) 
at async t.ModelApi.provideLanguageModelResponse (c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:3961552)
at async c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:2151102 
at async c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:4571996 
at async e.runWithTelemetry (c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:4571824) 
at async t.AitkModelChatProvider.provideLanguageModelChatResponse (c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:2150842)
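
For anyone triaging: something like the sketch below should reproduce the 500 directly against the local endpoint, independently of the extension. This is an unverified sketch, not a confirmed repro. The port, the /v1/chat/completions path, and the tool schema are my assumptions (Foundry Local exposes an OpenAI-compatible API, and agent mode presumably attaches tool definitions to the request); substitute the port your running service actually reports.

```python
# Sketch: reproduce the failing request outside VS Code.
# Assumptions (not confirmed by this issue): Foundry Local serves an
# OpenAI-compatible API on localhost, PORT matches the running instance,
# and agent mode fails because the request carries tool definitions.
import requests

PORT = 5273  # assumption: replace with the port of your running instance

payload = {
    "model": "qwen2.5-coder-7b-instruct-generic-gpu:4",
    "messages": [{"role": "user", "content": "List the files in src/."}],
    # Agent mode sends tool schemas with the request. If the same request
    # without "tools" succeeds while this one returns 500, that points at
    # tool-call handling in the server.
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "list_files",  # hypothetical tool, for illustration only
                "description": "List the files in a directory.",
                "parameters": {
                    "type": "object",
                    "properties": {"path": {"type": "string"}},
                    "required": ["path"],
                },
            },
        }
    ],
}

resp = requests.post(
    f"http://localhost:{PORT}/v1/chat/completions", json=payload, timeout=120
)
print(resp.status_code)
print(resp.text)
```

If the bare request (no "tools" key) also returns 500, the problem is unrelated to tool calling.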

This appears to have been fixed on Foundry Local's main branch: microsoft/Foundry-Local#336.
The windowsaistudio.openAIInferencePort setting seems to set the port that the extension's own hosted instance uses, rather than connecting to the already running instance, so I'm unable to test against that fix.
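
One way to find the endpoint of the already running instance (rather than guessing at the port) may be the foundry-local-sdk Python package. This is a sketch based on the documented FoundryLocalManager API; treat the alias and attribute names as assumptions, since they may drift between SDK versions:

```python
# Sketch: discover the endpoint of the running Foundry Local service
# via the foundry-local-sdk package (pip install foundry-local-sdk).
# The alias below is an assumption inferred from the model id in the error.
from foundry_local import FoundryLocalManager

# Constructing the manager with an alias starts the service if needed
# and loads the model.
manager = FoundryLocalManager("qwen2.5-coder-7b-instruct")
print(manager.endpoint)  # e.g. http://localhost:<port>/v1
print(manager.get_model_info("qwen2.5-coder-7b-instruct").id)
```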

AI Toolkit extension version: 0.28.0
VS Code version: 1.108.0
Full log:
Log.txt

Labels

bug (Something isn't working), investigating (This issue is under investigation)
