
Trying to use Foundry Local models in agent mode always results in an error 500 on the inference endpoint #333


Description

@mswolters

Any time I try to use a Foundry Local model in agent mode, I get the following response (the backslash missing between <username> and .vscode is not an editing mistake; the output is reproduced verbatim):

Sorry, your request failed. Please try again.
Copilot Request id: <id>
Reason: Unable to call the qwen2.5-coder-7b-instruct-generic-gpu:4inference endpoint due to 500.
Please check if the input or configuration is correct.: Error: Unable to call the qwen2.5-coder-7b-instruct-generic-gpu:4 inference endpoint due to 500. Please check if the input or configuration is correct. 
at t.InferenceError (c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:5331866) 
at v.handleOpenAIError (c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:3378712) 
at v.chatStream (c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:3369931) 
at process.processTicksAndRejections (node:internal/process/task_queues:105:5) 
at async v.chatStream (c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:2420198) 
at async c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:3963627 
at async c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:4571996 
at async e.runWithTelemetry (c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:4571824) 
at async t.ModelApi.provideLanguageModelResponse (c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:3961552)
at async c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:2151102 
at async c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:4571996 
at async e.runWithTelemetry (c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:4571824) 
at async t.AitkModelChatProvider.provideLanguageModelChatResponse (c:\Users\<username>.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.28.0-win32-x64\dist\extension.js:2:2150842)
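
For anyone triaging: something like the sketch below should reproduce the 500 directly against the local endpoint, independently of the extension. This is an unverified sketch, not a confirmed repro. The port, the /v1/chat/completions path, and the tool schema are my assumptions (Foundry Local exposes an OpenAI-compatible API, and agent mode presumably attaches tool definitions to the request); substitute the port your running service actually reports.

```python
# Sketch: reproduce the failing request outside VS Code.
# Assumptions (not confirmed by this issue): Foundry Local serves an
# OpenAI-compatible API on localhost, PORT matches the running instance,
# and agent mode fails because the request carries tool definitions.
import requests

PORT = 5273  # assumption: replace with the port of your running instance

payload = {
    "model": "qwen2.5-coder-7b-instruct-generic-gpu:4",
    "messages": [{"role": "user", "content": "List the files in src/."}],
    # Agent mode sends tool schemas with the request. If the same request
    # without "tools" succeeds while this one returns 500, that points at
    # tool-call handling in the server.
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "list_files",  # hypothetical tool, for illustration only
                "description": "List the files in a directory.",
                "parameters": {
                    "type": "object",
                    "properties": {"path": {"type": "string"}},
                    "required": ["path"],
                },
            },
        }
    ],
}

resp = requests.post(
    f"http://localhost:{PORT}/v1/chat/completions", json=payload, timeout=120
)
print(resp.status_code)
print(resp.text)
```

If the bare request (no "tools" key) also returns 500, the problem is unrelated to tool calling.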

This appears to have been fixed on Foundry Local's main branch: microsoft/Foundry-Local#336.
The windowsaistudio.openAIInferencePort setting seems to set the port that the extension's own hosted instance uses, rather than connecting to the already running instance, so I'm unable to test against that fix.
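
One way to find the endpoint of the already running instance (rather than guessing at the port) may be the foundry-local-sdk Python package. This is a sketch based on the documented FoundryLocalManager API; treat the alias and attribute names as assumptions, since they may drift between SDK versions:

```python
# Sketch: discover the endpoint of the running Foundry Local service
# via the foundry-local-sdk package (pip install foundry-local-sdk).
# The alias below is an assumption inferred from the model id in the error.
from foundry_local import FoundryLocalManager

# Constructing the manager with an alias starts the service if needed
# and loads the model.
manager = FoundryLocalManager("qwen2.5-coder-7b-instruct")
print(manager.endpoint)  # e.g. http://localhost:<port>/v1
print(manager.get_model_info("qwen2.5-coder-7b-instruct").id)
```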

AI Toolkit extension version: 0.28.0
VS Code version: 1.108.0
Full log:
Log.txt

Labels

bug (Something isn't working), investigating (This issue is under investigation)
