Un 3388 Fixed Gemini LLM test case failure #330

Merged
vince-leaf merged 3 commits into main from un-3388_tweak_gemini_running_cost_hallucinate
Jul 25, 2025

Conversation

@vince-leaf
Contributor

This PR addresses the failure of the Gemini LLM smoke test case. In summary, three changes led to this failure:

  1. The fix for duplicated system messages
  2. The test case update to use "structure_formats"
  3. The upgrade of langchain-google-genai from 2.1.6 to 2.1.8

The gemini-2.0-flash model currently used for this test no longer works reliably; it fails 90% of the time. I switched to gemini-2.5-flash and reused the existing prompt from the Anthropic and Ollama test cases.

"llm_config": {
-"model_name": "gemini-2.0-flash",
"class": "gemini",
+"model_name": "gemini-2.5-flash",
Contributor Author

I switched to another model name.


Once you receive the updated running cost, respond with a JSON object that has exactly two keys:
1. "answer" – your full answer to the user’s question.
2. "running_cost" – the updated cost returned by the Accountant tool.
Contributor Author

I replaced the prompt with the one used by the Ollama and Anthropic LLM test cases.

Collaborator

It is great and good that we are trying to use the same prompt across multiple LLMs.
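
For illustration, the response contract stated in the prompt (a JSON object with exactly the keys "answer" and "running_cost") could be checked in a test roughly as follows. This is only a sketch under assumptions: the helper name `parse_reply` and the sample reply are hypothetical and are not taken from this repository's test harness.

```python
import json

# The two keys the prompt asks the model to return.
REQUIRED_KEYS = {"answer", "running_cost"}


def parse_reply(raw: str) -> dict:
    """Parse a model reply and check it has exactly the two expected keys.

    Hypothetical helper for illustration; not part of the PR.
    """
    reply = json.loads(raw)
    if not isinstance(reply, dict):
        raise ValueError("reply must be a JSON object")
    if set(reply) != REQUIRED_KEYS:
        raise ValueError(
            f"expected keys {sorted(REQUIRED_KEYS)}, got {sorted(reply)}"
        )
    return reply


# Made-up reply, used only to exercise the check.
sample = '{"answer": "Paris is the capital of France.", "running_cost": 3.0}'
reply = parse_reply(sample)
print(reply["answer"], reply["running_cost"])
```

A strict equality check on the key set rejects both missing keys and extra keys, which matches the "exactly two keys" wording of the prompt.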

langchain-aws>=0.2.27,<0.3
langchain-community>=0.3.19,<0.4
-langchain-google-genai>=2.0.11,<3.0
+langchain-google-genai>=2.1.8,<3.0
Contributor Author

Bumped up the version.
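
To make the effect of the new pin concrete, the sketch below compares the old and new lower bounds against the two versions mentioned in the PR description. It uses plain tuple comparison on simple dotted versions, which is an assumption made to keep the example dependency-free; the helper names are hypothetical, and pip itself applies the full version specifier rules rather than this simplified check.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a simple dotted version such as '2.1.8' into (2, 1, 8)."""
    return tuple(int(part) for part in text.split("."))


def in_range(version: str, lower: str, upper: str) -> bool:
    """Return True if lower <= version < upper (simplified check)."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


for candidate in ("2.1.6", "2.1.8"):
    old_ok = in_range(candidate, "2.0.11", "3.0")  # old pin: >=2.0.11,<3.0
    new_ok = in_range(candidate, "2.1.8", "3.0")   # new pin: >=2.1.8,<3.0
    print(candidate, "old pin:", old_ok, "new pin:", new_ok)
```

Under the old pin both versions are acceptable, so an environment could resolve to either one; the new pin excludes 2.1.6.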

Collaborator

@d1donlydfink left a comment

LGTM AFAICT

@vince-leaf merged commit 36b22bb into main Jul 25, 2025
4 checks passed
@vince-leaf deleted the un-3388_tweak_gemini_running_cost_hallucinate branch July 25, 2025 00:47