
update:tongagents tau2bench result summit #145

Open
tongagents-tau2 wants to merge 6 commits into sierra-research:main from tongagents-tau2:tongagents-new-pr

@tongagents-tau2

Hi maintainers,

We are submitting this PR to report new evaluation results on Tau2Bench.

Since the previous submission, we have run updated experiments and obtained improved scores under the current benchmark setup. This PR adds the latest results so that they can be tracked and compared with existing entries.

We would be happy to provide additional details or clarifications if needed, and we look forward to your feedback.
Thank you for maintaining Tau2Bench and for considering this update.

Best regards,
tongagents-tau2

update submission description
@victorb-sierra
Collaborator

Thank you for your PR.
cc: @benshi34

@benshi34
Collaborator

Thanks for your submission! Just getting a chance to take a look at this now: are there other links you can provide in the "references" section for your methodology? For example, the toolorchestra submission has a paper associated with it that details its methodology: we standardize on this requirement to prevent overfitting.

@tongagents-tau2
Author

> Thanks for your submission! Just getting a chance to take a look at this now: are there other links you can provide in the "references" section for your methodology? For example, the toolorchestra submission has a paper associated with it that details its methodology: we standardize on this requirement to prevent overfitting.

Thanks for taking the time to review our submission.

We have updated the methodology section accordingly and added a detailed description of our approach. The methodology document is available at the following URL:

https://raw.githubusercontent.com/tongagents-tau2/tau2-bench/refs/heads/tongagents-new-pr/web/leaderboard/public/submissions/gemini-3-pro_BIGAI_2026-01-15/methodology.md

Please let us know if any additional references or clarifications would be helpful.

@victorb-sierra
Collaborator

Thank you for adding the methodology.md. While this gives a high-level sense of what your method is, it does not provide enough detail for critical assessment. As suggested above, do you have a published paper that could provide such details?

@tongagents-tau2
Author

tongagents-tau2 commented Feb 6, 2026

> Thank you for adding the methodology.md. While this gives a high-level sense of what your method is, it does not provide enough detail for critical assessment. As suggested above, do you have a published paper that could provide such details?

Ours is a closed-source commercial product, and we do not currently have an associated academic paper to submit. We are going through an internal process for open-sourcing it, but this has not been completed yet. Feel free to reach out to us at any time if you have any other questions about our submission.
