submission: Add amity-sigma-v3r results from Amity#137
Open
touchaponk wants to merge 3 commits into sierra-research:main from
Model: amity-sigma-v3r (Qwen3-4B-Thinking + ROAD + GRPO)
Organization: Amity
Submission Type: Custom

Results:
- Retail: Pass@1=78.51%, Pass@4=56.14%
- Airline: Pass@1=55.50%, Pass@4=34.00%
- Telecom: Pass@1=32.89%, Pass@4=16.67%

User simulator: gpt-4.1
Trajectories included for all 3 domains

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
a82ce62 to 2752836
Collaborator
Thanks for your submission! Most things look good to merge. Could you provide the trajectories as something other than a zip file, though, so they can be accessed?
2752836 to c7478ef
Author
Hi @benshi34 - I have added the JSON trajectories and removed the zip file.
Summary
Adding leaderboard submission for amity-sigma-v3r model from Amity.
Model Details
Results
Trajectory Link
Trajectories are included in this PR under the trajectories/trajectories.zip folder, containing:
- amity-sigma-v3r_airline_default_gpt-4.1_4trials.json
- amity-sigma-v3r_retail_default_gpt-4.1_4trials.json
- amity-sigma-v3r_telecom_default_gpt-4.1_4trials.json
Verification
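For readers who want to sanity-check the reported figures against the 4-trial files, here is a minimal sketch. It assumes the "Pass@4" numbers follow the pass^k convention, where a task counts only if all k sampled trials succeed; that would explain why they are lower than Pass@1, but the PR does not confirm it. The function name `pass_hat_k` and the toy data are hypothetical. Reading per-trial success flags out of the JSON files is left out, because the file schema is not shown here.

```python
from math import comb


def pass_hat_k(outcomes_per_task, k):
    """Average over tasks of C(c, k) / C(n, k).

    For each task, n is the number of trials and c the number of
    successful ones. The ratio is the chance that k trials drawn
    without replacement from the n recorded trials all succeeded.
    """
    if not outcomes_per_task:
        raise ValueError("need at least one task")
    total = 0.0
    for outcomes in outcomes_per_task:
        n = len(outcomes)
        if k > n:
            raise ValueError("k cannot exceed the number of trials")
        c = sum(1 for ok in outcomes if ok)
        total += comb(c, k) / comb(n, k)
    return total / len(outcomes_per_task)


# Toy data (not from the submission): 3 tasks, 4 trials each.
toy = [
    [True, True, True, True],      # always solved
    [True, False, True, True],     # solved 3 of 4 times
    [False, False, False, False],  # never solved
]

print(f"pass^1 = {pass_hat_k(toy, 1):.4f}")  # mean per-trial success rate
print(f"pass^4 = {pass_hat_k(toy, 4):.4f}")  # share of tasks solved in all 4 trials
```

With k equal to the number of trials, the metric reduces to the fraction of tasks solved on every trial, so it can never exceed the k=1 value.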
References