submission: Add amity-sigma-v3r results from Amity #137

Open
touchaponk wants to merge 3 commits into sierra-research:main from touchaponk:submission/amity-sigma-v3r

Conversation

@touchaponk touchaponk commented Jan 12, 2026

Summary

Adds a leaderboard submission for the amity-sigma-v3r model from Amity.

Model Details

  • Model Name: amity-sigma-v3r
  • Base Model: Qwen3-4B-Thinking
  • Training: ROAD optimization + GRPO finetuning with human-in-the-loop synthetic data generation
  • Submission Type: Custom (modified retail policy.md for ROAD optimization)

Results

| Domain  | Pass@1 | Pass@2 | Pass@3 | Pass@4 |
|---------|--------|--------|--------|--------|
| Retail  | 78.51% | 67.40% | 60.53% | 56.14% |
| Airline | 55.50% | 45.00% | 38.50% | 34.00% |
| Telecom | 32.89% | 24.71% | 19.96% | 16.67% |
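
The reported values fall as k grows, which fits a metric that requires all k sampled trials of a task to succeed (often written pass^k). The conventional pass@k cannot decrease with k. Assuming that reading, the sketch below shows one way such numbers could be computed from per-task success counts over 4 trials. The function name `pass_hat_k` and the example counts are hypothetical and are not taken from this PR or its trajectories.

```python
from math import comb


def pass_hat_k(successes_per_task, num_trials, k):
    """Average over tasks of C(c, k) / C(n, k).

    c is the number of successful trials for a task and n is the number of
    trials run. The ratio is the chance that k trials drawn without
    replacement from the n recorded trials all succeeded.
    """
    if not 1 <= k <= num_trials:
        raise ValueError("k must be between 1 and num_trials")
    if not successes_per_task:
        raise ValueError("need at least one task")
    total = comb(num_trials, k)
    # math.comb(c, k) is 0 when c < k, so tasks with too few successes add 0.
    return sum(comb(c, k) for c in successes_per_task) / (total * len(successes_per_task))


# Hypothetical success counts for four tasks, each run 4 times.
counts = [4, 2, 0, 3]
print(pass_hat_k(counts, 4, 1))  # 0.5625
print(pass_hat_k(counts, 4, 4))  # 0.25
```

Under this definition the k=1 value is the plain average success rate, and the k=4 value counts only tasks that succeeded in all four trials.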

Trajectory Link

Trajectories are included in this PR in the trajectories/trajectories.zip archive, which contains:

  • amity-sigma-v3r_airline_default_gpt-4.1_4trials.json
  • amity-sigma-v3r_retail_default_gpt-4.1_4trials.json
  • amity-sigma-v3r_telecom_default_gpt-4.1_4trials.json
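
The three file names above appear to follow a `<model>_<domain>_default_<user simulator>_<N>trials.json` pattern. As a rough sketch under that assumption, the snippet below parses such names and reports which of the three domains are missing from a list of files. The pattern is inferred only from these three names, and `parse_trajectory_name` and `missing_domains` are hypothetical helpers, not part of the benchmark tooling.

```python
import re
from typing import NamedTuple

EXPECTED_DOMAINS = {"retail", "airline", "telecom"}

# Pattern inferred from the three file names in this PR (an assumption).
NAME_PATTERN = re.compile(
    r"^(?P<model>.+)_(?P<domain>retail|airline|telecom)_default_"
    r"(?P<user>.+)_(?P<trials>\d+)trials\.json$"
)


class TrajectoryName(NamedTuple):
    model: str
    domain: str
    user_simulator: str
    trials: int


def parse_trajectory_name(filename: str) -> TrajectoryName:
    """Split a trajectory file name into its parts, or raise ValueError."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized trajectory file name: {filename}")
    return TrajectoryName(
        model=match["model"],
        domain=match["domain"],
        user_simulator=match["user"],
        trials=int(match["trials"]),
    )


def missing_domains(filenames) -> set:
    """Return the expected domains that no file name covers."""
    seen = {parse_trajectory_name(name).domain for name in filenames}
    return EXPECTED_DOMAINS - seen


files = [
    "amity-sigma-v3r_airline_default_gpt-4.1_4trials.json",
    "amity-sigma-v3r_retail_default_gpt-4.1_4trials.json",
    "amity-sigma-v3r_telecom_default_gpt-4.1_4trials.json",
]
print(parse_trajectory_name(files[0]))
print(missing_domains(files))  # set()
```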

Verification

  • All 3 domains evaluated (retail, airline, telecom)
  • 4 trials per task
  • User simulator: gpt-4.1
  • Trajectories available: Yes
  • Modified prompts: Yes (retail policy.md for ROAD optimization)
  • Omitted questions: No
  • submission_type: custom

References

Model: amity-sigma-v3r (Qwen3-4B-Thinking + ROAD + GRPO)
Organization: Amity
Submission Type: Custom

Results:
- Retail: Pass@1=78.51%, Pass@4=56.14%
- Airline: Pass@1=55.50%, Pass@4=34.00%
- Telecom: Pass@1=32.89%, Pass@4=16.67%

User simulator: gpt-4.1
Trajectories included for all 3 domains

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@touchaponk force-pushed the submission/amity-sigma-v3r branch 2 times, most recently from a82ce62 to 2752836 (January 12, 2026 05:56)
@benshi34
Collaborator

Thanks for your submission! Most things look good to merge. Could you provide the trajectories unzipped rather than as a zip file, so they can be accessed directly?

@touchaponk force-pushed the submission/amity-sigma-v3r branch from 2752836 to c7478ef (January 28, 2026 12:19)
@touchaponk
Author

Hi @benshi34, I have added the JSON trajectories and removed the zip file.
