Clarify telecom data refueling rule in manual #105

Open
carol-xrl wants to merge 2 commits into sierra-research:main from carol-xrl:main

@carol-xrl

Summary

This PR resolves the Reward Logic Flaw reported in issue #104 by updating the telecom manual. It replaces the previous wording:

“The maximum amount of data that can be refueled is 2GB.”

with the more explicit rule:

“Whenever refueling is considered, the amount of data to be refueled must be exactly 2GB.”

Only this line in the manual is modified.

Why This Change Is Needed

The previous manual described a maximum limit, suggesting that the assistant may refuel any amount up to 2GB.
However, the reward function uses a strict assertion:


assert_data_refueling_amount == 2.0

meaning the evaluation framework requires exactly 2GB whenever refueling occurs.

This mismatch causes reasonable assistant behavior (e.g., refueling less than 2GB) to be incorrectly judged as failure, resulting in an inaccurate and unfair evaluation score.
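The mismatch can be illustrated with a minimal sketch. The function names below (`passes_max_rule`, `passes_exact_rule`) and the constant `REFUEL_GB` are hypothetical and are not taken from the repository; only the 2GB value and the strict equality check come from the description above.

```python
# Hypothetical sketch: these helpers are illustrative, not the repository's code.
REFUEL_GB = 2.0  # the 2GB figure quoted from the manual


def passes_max_rule(amount_gb: float) -> bool:
    """Old manual wording: any positive amount up to 2GB reads as acceptable."""
    return 0 < amount_gb <= REFUEL_GB


def passes_exact_rule(amount_gb: float) -> bool:
    """Strict check, mirroring the equality assertion in the reward function."""
    return amount_gb == REFUEL_GB


# Compare how each reading of the rule judges a few refuel amounts.
for amount in (0.5, 1.5, 2.0):
    print(amount, passes_max_rule(amount), passes_exact_rule(amount))
```

Under these assumptions, a 1.5GB refuel satisfies the old manual wording but fails the strict check, which is the gap this wording change closes.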

[fix issue sierra-research#104: Reward Logic Flaw: Hard-Coded 2GB Refuel Requirement Causes Incorrect Evaluation] by replacing "The maximum amount of data that can be refueled is 2GB" with a clearer rule: "Whenever refueling is considered, the amount of data to be refueled must be exactly 2GB."
[fix issue #104: Reward Logic Flaw: Hard-Coded 2GB Refuel Requirement Causes Incorrect Evaluation] by replacing "The maximum amount of data that can be refueled is 2GB" with a clearer rule: "Whenever refueling is considered, the amount of data to be refueled must be exactly 2GB."