-
Notifications
You must be signed in to change notification settings - Fork 0
Fix fractional position sizing rebalancing on repeated actions #188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
When an agent repeated the same action (e.g. -1.0 = full short), the environment recalculated the target position from drifting portfolio value, triggering unintended close_partial/increase trades. Track the previous action value and skip recalculation when unchanged. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove unused `expected_direction` parameter from test parametrization - Add `test_action_change_after_repeated_holds_still_executes` to verify the fix doesn't lock agents into positions they cannot exit Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
@claude review |
|
Claude finished @BY571's task in 1m 38s —— View job Code Review
SummaryThis PR successfully fixes issue #187 by preventing unnecessary rebalancing when agents repeat the same action. The fix is clean, well-tested, and addresses the root cause of the problem. Implementation Review (sequential.py)Changes at sequential.py:461 and sequential.py:604-608 The fix adds a simple guard that tracks the previous action value and skips recalculation when the action is unchanged and a position is already open: # If action hasn't changed and we already have a position, just hold
if action_value == self._prev_action_value and self.position.position_size != 0:
self.position.hold_counter += 1
return {"executed": False, "side": None, "fee_paid": 0.0, "liquidated": False}
self._prev_action_value = action_valueStrengths:
Potential edge case to consider: The guard checks
This is actually the intended behavior according to your PR description - the fix prevents recalculation when the action is unchanged. However, it's worth noting that this introduces a subtle behavioral change: with fractional sizing, the agent's effective exposure can drift with price movements, but the position size (in base units) stays constant when actions repeat. This is a trade-off:
Both are valid approaches, and your PR correctly chooses the latter. The PR description acknowledges this is for "repeated actions," so I assume this is the desired behavior. Test Review (test_sequential.py)test_repeated_action_does_not_rebalance (lines 806-857) Excellent regression test that:
test_action_change_after_repeated_holds_still_executes (lines 859-904) Critical complementary test that ensures the fix doesn't break normal functionality:
Test quality:
Minor observations
VerdictLGTM ✅ This is a clean, well-tested fix that addresses the root cause of issue #187. The implementation is simple, efficient, and maintains backward compatibility for all other behaviors. The test suite provides excellent coverage of both the fix and potential edge cases. The only suggestions above are minor style improvements - the functionality is correct as-is. |
Summary
Fixes #187
_execute_fractional_actionand skip recalculation when the action is unchanged and a position is already open_prev_action_valuetoNonein_reset_position_stateso it's cleared on episode boundariesTest plan
test_repeated_action_does_not_rebalanceverifies position size stays constant across 50 steps of repeated actions (spot and futures)🤖 Generated with Claude Code