Fix fractional position sizing rebalancing on repeated actions #188

BY571 · 2026-02-11T11:11:27Z

Summary

Fixes #187

Track the previous action value in _execute_fractional_action and skip recalculation when the action is unchanged and a position is already open
Reset _prev_action_value to None in _reset_position_state so it's cleared on episode boundaries
Add parametrized regression test covering spot-long, futures-long, and futures-short repeated actions

Test plan

New regression test test_repeated_action_does_not_rebalance verifies position size stays constant across 50 steps of repeated actions (spot and futures)
All 1011 existing tests pass (0 failures, 47 skipped)

🤖 Generated with Claude Code

When an agent repeated the same action (e.g. -1.0 = full short), the environment recalculated the target position from drifting portfolio value, triggering unintended close_partial/increase trades. Track the previous action value and skip recalculation when unchanged. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Remove unused `expected_direction` parameter from test parametrization - Add `test_action_change_after_repeated_holds_still_executes` to verify the fix doesn't lock agents into positions they cannot exit Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

BY571 · 2026-02-11T12:51:15Z

@claude review

claude · 2026-02-11T12:51:34Z

Claude finished @BY571's task in 1m 38s —— View job

Code Review

Read repository guidelines (CLAUDE.md)
Review implementation changes in sequential.py
Review test changes in test_sequential.py
Analyze logic correctness and edge cases
Provide comprehensive feedback

Summary

This PR successfully fixes issue #187 by preventing unnecessary rebalancing when agents repeat the same action. The fix is clean, well-tested, and addresses the root cause of the problem.

Implementation Review (sequential.py)

Changes at sequential.py:461 and sequential.py:604-608

The fix adds a simple guard that tracks the previous action value and skips recalculation when the action is unchanged and a position is already open:

# If action hasn't changed and we already have a position, just hold
if action_value == self._prev_action_value and self.position.position_size != 0:
    self.position.hold_counter += 1
    return {"executed": False, "side": None, "fee_paid": 0.0, "liquidated": False}
self._prev_action_value = action_value

Strengths:

Correct placement: The guard is positioned at the start of _execute_fractional_action, before any position calculations occur
Proper state management: _prev_action_value is initialized to None in _reset_position_state (line 461), ensuring clean episode boundaries
Hold counter update: Correctly increments hold_counter when holding a position
Returns early: Avoids unnecessary computation and maintains consistency with the existing tolerance-based hold logic below

Potential edge case to consider:

The guard checks self.position.position_size != 0, which means the first time an agent opens a position (from flat), the guard doesn't trigger. This is correct. However, consider this sequence:

Action 0.5 → Opens position
Action 0.5 → Guard triggers, holds ✓
Price changes significantly
Action 0.5 → Guard triggers, holds ✓ (even though target position may have drifted due to price change)

This is actually the intended behavior according to your PR description - the fix prevents recalculation when the action is unchanged. However, it's worth noting that this introduces a subtle behavioral change: with fractional sizing, the agent's effective exposure can drift with price movements, but the position size (in base units) stays constant when actions repeat.

This is a trade-off:

Before fix: Constant rebalancing to maintain portfolio allocation (constant-leverage behavior)
After fix: Hold position when action unchanged (constant-quantity behavior)

Both are valid approaches, and your PR correctly chooses the latter. The PR description acknowledges this is for "repeated actions," so I assume this is the desired behavior.

Test Review (test_sequential.py)

test_repeated_action_does_not_rebalance (lines 806-857)

Excellent regression test that:

Tests all three relevant modes: spot-long, futures-long, futures-short
Opens a position and repeats the same action 49 times
Asserts that position_size never changes (trades_executed == 0)
Has clear documentation explaining the issue

test_action_change_after_repeated_holds_still_executes (lines 859-904)

Critical complementary test that ensures the fix doesn't break normal functionality:

Verifies that after holding for 10 steps, changing the action still executes
Tests closing positions after repeated holds
Prevents regression where the guard might "lock" agents into positions

Test quality:

✅ Parametrized to cover multiple trading modes
✅ Clear test names and documentation
✅ Verifies both the fix and potential side effects
✅ Uses meaningful assertions with helpful error messages

Minor observations

Float equality comparison (sequential.py:605):
```
if action_value == self._prev_action_value and self.position.position_size != 0:
```
You're comparing floats with ==. This works because you're comparing the exact same value from self.action_levels[action_idx], not computed values. However, you might consider adding a comment to clarify this is intentional:
```
# Direct comparison is safe - both values come from action_levels array
if action_value == self._prev_action_value and self.position.position_size != 0:
```
Type annotation: Consider adding a type hint for _prev_action_value:
```
self._prev_action_value: Optional[float] = None
```
Though this is a minor style preference and not required.
Test coverage: The tests are comprehensive for the primary use case. One additional edge case test could verify behavior when switching between different non-zero actions (e.g., 0.5 → 0.8), but this is already implicitly covered by the tolerance-based logic below the guard.

Verdict

LGTM ✅

This is a clean, well-tested fix that addresses the root cause of issue #187. The implementation is simple, efficient, and maintains backward compatibility for all other behaviors. The test suite provides excellent coverage of both the fix and potential edge cases.

The only suggestions above are minor style improvements - the functionality is correct as-is.

| Branch: fix/187-fractional-rebalancing

BY571 and others added 2 commits February 11, 2026 12:11

BY571 merged commit 48b0a40 into main Feb 11, 2026
2 checks passed

BY571 deleted the fix/187-fractional-rebalancing branch February 11, 2026 13:52

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix fractional position sizing rebalancing on repeated actions #188

Fix fractional position sizing rebalancing on repeated actions #188

BY571 commented Feb 11, 2026

Uh oh!

BY571 commented Feb 11, 2026

Uh oh!

claude bot commented Feb 11, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Fix fractional position sizing rebalancing on repeated actions #188

Fix fractional position sizing rebalancing on repeated actions #188

Conversation

BY571 commented Feb 11, 2026

Summary

Test plan

Uh oh!

BY571 commented Feb 11, 2026

Uh oh!

claude bot commented Feb 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review

Summary

Implementation Review (sequential.py)

Test Review (test_sequential.py)

Minor observations

Verdict

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude bot commented Feb 11, 2026 •

edited

Loading