Skip to content

[Data] Include logical memory in resource manager scheduling decisions#60774

Open
machichima wants to merge 6 commits intoray-project:masterfrom
machichima:60744-logical-mem-schedule
Open

[Data] Include logical memory in resource manager scheduling decisions#60774
machichima wants to merge 6 commits intoray-project:masterfrom
machichima:60744-logical-mem-schedule

Conversation

@machichima
Copy link
Contributor

Description

Rename:

  • current_processor_usage to current_logical_usage
  • pending_processor_usage to pending_logical_usage
  • running_processor_usage to running_logical_usage

and update the methods to include logical memory (memory) alongside CPU and GPU. This ensures the resource manager accounts for memory when making scheduling decisions.

Add new test test_memory_limit_blocks_task_submission to ensure task submission is blocked when a task requests more memory than available

Related issues

Closes #60744

Additional information

Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: machichima <nary12321@gmail.com>
@machichima machichima requested a review from a team as a code owner February 5, 2026 13:43
@machichima
Copy link
Contributor Author

@bveeramani PTAL, thank you!

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly includes logical memory in the resource manager's scheduling decisions. The changes primarily involve renaming ..._processor_usage methods to ..._logical_usage and updating them to account for memory (memory) alongside CPU and GPU. The implementations in ActorPoolMapOperator, TaskPoolMapOperator, and HashShuffleOperator are updated to calculate memory usage from ray_remote_args. The changes are consistent across the codebase, and the new test test_memory_limit_blocks_task_submission effectively verifies that task submission is blocked when memory limits are exceeded. I have one minor suggestion to improve a docstring for clarity. Overall, the code looks good and the changes are well-implemented.

Comment on lines 686 to 687
This method is called by the executor to decide how to allocate processors
between different operators.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For better clarity and to align with the changes in this PR that add memory to resource calculations, consider updating the term "processors" to "resources" in this docstring.

Suggested change
This method is called by the executor to decide how to allocate processors
between different operators.
This method is called by the executor to decide how to allocate resources
between different operators.

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

can_submit = allocator.can_submit_new_task(o2)
assert not can_submit, (
"Task should be blocked: requires 15GB but only 10GB available"
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test assertion is conditional and may never execute

Medium Severity

The new test test_memory_limit_blocks_task_submission has its key assertion wrapped in if allocator is not None:. If _op_resource_allocator is None (which happens when op_resource_reservation_enabled is False), the test passes without verifying anything. The assertion that memory limits block task submission simply never executes, defeating the test's stated purpose of verifying memory limiting behavior.

Fix in Cursor Fix in Web

@ray-gardener ray-gardener bot added the community-contribution Contributed by the community label Feb 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

community-contribution Contributed by the community

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Data] Include logical memory in resource manager scheduling decisions

1 participant