2 changes: 1 addition & 1 deletion semantica/pipeline/resource_scheduler.py
@@ -109,7 +109,7 @@ def __init__(self, config: Optional[Dict[str, Any]] = None, **kwargs):

     self.resources: Dict[str, Resource] = {}
     self.allocations: Dict[str, ResourceAllocation] = {}
-    self.lock = threading.Lock()
+    self.lock = threading.RLock()  # RLock: allocate_resources holds the lock and calls allocate_cpu/memory/gpu, which also acquire it

Action required

1. Silent allocation failure 🐞 Bug ✓ Correctness

• With the new RLock, allocate_resources no longer self-deadlocks and will proceed, but it can
  return a partial or empty allocations dict when capacity is insufficient.
• ExecutionEngine does not validate allocations before running steps, so under load pipelines can
  execute without having acquired the requested CPU/memory/GPU (oversubscription; broken scheduling semantics).
• allocate_resources still reports tracking status as "completed" even when required resources
  weren't allocated, reducing operator visibility into the failure mode.
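The failure mode the bullets describe can be reduced to a few lines. This is a simplified sketch (assumed shape, not the project's actual code): each allocator returns `None` when capacity is exceeded, the caller silently drops the `None`, and the tracking status is set to "completed" regardless.

```python
def allocate_cpu(cores, capacity=8):
    # Returns an allocation on success, or None when over capacity
    return {"cpu": cores} if cores <= capacity else None

def allocate_resources(cpu_cores):
    allocations = {}
    result = allocate_cpu(cpu_cores)
    if result:                 # a None result is silently skipped
        allocations.update(result)
    status = "completed"       # reported even when allocations is empty
    return allocations, status

print(allocate_resources(99))  # -> ({}, 'completed'): empty, yet "completed"
```

The caller has no way to distinguish "got everything" from "got nothing" without inspecting the dict itself.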
Agent Prompt
## Issue description
`ResourceScheduler.allocate_resources()` can return an empty/partial allocations dict when a requested resource can’t be allocated (allocate_cpu/memory/gpu return `None`). `ExecutionEngine` proceeds to execute the pipeline anyway, so under contention the scheduler fails to enforce resource limits.

## Issue Context
This PR changes the scheduler lock to `threading.RLock()`, which makes `allocate_resources()` actually execute (previously it would self-deadlock due to nested locking). That makes the silent-allocation-failure path operational.

## Fix Focus Areas
- semantica/pipeline/resource_scheduler.py[180-240]
- semantica/pipeline/resource_scheduler.py[248-290]
- semantica/pipeline/resource_scheduler.py[343-386]
- semantica/pipeline/execution_engine.py[171-176]

## Suggested approach
1. In `allocate_resources`, treat `cpu_cores` / `memory_gb` / `gpu_device` options as required when provided (and defaults likely required too).
2. If any required allocation returns `None`, immediately `release_resources()` for any already-acquired allocations in this call and raise a `ProcessingError`/`ValidationError`.
3. In `ExecutionEngine.execute_pipeline`, handle allocation failure by returning a failed `ExecutionResult` (or implement retry/backoff/queueing if that’s the intended UX).
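Steps 1–2 above can be sketched as follows. Names such as `ProcessingError`, `_allocate_one`, and `release_resources` mirror the description but are assumptions, not the project's real signatures; capacities are hard-coded for illustration.

```python
import threading

class ProcessingError(Exception):
    """Raised when a required resource cannot be allocated."""

class ResourceScheduler:
    def __init__(self):
        self.lock = threading.RLock()
        self.allocations = {}

    def _allocate_one(self, kind, amount, capacity):
        # Returns an allocation id, or None when capacity is insufficient
        return f"{kind}-alloc" if amount <= capacity else None

    def release_resources(self, alloc_ids):
        for alloc_id in alloc_ids:
            self.allocations.pop(alloc_id, None)

    def allocate_resources(self, step_id, cpu_cores=1, memory_gb=1):
        acquired = []
        with self.lock:
            for kind, amount, capacity in [("cpu", cpu_cores, 8), ("mem", memory_gb, 16)]:
                alloc = self._allocate_one(kind, amount, capacity)
                if alloc is None:
                    # Roll back anything acquired in this call, then fail loudly
                    self.release_resources(acquired)
                    raise ProcessingError(f"cannot allocate {kind} for {step_id}")
                self.allocations[alloc] = step_id
                acquired.append(alloc)
        return acquired
```

With this shape, `ExecutionEngine.execute_pipeline` can catch `ProcessingError` and return a failed `ExecutionResult` instead of running the step unresourced.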

self._initialize_resources()
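The nested-acquire pattern the diff comment refers to can be shown in a minimal sketch (simplified names, not the project's actual code): the outer method holds the lock and calls a helper that acquires it again. With `threading.Lock()` the second acquire blocks the same thread forever; `threading.RLock()` lets the holding thread re-enter.

```python
import threading

lock = threading.RLock()  # swapping in threading.Lock() reproduces the self-deadlock

def allocate_cpu():
    with lock:            # nested (re-entrant) acquire by the same thread
        return "cpu-0"

def allocate_resources():
    with lock:            # outer acquire
        return allocate_cpu()

print(allocate_resources())  # -> cpu-0 (would hang with a plain Lock)
```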
