Description
Currently, `BeaconProcessor` is initialised with `max_workers` set to the number of available CPUs:

lighthouse/beacon_node/beacon_processor/src/lib.rs, lines 251 to 254 in f67084a:

```rust
impl Default for BeaconProcessorConfig {
    fn default() -> Self {
        Self {
            max_workers: cmp::max(1, num_cpus::get()),
```
Each task spawned by the processor consumes one worker. However, some tasks (e.g. `Work::ColumnReconstruction`) internally use rayon to parallelise computation. By default, rayon uses its global thread pool, which is also sized to the number of CPUs.
This likely results in CPU oversubscription: both Lighthouse and rayon independently assume full CPU availability, leading to degraded performance under load.
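As a back-of-the-envelope illustration of the oversubscription claim (a hypothetical helper, not Lighthouse code): because both pools are sized independently to the CPU count, the worst-case number of runnable compute threads grows with their product rather than staying at `num_cpus`.

```rust
/// Hypothetical illustration of the oversubscription arithmetic: each of
/// the `max_workers` beacon-processor workers may drive work on rayon's
/// global pool, which is also sized to the CPU count, so in the worst
/// case the runnable compute threads are workers * pool_size.
fn worst_case_runnable_threads(num_cpus: usize) -> usize {
    let max_workers = num_cpus.max(1); // BeaconProcessor default
    let rayon_global_pool = num_cpus.max(1); // rayon default
    max_workers * rayon_global_pool
}
```

On an 8-core machine this gives 64 runnable compute threads contending for 8 CPUs, which is the degradation the issue describes.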
Proposed Solution
Consider allowing `BeaconProcessor` to allocate multiple workers to expensive tasks. When spawning such a task:

- Wait until `n` workers are free (e.g. 4),
- Spawn the task with a scoped `rayon` thread pool limited to those `n` threads,
- Release the workers once the task completes.

This would preserve control over CPU usage while still enabling parallelism within heavy tasks.
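The acquire/run/release steps above could be sketched like this. This is a minimal, std-only illustration: `WorkerPermits` and `run_scoped` are hypothetical names, and the closure stands in for the scoped rayon pool (in a real implementation, step 2 would build a pool via `rayon::ThreadPoolBuilder::new().num_threads(n)` and run the work inside it).

```rust
use std::sync::{Condvar, Mutex};

/// Hypothetical worker accounting for the BeaconProcessor: tasks acquire
/// `n` worker permits before running and release them afterwards.
pub struct WorkerPermits {
    free: Mutex<usize>,
    cv: Condvar,
}

impl WorkerPermits {
    pub fn new(max_workers: usize) -> Self {
        Self {
            free: Mutex::new(max_workers),
            cv: Condvar::new(),
        }
    }

    /// Block until `n` workers are free, run `f` with a budget of `n`
    /// threads, then release the workers. Callers must not request more
    /// than `max_workers`, or this would wait forever.
    pub fn run_scoped<T, F>(&self, n: usize, f: F) -> T
    where
        F: FnOnce(usize) -> T,
    {
        // 1. Wait until `n` workers are free, then claim them.
        let mut free = self.free.lock().unwrap();
        while *free < n {
            free = self.cv.wait(free).unwrap();
        }
        *free -= n;
        drop(free);

        // 2. Run the task. A real implementation would hand `n` to a
        //    scoped rayon pool and run the parallel work inside it.
        let out = f(n);

        // 3. Release the workers and wake any waiting tasks.
        *self.free.lock().unwrap() += n;
        self.cv.notify_all();
        out
    }
}
```

This keeps the total worker budget fixed while letting a single heavy task use more than one worker's worth of CPU.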
Additional Details
Discussion below with @michaelsproul from the experimental PR #7720:
BeaconProcessor

Async tasks
- Keep computation light
- Any blocking computation goes through the beacon processor
- Not bounded by `max_workers` (`num_cpus`), but probably still needs a queuing system to make sure it doesn't grow unbounded and cause memory issues
Blocking tasks
- Get allocated 1 thread by default
- For heavy tasks that require rayon, allow `WorkType`s to acquire more than 1 worker (N) - when executing, create a scoped `rayon` pool, run the parallel tasks within the scope, and release the workers after the task completes
CPU allocation
- tokio runtime: `num_cpus` threads (default)
- BeaconProcessor: runs inside the tokio runtime and maintains blocking workers (`max_blocking_workers = num_cpus`)
- rayon: scoped per task
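The "rayon: scoped per task" point can be sketched with a task-local thread budget. Here `std::thread::scope` stands in for a scoped rayon pool, and `parallel_sum` is a hypothetical example task: the computation is split across exactly `n` threads handed to this task, rather than the process-wide global pool.

```rust
use std::thread;

/// Hypothetical heavy task: sum a slice using at most `n` threads.
/// std::thread::scope models a per-task scoped pool; with rayon this
/// would be ThreadPoolBuilder::new().num_threads(n).build() + install().
pub fn parallel_sum(data: &[u64], n: usize) -> u64 {
    let n = n.max(1);
    // Split the input into at most `n` chunks, one per thread.
    let chunk = ((data.len() + n - 1) / n).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk)
            .map(|c| s.spawn(move || c.iter().sum::<u64>()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).sum()
    })
}
```

Because the thread budget is an argument, the caller (the beacon processor) decides how much parallelism each task gets, which is exactly the control the proposal is after.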
Memory
- Both sync and async tasks are bounded by queues of each individual work type.