Coalesce partitions above and below network coalesce #312

Open
gabotechs wants to merge 1 commit into main from gabrielmusat/coalesce-partitions-below-network-coalesce

@gabotechs
Collaborator

Rework of #285.


Coalesces partitions both before (below) and after (above) a network boundary, instead of only after it.

This:

        │     CoalescePartitionsExec
        │       [Stage 1] => NetworkCoalesceExec: output_partitions=6, input_tasks=3
        └──────────────────────────────────────────────────
          ┌───── Stage 1 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] 
          │ AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))]
          │   ProjectionExec: expr=[]

Becomes this:

        │     CoalescePartitionsExec
        │       [Stage 1] => NetworkCoalesceExec: output_partitions=3, input_tasks=3
        └──────────────────────────────────────────────────
          ┌───── Stage 1 ── Tasks: t0:[p0] t1:[p1] t2:[p2] 
          │ CoalescePartitionsExec
          │   AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))]
          │     ProjectionExec: expr=[]
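
The partition counts in the two plans above can be reproduced with a toy model. This is only a sketch: the function names are hypothetical and are not part of this crate's API. It assumes that the number of partitions read across the network boundary is the sum of the partitions exposed by each task, and that placing a CoalescePartitionsExec at the top of a task leaves that task with a single output partition.

```rust
/// Hypothetical helper: number of partitions a network coalesce would read,
/// assuming it sees every output partition of every input task.
fn network_coalesce_output_partitions(partitions_per_task: &[usize]) -> usize {
    partitions_per_task.iter().sum()
}

/// Hypothetical helper: models adding a partition coalesce at the top of
/// every task, so each non-empty task exposes exactly one partition.
fn coalesce_below_boundary(partitions_per_task: &[usize]) -> Vec<usize> {
    partitions_per_task.iter().map(|&p| p.min(1)).collect()
}
```

With three tasks of two partitions each, `network_coalesce_output_partitions(&[2, 2, 2])` gives 6, matching the first plan. Passing the result of `coalesce_below_boundary(&[2, 2, 2])` gives 3, matching the second plan.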

This allows us to:

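  1. Simplify the network communication across the boundary: NetworkCoalesceExec reads one partition per task (3 in the example above) instead of every partition of every task (6)
  2. Do some of the work early, in a distributed manner, when coalescing implies ordering with SortPreservingMergeExec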
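
To illustrate the second point, here is a minimal sketch of why merging inside each task first is safe when the coalesce has to preserve ordering. It is not the code in this PR and does not use DataFusion operators. `merge_sorted` and `merge_per_task_then_globally` are hypothetical names, and partitions are modeled as plain sorted vectors of `i64`. Each task merges its own sorted partitions (work that would run distributed), and the stage above only has to merge one sorted run per task.

```rust
/// Merges already-sorted runs into one sorted vector by repeatedly taking
/// the smallest head among the runs that still have elements.
fn merge_sorted(runs: &[Vec<i64>]) -> Vec<i64> {
    let mut cursors = vec![0usize; runs.len()];
    let total: usize = runs.iter().map(Vec::len).sum();
    let mut out = Vec::with_capacity(total);
    while out.len() < total {
        let mut best: Option<usize> = None;
        for (i, run) in runs.iter().enumerate() {
            if cursors[i] >= run.len() {
                continue; // this run is exhausted
            }
            let is_better = match best {
                None => true,
                Some(b) => run[cursors[i]] < runs[b][cursors[b]],
            };
            if is_better {
                best = Some(i);
            }
        }
        let b = best.expect("some run is not exhausted while out is incomplete");
        out.push(runs[b][cursors[b]]);
        cursors[b] += 1;
    }
    out
}

/// Two-level merge: one merge per task, then a final merge over the
/// per-task results (one sorted run per task crosses the boundary).
fn merge_per_task_then_globally(tasks: &[Vec<Vec<i64>>]) -> Vec<i64> {
    let per_task: Vec<Vec<i64>> = tasks.iter().map(|parts| merge_sorted(parts)).collect();
    merge_sorted(&per_task)
}
```

In this model the two-level merge produces the same output as a single merge over all partitions, while the final merge has fewer inputs to compare.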

@gabotechs marked this pull request as ready for review on January 26, 2026 15:28
Collaborator

@gene-bordegaray left a comment

This looks good; one small nit.

@gabotechs force-pushed the gabrielmusat/coalesce-partitions-below-network-coalesce branch from 1c8c43f to 8639326 on January 26, 2026 16:24
@gabotechs
Collaborator Author

Benchmarking this now...

@gabotechs
Collaborator Author

I'm getting pretty bad benchmark results for this one... I'm still double-checking whether something is wrong with the benchmarks themselves.

@gabotechs force-pushed the gabrielmusat/coalesce-partitions-below-network-coalesce branch from 8639326 to d9c3152 on January 26, 2026 18:43