
[FEA] Make cudf::hash_partition less delicate for large numbers of partitions #21299

@wence-

Description


Is your feature request related to a problem? Please describe.

`cudf::hash_partition` is very delicate when the number of requested partitions is large enough that we don't dispatch to the "optimized" kernels (`num_partitions > 1024`). It fails in the following ways:

  1. The approach of tracking the assignment of rows to partitions allocates a vector that, in this regime, is sparse, so it can be much larger than the number of input rows, leading to out-of-memory errors.
  2. Even if that allocation succeeds, the same sparse vector can end up with more than `std::numeric_limits<uint32_t>::max()` elements, and so we hit the usual thrust 32-bit offset overflow errors.

It would be great if we could lift these restrictions.
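
For reference, here is a minimal sketch of the kind of call that exercises this path. This is not taken from the original report: the row count and partition count are illustrative assumptions chosen only so that `num_partitions` exceeds the optimized-kernel cutoff, and the exact thresholds at which things fail depend on the implementation and available GPU memory.

```cpp
#include <cudf/filling.hpp>
#include <cudf/partitioning.hpp>
#include <cudf/scalar/scalar.hpp>
#include <cudf/table/table_view.hpp>

int main()
{
  // Illustrative sizes (assumptions, not from the report): large enough
  // input, and a partition count well past the ~1024 optimized-kernel cutoff.
  cudf::size_type const num_rows = 100'000'000;
  int const num_partitions       = 1'000'000;

  // A single int32 column [0, num_rows) to hash-partition on.
  cudf::numeric_scalar<int32_t> init{0};
  auto col = cudf::sequence(num_rows, init);
  cudf::table_view input({col->view()});

  // On the non-optimized path, the per-partition bookkeeping allocated here
  // can be much larger than the input (risking OOM), and its element count
  // can overflow 32-bit offsets inside thrust.
  auto [partitioned, offsets] =
    cudf::hash_partition(input, /*columns_to_hash=*/{0}, num_partitions);

  return 0;
}
```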

Labels: feature request (New feature or request), libcudf (Affects libcudf (C++/CUDA) code)
