Is your feature request related to a problem? Please describe.
cudf::hash_partition is fragile when the number of requested partitions is large enough (num_partitions > 1024) that we don't dispatch to the "optimized" kernels.
It breaks in the following ways:
- The fallback tracks the assignment of rows to partitions in a vector that turns out to be sparse in this case, so it can be much larger than the number of input rows, leading to out-of-memory errors.
- Even if the allocation succeeds, the same sparse vector can end up with more than uint32::max entries, and so we hit the usual 32-bit offset errors from thrust.
It would be great if we could lift these restrictions.
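To give a rough sense of the scale involved, here is a back-of-envelope sketch. The sizing model below (a dense per-(block, partition) scratch buffer, with a made-up `rows_per_block` of 256) is hypothetical and not libcudf's exact formula, but it shows how such a buffer grows with the product of block count and partition count rather than with the row count, and so can blow past both the input size and the 32-bit offset limit:

```python
# Hypothetical sizing model, NOT libcudf's actual formula: a dense
# per-(block, partition) scratch buffer scales with num_blocks * num_partitions.
UINT32_MAX = 2**32 - 1

def scratch_entries(num_rows, num_partitions, rows_per_block=256):
    # Number of thread blocks needed to cover the input (assumed layout).
    num_blocks = (num_rows + rows_per_block - 1) // rows_per_block
    # One slot per (block, partition) pair, regardless of how many rows
    # actually land in each partition -- hence "sparse".
    return num_blocks * num_partitions

rows = 10_000_000
parts = 200_000  # well above the 1024 cutoff for the optimized kernels
entries = scratch_entries(rows, parts)
print(entries > rows)        # buffer dwarfs the number of input rows
print(entries > UINT32_MAX)  # element count no longer fits in 32-bit offsets
```

Under these assumed numbers, both conditions hold: the scratch buffer is hundreds of times larger than the input, and its element count exceeds what 32-bit offsets can address.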