
@mroeschke
Contributor

Description

cuML needed to work around a bug in rapidsai/cuml#7762, probably caused by #21281, where we were allowing np.dtype("str") through to our column logic. Generally, pandas doesn't support this type and converts it to np.dtype(object) to represent strings instead, which is what (IMO) cuDF should do too.

I have historically thought that cuDF should disallow the object type because it can mean a "PyObject" type in pandas, which we don't support. Now I'm starting to change my mind and think maybe cuDF should always just interpret it as string. For the "PyObject" cases in pandas, we can maybe just document this as an expected difference when using cudf.pandas.
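
To illustrate the idea, here is a minimal NumPy-only sketch of the kind of normalization described above. The helper name `normalize_string_dtype` is hypothetical and is not the cuDF implementation; it only assumes that a NumPy unicode dtype (kind "U", which is what np.dtype("str") resolves to) should be treated as np.dtype(object) before reaching any column logic.

```python
import numpy as np


def normalize_string_dtype(dtype):
    """Hypothetical helper (not cuDF's API): map NumPy unicode dtypes to object."""
    np_dtype = np.dtype(dtype)
    # Kind "U" covers np.dtype("str") as well as fixed-width forms like "<U5".
    if np_dtype.kind == "U":
        return np.dtype(object)
    # Every other dtype passes through unchanged.
    return np_dtype


str_dtype = np.dtype("str")
normalized = normalize_string_dtype(str_dtype)
unchanged = normalize_string_dtype("int64")
```

Here `normalized` is the object dtype and `unchanged` stays int64, so only the unicode case is redirected.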

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@mroeschke mroeschke self-assigned this Feb 5, 2026
@mroeschke mroeschke requested a review from a team as a code owner February 5, 2026 22:21
@mroeschke mroeschke added bug Something isn't working Python Affects Python cuDF API. labels Feb 5, 2026
@mroeschke mroeschke added the non-breaking Non-breaking change label Feb 5, 2026
@mroeschke mroeschke requested a review from Matt711 February 5, 2026 22:21
@GPUtester GPUtester moved this to In Progress in cuDF Python Feb 5, 2026

Labels

bug Something isn't working non-breaking Non-breaking change Python Affects Python cuDF API.

Projects

Status: In Progress
