Skip to content

Core: Add org.apache.iceberg.DataFiles.Builder#withSortOrderId#15217

Open
raunaqmorarka wants to merge 1 commit intoapache:mainfrom
raunaqmorarka:raunaq/sort-order-id
Open

Core: Add org.apache.iceberg.DataFiles.Builder#withSortOrderId#15217
raunaqmorarka wants to merge 1 commit intoapache:mainfrom
raunaqmorarka:raunaq/sort-order-id

Conversation

@raunaqmorarka
Copy link
Contributor

Since sort id is the relevant field here, there should be an API to specify it directly instead of going through SortOrder object.

@github-actions github-actions bot added the core label Feb 2, 2026
Comment on lines +325 to +326
this.sortOrderId = newSortOrderId;
return this;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need assertions if the withSortOrder is setting some different sort order id ?

worth adding tests for someone who accidentally use both the APIs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a similar situation with org.apache.iceberg.DataFiles.Builder#withRecordCount vs org.apache.iceberg.DataFiles.Builder#withMetrics. I kept it simple since we don't try to validate those either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if a caller uses both withSortOrder and withSortOrderId and the IDs differ, could that mislead planning or optimizations that rely on sortOrderId?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The expectation is that with either API, the input should be the correct one. It would be a bug in the caller if either API were given the wrong input.
If there is still concern about this, then may be I can mark org.apache.iceberg.DataFiles.Builder#withSortOrder as deprecated to encourage callers to just use withSortOrderId ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need assertions if the withSortOrder is setting some different sort order id ?

worth adding tests for someone who accidentally use both the APIs

This is not different than calling withSortOrder twice. The second call wins. That's also how builders works typically.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants