Core: Add org.apache.iceberg.DataFiles.Builder#withSortOrderId#15217
Core: Add org.apache.iceberg.DataFiles.Builder#withSortOrderId#15217raunaqmorarka wants to merge 1 commit intoapache:mainfrom
Conversation
| this.sortOrderId = newSortOrderId; | ||
| return this; |
There was a problem hiding this comment.
do we need assertions if the withSortOrder is setting some different sort order id ?
worth adding tests for someone who accidentally use both the APIs
There was a problem hiding this comment.
We have a similar situation with org.apache.iceberg.DataFiles.Builder#withRecordCount vs org.apache.iceberg.DataFiles.Builder#withMetrics. I kept it simple since we don't try to validate those either.
There was a problem hiding this comment.
if a caller uses both withSortOrder and withSortOrderId and the IDs differ, could that mislead planning or optimizations that rely on sortOrderId?
There was a problem hiding this comment.
The expectation is that with either API, the input should be the correct one. It would be a bug in the caller if either API were given the wrong input.
If there is still concern about this, then may be I can mark org.apache.iceberg.DataFiles.Builder#withSortOrder as deprecated to encourage callers to just use withSortOrderId ?
There was a problem hiding this comment.
do we need assertions if the withSortOrder is setting some different sort order id ?
worth adding tests for someone who accidentally use both the APIs
This is not different than calling withSortOrder twice. The second call wins. That's also how builders works typically.
Since sort id is the relevant field here, there should be an API to specify it directly instead of going through SortOrder object.