
Fix Memory Leaks and Improve Cleanup in S3 Multipart Upload#15210

Open
kumarpritam863 wants to merge 18 commits into apache:main from kumarpritam863:fix/s3-multipart-upload-cleanup

Conversation

@kumarpritam863
Contributor

Clear the staging files list and the multipart map after close() and abortUpload(), once all the staging files have been closed.

github-actions bot added the AWS label Feb 1, 2026
@kumarpritam863
Contributor Author

@singhpk234 can you please review?

@kumarpritam863
Contributor Author

@RussellSpitzer can you please take a look?

Comment on lines +410 to +412
// clear staging files and multipart map
stagingFiles.clear();
multiPartMap.clear();
Contributor

@singhpk234 Feb 1, 2026


It's fine to do eager cleanup.
Though wouldn't the stream be closed post-abort and hence GCed? Is this staying in memory for long?

Contributor Author


Thanks @singhpk234 for the review.

Regarding memory management:
While the staging files list will eventually allow objects to be garbage-collected once they go out of scope, I’m concerned that retaining strong references to many FileAndDigest objects (especially in upload-heavy / long-running workloads) can still cause practical issues:

  • Increased heap pressure during periods of high concurrent or sequential uploads
  • Longer object lifetime → more frequent / longer GC pauses
  • Higher risk of OutOfMemoryError during peak load (I’ve sometimes observed OOMs in similar scenarios when large numbers of parts accumulate without cleanup while running Iceberg-Kafka-Connect)

Even though the theoretical lifetime is finite, the practical memory pressure and GC overhead seem non-negligible in our use case.

Also, although it does not affect the AWS multipart upload itself (AWS only requires the part number to be unique), starting the part number from 1 and keeping it low-bounded makes managing CompleteMultipartUpload requests easier. Currently the part number comes from the index of the part file in the staging files list, which can start from a higher number if the previous files are not cleared.
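The eager-cleanup pattern being discussed can be sketched as follows. This is a minimal illustration, not the actual S3OutputStream implementation; the class name StagingState and its methods are hypothetical, and only the stagingFiles/multiPartMap field names come from the diff above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of index-derived part numbers plus eager cleanup.
class StagingState implements AutoCloseable {
  private final List<String> stagingFiles = new ArrayList<>();
  private final Map<Integer, String> multiPartMap = new HashMap<>();

  void addPart(String file) {
    stagingFiles.add(file);
    // Part number is derived from the list position, so it starts at 1
    // only while the list is cleared between uploads.
    multiPartMap.put(stagingFiles.size(), file);
  }

  int nextPartNumber() {
    return stagingFiles.size() + 1;
  }

  @Override
  public void close() {
    // Eagerly drop strong references so FileAndDigest-like entries become
    // GC-eligible immediately, instead of living as long as the stream.
    stagingFiles.clear();
    multiPartMap.clear();
  }
}
```

Without the clear() calls, a reused or long-lived stream keeps every part reachable and the index-derived part numbers keep climbing across uploads.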

Please let me know your thoughts on these.

@kumarpritam863
Contributor Author

Hi @singhpk234, I have also added tests for proper cleanup of the staging files and the multipart map. Can you please review?

