feat: add simple object storage cleaner script #1461

Merged
Rakanhf merged 1 commit into master from QF-6039-object-storage-cleanup-script-simple
Feb 13, 2026

@Rakanhf Rakanhf commented Jan 12, 2026

Currently, QFieldCloud keeps all project files in the non-legacy storage even after the user deletes them. This happens because versioning is enabled on Exoscale, and Exoscale lacks a mechanism to remove files a certain amount of time after they are deleted. The result is excessive storage use and therefore higher costs.

  • Add purge_deleted_objects.py for scanning and deleting logically deleted objects in S3-compatible storages
  • Add test.py for testing the script
  • Add a README.MD

A small CLI to analyze and clean logically deleted objects in versioned S3-compatible buckets.

This script is provider-agnostic and works with any S3-compatible service that supports versioning, such as MinIO.

It helps you do the following:

  1. Detect whether a bucket has versioning configured.
  2. Find keys whose latest version is a delete marker (logically deleted).
  3. Report how much storage their historical versions consume.
  4. Delete all versions for those keys to free space.
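
Steps 2 and 3 above can be sketched as a pure function over the pages returned by boto3's `list_object_versions` paginator. This is a minimal illustration, not the code from `purge_deleted_objects.py`: the function name `summarize_logically_deleted` and the shape of the returned dictionary are assumptions made for this example.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable


def summarize_logically_deleted(pages: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, int]]:
    """Hypothetical helper: report version count and bytes for logically deleted keys.

    `pages` follows the response shape of S3 ListObjectVersions: each page may
    contain "Versions" and "DeleteMarkers" lists.
    """
    deleted_keys = set()
    stats = defaultdict(lambda: {"versions": 0, "bytes": 0})

    for page in pages:
        # A key is logically deleted when its latest entry is a delete marker.
        for marker in page.get("DeleteMarkers", []):
            if marker["IsLatest"]:
                deleted_keys.add(marker["Key"])
        # Accumulate sizes for every key; one key's versions may span several pages.
        for version in page.get("Versions", []):
            entry = stats[version["Key"]]
            entry["versions"] += 1
            entry["bytes"] += version["Size"]

    # Keep only the keys that are currently hidden behind a delete marker.
    return {key: dict(stats[key]) for key in sorted(deleted_keys)}


# Usage against a real bucket (requires boto3 and credentials, so left commented out):
# import boto3
# s3 = boto3.client("s3")
# pages = s3.get_paginator("list_object_versions").paginate(Bucket="my-versioned-bucket-name")
# print(summarize_logically_deleted(pages))
```

Keeping the analysis separate from the S3 client makes it possible to test the logic with hand-written pages, without a running storage service.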

Requirements

  • boto3
  • mypy-boto3-s3 (optional, for type hinting)

Usage

1. Scanning for logically deleted objects
python purge_deleted_objects.py my-versioned-bucket-name --retention-period <time-interval> --dry-run
2. Deleting logically deleted objects
python purge_deleted_objects.py my-versioned-bucket-name --retention-period <time-interval>

Options

| Option | Description |
| --- | --- |
| `bucket_name` | The name of the target S3 bucket (required). |
| `--dry-run` | Only scan and report statistics without deleting. |
| `--force` | Skip the confirmation prompt (useful for automation). |
| `--prefix <path>` | Limit the scan to a specific prefix (folder). |
| `--retention-period <time-interval>` | Only process objects deleted BEFORE this duration (e.g., 7 days, 30 minutes, 1 hour). Recent deletions are preserved. |
| `--log-level` | Log level (DEBUG, INFO, WARNING, ERROR, CRITICAL). |
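
The options above could be wired up with `argparse` roughly as follows. This is a sketch under assumptions, not the parser from the merged script: the helper names `parse_interval` and `build_parser`, the accepted units, and the default values are all hypothetical. `--retention-period` is marked required here because a later review in this thread asks for that.

```python
import argparse
import re
from datetime import timedelta

# Assumed unit names; the real script may accept a different set.
_UNITS = {"minute": "minutes", "hour": "hours", "day": "days", "week": "weeks"}


def parse_interval(text: str) -> timedelta:
    """Hypothetical parser for values such as '7 days', '30 minutes', '1 hour'."""
    match = re.fullmatch(r"\s*(\d+)\s*(minute|hour|day|week)s?\s*", text.lower())
    if match is None:
        raise argparse.ArgumentTypeError(f"invalid time interval: {text!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return timedelta(**{_UNITS[unit]: amount})


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="purge_deleted_objects.py")
    parser.add_argument("bucket_name", help="name of the target S3 bucket")
    parser.add_argument("--dry-run", action="store_true", help="scan and report only")
    parser.add_argument("--force", action="store_true", help="skip confirmation prompt")
    parser.add_argument("--prefix", default="", help="limit scan to this prefix")  # default is an assumption
    parser.add_argument("--retention-period", type=parse_interval, required=True,
                        help="only process objects deleted before this duration")
    parser.add_argument("--log-level", default="INFO",  # default is an assumption
                        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"])
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["my-versioned-bucket-name", "--retention-period", "7 days", "--dry-run"]
    )
    print(args.bucket_name, args.retention_period, args.dry_run)
```

Parsing the interval in the `type=` callback means an invalid value is rejected by `argparse` before any bucket is touched.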

@Rakanhf Rakanhf changed the title from "feat: add object storage cleaner script and testing" to "feat: add simple object storage cleaner script and testing" Jan 12, 2026

@suricactus suricactus left a comment

Did a quick round and this is very close to the final result.

The only major thing is that the date anchor is inverted; the rest are style, readability, and maintainability comments.
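
The review does not quote the code in question, so the exact bug is not visible here. One plausible reading, given the documented behavior of `--retention-period` (process only objects deleted BEFORE the given duration, preserve recent deletions), is that the cutoff comparison pointed the wrong way. The sketch below shows the intended direction; the function name `is_past_retention` is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_past_retention(deleted_at: datetime, retention: timedelta,
                      now: Optional[datetime] = None) -> bool:
    """Hypothetical check: True when the deletion is older than the retention period."""
    now = now or datetime.now(timezone.utc)
    # The cutoff is anchored in the past: now minus the retention period.
    cutoff = now - retention
    # Only deletions that happened before the cutoff are eligible for purging.
    # Writing `deleted_at > cutoff` instead would purge recent deletions and
    # keep the old ones, which is the opposite of the documented behavior.
    return deleted_at < cutoff
```

With a 7-day retention period, a delete marker created 10 days ago is eligible, while one created yesterday is preserved.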

@Rakanhf Rakanhf changed the title from "feat: add simple object storage cleaner script and testing" to "feat: add simple object storage cleaner script" Jan 16, 2026

@suricactus suricactus left a comment

Is there a specific reason we use pytest for the tests? Not that I am totally against it, but everywhere else we use the unittest module (or inherit from the Django test framework).

Can you add as part of this PR a CI that runs this test every time we push, so we can guarantee the tests are working?

@suricactus suricactus left a comment

Finally managed to have a look at the tests. They are good; IMO they can become even better, check the comments.

Pretty sure this is the last round of review from my side, we are almost there. Once the review is addressed, take it as approved from my side.

The biggest requested change is to make --retention-period a required parameter. The rest are typing and "don't shoot my foot" suggestions.

Since this is a critical piece of software, please request a review from some of the other maintainers of QFieldCloud.

@suricactus suricactus left a comment

Finally we are here.

Would be great to extensively test it in real world scenarios.

@gounux can you please add an additional review before this gets merged?

@suricactus suricactus requested a review from gounux February 5, 2026 13:51

@gounux gounux left a comment

I tested it on my local MinIO and it seems to work all right.

Please find my review round here, partly focused on doc, usage and some small adjustments.

Thank you for this crafted script @Rakanhf :)

Rakanhf commented Feb 11, 2026

I believe that now V1 of this script is ready @suricactus @gounux

Before I merge I will clean up the comments so it merges cleanly

@suricactus

@gounux would you mind approving this PR and having it merged?

@Rakanhf Rakanhf force-pushed the QF-6039-object-storage-cleanup-script-simple branch from 08e5501 to 681318d Compare February 12, 2026 14:12
…bjects

Currently, QFieldCloud keeps all project files in the non-legacy storage even after the user deletes them.
This happens because versioning is enabled on Exoscale, and Exoscale lacks a mechanism to remove files a certain amount of time after they are deleted.
The result is excessive storage use and therefore higher costs.

- Add `purge_deleted_objects.py` for scanning and deleting logically deleted objects in S3-compatible storages
- Add `test.py` for testing the script
- Add a `README.MD`
- Adjust `test.yml` to run ci tests
@Rakanhf Rakanhf force-pushed the QF-6039-object-storage-cleanup-script-simple branch from 681318d to f262f55 Compare February 13, 2026 09:55
@Rakanhf Rakanhf merged commit 3d4a9cb into master Feb 13, 2026
21 checks passed
@Rakanhf Rakanhf deleted the QF-6039-object-storage-cleanup-script-simple branch February 13, 2026 11:14

5 participants