feat: add simple object storage cleaner script #1461
suricactus
left a comment
Did a quick round and this is very close to the final result.
The only major issue is that the date anchor is inverted; the rest are style, readability, and maintainability comments.
suricactus
left a comment
Is there a specific reason we use pytest for the tests? Not that I am totally against it, but everywhere else we use the unittest module (or classes inherited from the Django test framework).
Can you add, as part of this PR, a CI job that runs these tests every time we push, so we can guarantee the tests are working?
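For reference, a test written in the `unittest` style mentioned above might look roughly like the sketch below. The helper `parse_days` is a hypothetical stand-in invented for this illustration; it is not taken from the PR, and the real tests may be organised differently.

```python
import unittest
from datetime import timedelta


def parse_days(text: str) -> timedelta:
    """Hypothetical code under test: turns '<n> days' into a timedelta."""
    amount, unit = text.split()
    if unit not in ("day", "days"):
        raise ValueError(f"unsupported unit: {unit}")
    return timedelta(days=int(amount))


class ParseDaysTestCase(unittest.TestCase):
    """unittest discovers methods whose names start with 'test'."""

    def test_days(self):
        self.assertEqual(parse_days("7 days"), timedelta(days=7))

    def test_unknown_unit(self):
        # assertRaises replaces pytest.raises in the unittest style
        with self.assertRaises(ValueError):
            parse_days("7 parsecs")
```

Assuming the file is discoverable, such a test case can be run with `python -m unittest`, which is also what a CI step would typically invoke.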
suricactus
left a comment
Finally managed to have a look at the tests. They are good, but IMO they can become even better; check the comments.
Pretty sure this is the last round of review from my side, we are almost there. Once the review is addressed, take it as approved from my side.
The biggest requested change is to make `--retention-period` a required parameter. The rest are typing and "don't shoot my foot" suggestions.
Since this is a critical piece of software, please request a review from some of the other maintainers of QFieldCloud.
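As an illustration of the "required parameter" request, here is a minimal `argparse` sketch. The option names are the ones listed in the README; the defaults for `--prefix` and `--log-level`, the description text, and the `build_parser` helper are assumptions made for this example, not the script's actual code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the CLI parser (hypothetical helper; option names follow the README)."""
    parser = argparse.ArgumentParser(
        description="Purge logically deleted objects from a versioned bucket."
    )
    parser.add_argument("bucket_name")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--force", action="store_true")
    parser.add_argument("--prefix", default="")  # assumed default
    # required=True makes argparse exit with an error when the flag is missing,
    # so no implicit retention period can ever be applied by accident.
    parser.add_argument("--retention-period", required=True)
    parser.add_argument(
        "--log-level",
        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
        default="INFO",  # assumed default
    )
    return parser


args = build_parser().parse_args(["my-bucket", "--retention-period", "7 days"])
print(args.retention_period)
```

With this shape, calling the script without `--retention-period` stops at argument parsing, before any bucket is touched.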
suricactus
left a comment
Finally we are here.
Would be great to extensively test it in real-world scenarios.
@gounux can you please add an additional review before this gets merged?
I believe that V1 of this script is now ready, @suricactus @gounux. Before I merge, I will clean up the comments so it merges cleanly.
@gounux would you mind approving this PR and having it merged?
08e5501 to 681318d
…bjects
Currently QFieldCloud keeps all project files that use the non-legacy storage, even after they are deleted by the user. This is because versioning is enabled on Exoscale and Exoscale lacks a mechanism to delete files a certain amount of time after they are deleted. This causes excessive file storage and therefore costs.
- Add `purge_deleted_objects.py` for scanning and deleting logically deleted objects in S3-compatible storage
- Add `test.py` for testing the script
- Add a `README.MD`
- Adjust `test.yml` to run CI tests
681318d to f262f55
Currently QFieldCloud keeps all project files that use the non-legacy storage, even after they are deleted by the user. This is because versioning is enabled on Exoscale and Exoscale lacks a mechanism to delete files a certain amount of time after they are deleted. This causes excessive file storage and therefore costs.
- Add `purge_deleted_objects.py` for scanning and deleting logically deleted objects in S3-compatible storage
- Add `test.py` for testing the script
- Add a `README.MD`
A small CLI to analyze and clean logically deleted objects in versioned S3-compatible buckets.
This script is provider-agnostic and works with any S3-compatible service that supports versioning, such as MinIO, etc.
It helps you do the following:
Requirements
Usage
1. Scanning for logically deleted objects
2. Deleting logically deleted objects
Options
- `bucket_name`
- `--dry-run`
- `--force`
- `--prefix <path>`
- `--retention-period <time-interval>` (e.g. `7 days`, `30 minutes`, `1 hour`). Recent deletions are preserved.
- `--log-level` (`DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`).
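To make the retention rule concrete, the sketch below shows one way an interval such as `7 days` could be turned into a cutoff, and how delete markers older than that cutoff could be selected. The dictionaries mirror the `Key`, `IsLatest` and `LastModified` fields of delete-marker entries in an S3 version listing; the functions `parse_retention` and `purgeable_keys`, the supported units, and the sample keys are assumptions for illustration and are not taken from `purge_deleted_objects.py`.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed set of units; the real script may accept more or fewer.
_UNITS = {"minute": "minutes", "hour": "hours", "day": "days", "week": "weeks"}


def parse_retention(text: str) -> timedelta:
    """Parse intervals such as '7 days', '30 minutes' or '1 hour'."""
    match = re.fullmatch(r"\s*(\d+)\s*([a-zA-Z]+?)s?\s*", text)
    if match is None or match.group(2).lower() not in _UNITS:
        raise ValueError(f"unsupported time interval: {text!r}")
    return timedelta(**{_UNITS[match.group(2).lower()]: int(match.group(1))})


def purgeable_keys(delete_markers, now: datetime, retention: timedelta) -> list:
    """Return keys whose latest version is a delete marker older than the cutoff."""
    # The cutoff lies in the past: anything deleted after it is preserved.
    cutoff = now - retention
    return sorted(
        marker["Key"]
        for marker in delete_markers
        if marker["IsLatest"] and marker["LastModified"] <= cutoff
    )


now = datetime(2025, 1, 10, tzinfo=timezone.utc)
markers = [
    {"Key": "old.gpkg", "IsLatest": True, "LastModified": now - timedelta(days=30)},
    {"Key": "recent.gpkg", "IsLatest": True, "LastModified": now - timedelta(hours=2)},
    {"Key": "restored.gpkg", "IsLatest": False, "LastModified": now - timedelta(days=30)},
]
print(purgeable_keys(markers, now, parse_retention("7 days")))  # ['old.gpkg']
```

Only `old.gpkg` qualifies here: `recent.gpkg` was deleted inside the retention window, and the delete marker of `restored.gpkg` is no longer the latest version, so that object is not logically deleted.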