PIL Image size limits #168

@RichardScottOZ

This is probably not an issue in the current journal-based workflow.

However, with older documents the following can happen:

 | distributed.utils_perf - WARNING - full garbage collections took 32% CPU time recently (threshold: 10%)
worker_1         | distributed.utils_perf - WARNING - full garbage collections took 32% CPU time recently (threshold: 10%)
worker_1         | distributed.utils_perf - WARNING - full garbage collections took 32% CPU time recently (threshold: 10%)
worker_1         | distributed.utils_perf - WARNING - full garbage collections took 32% CPU time recently (threshold: 10%)
worker_1         | distributed.utils_perf - WARNING - full garbage collections took 32% CPU time recently (threshold: 10%)
worker_1         | distributed.utils_perf - WARNING - full garbage collections took 32% CPU time recently (threshold: 10%)
worker_1         | distributed.utils_perf - WARNING - full garbage collections took 32% CPU time recently (threshold: 10%)
worker_1         | distributed.utils_perf - WARNING - full garbage collections took 33% CPU time recently (threshold: 10%)
worker_1         | ERROR :: 2022-08-12 10:18:02,769 :: Image size (305490136 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
worker_1         | Traceback (most recent call last):
worker_1         |   File "/ingestion/ingest/ingest.py", line 297, in pdf_to_images
worker_1         |     img = Image.open(bytesio).convert('RGB')
worker_1         |   File "/usr/local/lib/python3.8/dist-packages/PIL/Image.py", line 3009, in open
worker_1         |     im = _open_core(fp, filename, prefix, formats)
worker_1         |   File "/usr/local/lib/python3.8/dist-packages/PIL/Image.py", line 2996, in _open_core
worker_1         |     _decompression_bomb_check(im.size)
worker_1         |   File "/usr/local/lib/python3.8/dist-packages/PIL/Image.py", line 2905, in _decompression_bomb_check
worker_1         |     raise DecompressionBombError(
worker_1         | PIL.Image.DecompressionBombError: Image size (305490136 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
worker_1         | ERROR :: 2022-08-12 10:18:02,770 :: Image opening error pdf: Rec1951_067.pdf
worker_1         | distributed.utils_perf - WARNING - full garbage collections took 33% CPU time recently (threshold: 10%)
worker_1         | distributed.utils_perf - WARNING - full garbage collections took 32% CPU time recently (threshold: 10%)
worker_1         | distributed.utils_perf - INFO - full garbage collection released 31.17 MiB from 9565 reference cycles (threshold: 9.54 MiB)
worker_1         | distributed.utils_perf - WARNING - full garbage collections took 33% CPU time recently (threshold: 10%)
worker_1         | distributed.utils_perf - WARNING - full garbage collections took 33% CPU time recently (threshold: 10%)

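Is there any reason not to set PIL's pixel limit (`Image.MAX_IMAGE_PIXELS`) to None, other than nobody having run into this before? The only downside I can see is that RAM could blow up, but on machines of the size used here that seems pretty unlikely, especially since the document sizes are known in advance.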
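For reference, a minimal sketch of what lifting the limit could look like, assuming the change is wanted only around the call that opens page images. The names `max_image_pixels` and `open_rgb` are hypothetical and not part of the ingestion code; `Image.MAX_IMAGE_PIXELS` and `Image.DecompressionBombError` are Pillow's own. Pillow raises the error once an image exceeds twice `MAX_IMAGE_PIXELS`, which is where the 178956970 in the log comes from (2 x 89478485, the default).

```python
import io
from contextlib import contextmanager

from PIL import Image


@contextmanager
def max_image_pixels(limit):
    """Temporarily set Pillow's decompression-bomb limit.

    Hypothetical helper. limit=None disables the check entirely;
    an integer keeps a (larger) safety net in place.
    """
    previous = Image.MAX_IMAGE_PIXELS
    Image.MAX_IMAGE_PIXELS = limit
    try:
        yield
    finally:
        # Restore whatever limit was in effect before.
        Image.MAX_IMAGE_PIXELS = previous


def open_rgb(data, limit=None):
    """Decode raw image bytes to an RGB image under the given pixel limit.

    Hypothetical stand-in for the Image.open(bytesio).convert('RGB')
    call shown in the traceback.
    """
    with max_image_pixels(limit):
        return Image.open(io.BytesIO(data)).convert("RGB")
```

Two caveats on this sketch. `MAX_IMAGE_PIXELS` is a module-level global, so the temporary override is visible to every thread in the worker process while it is active; setting it once at import time (`Image.MAX_IMAGE_PIXELS = None`) is simpler if the limit should be off everywhere. On the RAM question, the 305490136-pixel page from the log decodes to roughly 874 MiB as RGB (3 bytes per pixel), before any copies made by later processing.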
