Description
Problem
I received an encrypted PDF invoice that opens fine in Preview, Adobe Reader, and Chrome (no password prompt - it uses an empty password). However, pypdf fails to decrypt it:
from pypdf import PdfReader

reader = PdfReader("invoice.pdf")
reader.decrypt("")  # Returns 0 (NOT_DECRYPTED)
reader.pages[0]  # Raises FileNotDecryptedError
Investigation
Using qpdf --show-encryption, I confirmed the PDF uses AESV2 with an empty user password. Looking at the encrypt dict, I found the issue:
Main Encrypt Dict:
/V: 4
/R: 4
# /Length is missing!
Crypt Filter:
/CFM: /AESV2
/Length: 16 (bytes, = 128 bits)
The PDF doesn't specify /Length in the main encrypt dict. pypdf defaults to 40 bits, but AESV2 requires 128 bits.
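For anyone who wants to reproduce the dump above, here is a minimal sketch of how the relevant entries could be listed. `summarize_encryption` is a hypothetical helper, not part of pypdf, and the filter name `/StdCF` in the sample is an assumption, since the dump does not show the filter's name.

```python
from typing import Any, Mapping


def summarize_encryption(encrypt: Mapping[str, Any]) -> list[str]:
    """List the key-length related entries of an encrypt dict (hypothetical helper)."""
    lines = []
    # Entries of the main encrypt dict; /Length is the one missing in this report.
    for key in ("/V", "/R", "/Length"):
        value = encrypt[key] if key in encrypt else "(missing)"
        lines.append(f"{key}: {value}")
    # Entries of every crypt filter listed under /CF.
    filters = encrypt["/CF"] if "/CF" in encrypt else {}
    for name in filters:
        crypt_filter = filters[name]
        for key in ("/CFM", "/Length"):
            value = crypt_filter[key] if key in crypt_filter else "(missing)"
            lines.append(f"{name} {key}: {value}")
    return lines


# Plain-dict stand-in for the encrypt dict described above ("/StdCF" is assumed).
sample = {
    "/V": 4,
    "/R": 4,
    "/CF": {"/StdCF": {"/CFM": "/AESV2", "/Length": 16}},
}
print("\n".join(summarize_encryption(sample)))
```

With a real file, the dict to pass in would presumably be `reader.trailer["/Encrypt"]`; I have only run the helper against the plain dict shown here.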
Problematic code
key_bits = encryption_entry.get("/Length", 40)
This reads the key length from the main encrypt dict, defaulting to 40 bits. For AESV2, the correct 128-bit value is only available in the Crypt Filter dict.
Proposed fix
When using AESV2 encryption (V=4), read the key length from the Crypt Filter dict instead of the main encrypt dict. The CF /Length is specified in bytes (default 16 for AES-128), so convert to bits by multiplying by 8.
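A rough sketch of the proposed lookup, written against plain dicts so it can run on its own. `resolve_key_bits` is a hypothetical name and not pypdf's actual code. Selecting the filter through `/StmF` and the name `/StdCF` are assumptions on my part, as is treating the CF `/Length` as bytes (a file that stores it in bits is not handled here).

```python
from typing import Any, Mapping


def resolve_key_bits(encrypt: Mapping[str, Any]) -> int:
    """Return the key length in bits for an encrypt dict (hypothetical helper)."""
    # Current behaviour: main-dict /Length, defaulting to 40 bits.
    default_bits = int(encrypt["/Length"]) if "/Length" in encrypt else 40
    if "/V" not in encrypt or encrypt["/V"] != 4:
        return default_bits

    # Assumption: the stream filter named by /StmF selects the crypt filter.
    filter_name = encrypt["/StmF"] if "/StmF" in encrypt else "/Identity"
    filters = encrypt["/CF"] if "/CF" in encrypt else {}
    if filter_name not in filters:
        return default_bits

    crypt_filter = filters[filter_name]
    if "/CFM" not in crypt_filter or crypt_filter["/CFM"] != "/AESV2":
        return default_bits

    # CF /Length is taken as bytes (default 16 for AES-128), so convert to bits.
    length_bytes = int(crypt_filter["/Length"]) if "/Length" in crypt_filter else 16
    return length_bytes * 8


# Stand-in for the encrypt dict from this report: no /Length in the main dict.
issue_dict = {
    "/V": 4,
    "/R": 4,
    "/StmF": "/StdCF",
    "/CF": {"/StdCF": {"/CFM": "/AESV2", "/Length": 16}},
}
print(resolve_key_bits(issue_dict))  # 128
```

Dicts that do not use V=4 with AESV2 fall through to the existing 40-bit default, so older RC4-encrypted files should behave as before.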
Environment
- pypdf: 6.6.2
- Python: 3.11
- OS: macOS