Migrate from PyPDF2 to pypdf and remove obsolete mobi_to_json test #88
Merged
codeperfectplus merged 3 commits into Py-Contributors:dev on Dec 9, 2025
codeperfectplus
approved these changes
Dec 9, 2025
Member
codeperfectplus
left a comment
Thanks for improving the audiobook.
Contributor
Pull request overview
This PR modernizes the PDF handling library by migrating from the deprecated PyPDF2 to its actively maintained successor pypdf. The migration updates the dependency, refactors all PDF-related code to use the new API, and cleans up an obsolete test case for removed mobi functionality.
Key changes:
- Updated dependency from PyPDF2 3.0.1 to pypdf 4.0.1 with corresponding API migrations (PdfFileReader → PdfReader, method name updates)
- Renamed the `PyPDF2DocParser` class to `PyPDFDocParser` to reflect the new library name
- Removed obsolete mobi_to_json test case that referenced a function no longer in the codebase
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| requirements.txt | Updated PDF library dependency from PyPDF2 3.0.1 to pypdf 4.0.1 |
| audiobook/doc_parser/pdf_parser.py | Migrated to pypdf API: updated imports, class name, and all method calls (PdfFileReader→PdfReader, numPages→len(pages), getPage→pages[], extractText→extract_text, getOutlines→outline) |
| audiobook/utils.py | Updated import statement to use PyPDFDocParser instead of PyPDF2DocParser |
| audiobook/main.py | Updated logger name from "PyPDF2" to "pypdf" to align with new library |
| tests/test_create_json_book.py | Renamed test from test_pdf_to_json_pypdf2 to test_pdf_to_json_pypdf; commented out obsolete mobi_to_json test |
| docs/command_line_usage.rst | Updated documentation to reference pypdf instead of pypdf2 in extraction engine table |
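The renames listed for `audiobook/doc_parser/pdf_parser.py` can be sketched as follows. This is a minimal illustration, not the project's actual `PyPDFDocParser`; the helper names `extract_pages` and `read_pdf` are hypothetical, and the sketch assumes only the pypdf calls named in the table (`PdfReader`, `pages`, `extract_text`).

```python
from typing import Any, Dict, List


def extract_pages(reader: Any) -> Dict[str, Any]:
    """Collect per-page text from a pypdf-style reader (hypothetical helper).

    `reader` only needs a `pages` sequence whose items expose
    `extract_text()`, which is the shape of pypdf's PdfReader.
    """
    texts: List[str] = []
    # PyPDF2 style: for i in range(reader.numPages): page = reader.getPage(i)
    for page in reader.pages:
        # PyPDF2 style: page.extractText()
        texts.append(page.extract_text() or "")
    # PyPDF2 style: reader.numPages
    return {"pages": len(reader.pages), "text": texts}


def read_pdf(path: str) -> Dict[str, Any]:
    """Open a PDF with pypdf and extract its text (hypothetical helper)."""
    # PyPDF2 style: from PyPDF2 import PdfFileReader
    from pypdf import PdfReader  # imported lazily so the helper above stays testable

    return extract_pages(PdfReader(path))
```

Keeping the extraction logic separate from the import means it can be exercised with a stand-in reader object in tests, without needing a real PDF file.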
Comment on lines 35 to 36

    # def test_docs_to_json(self):
    # self.assertEqual(ab.create_json_book("assets/sample.doc"), (output['docs'], {'book_name': 'sample', 'pages': 1}))
This comment appears to contain commented-out code.
Suggested change
    - # def test_docs_to_json(self):
    - # self.assertEqual(ab.create_json_book("assets/sample.doc"), (output['docs'], {'book_name': 'sample', 'pages': 1}))
    + @unittest.skip("DOC to JSON test is currently disabled (e.g., due to missing support or failing test).")
    + def test_docs_to_json(self):
    +     self.assertEqual(ab.create_json_book("assets/sample.doc"), (output['docs'], {'book_name': 'sample', 'pages': 1}))
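For context on the suggestion, here is a small standalone sketch of how `unittest.skip` behaves compared with commenting a test out. The class `ExampleTests` and the helper `run_example` are made up for illustration and are not part of the audiobook test suite.

```python
import unittest


class ExampleTests(unittest.TestCase):
    def test_active(self):
        self.assertEqual(1 + 1, 2)

    @unittest.skip("illustrative: feature currently unsupported")
    def test_disabled(self):
        # The body is never executed while the decorator is present.
        self.fail("should not run")


def run_example() -> unittest.TestResult:
    """Run ExampleTests and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

A skipped test stays visible in the result, together with its reason, whereas a commented-out test disappears from the report entirely.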
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Deepak Raj <54245038+codeperfectplus@users.noreply.github.com>
Pull Request Template
What have you Changed
Updated the code to use the modern PdfReader API.
Issue no.(must) - #87
Self Check(Tick After Making pull Request)
Join Us on Discord:- https://discord.gg/JfbK3bS