A feature-rich desktop application built with Python and Tkinter for advanced PDF manipulation, including OCR, table/image extraction, encryption, and more.
- Extract Text: Extract plain text from PDFs
- OCR Text Extraction: Use Tesseract OCR to extract text from scanned PDFs
- Split PDFs: Extract specified page ranges into separate PDFs
- Merge PDFs: Combine multiple PDF files into one
- Images to PDF: Convert images into a single PDF
- Extract Images: Pull embedded images from PDFs and save them
- Extract Tables: Extract tables with pdfplumber, save as CSV/PDF
- Encrypt/Decrypt PDFs: Secure PDFs with passwords
- User-Friendly Interface: Intuitive GUI with sidebar controls and status display
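As an illustration of the Split PDFs feature, the sketch below shows one way a page-range string could be turned into page indices. It is a minimal example, not code taken from App.py: the helper name `parse_page_ranges` and the `"1-3,5"` input syntax are assumptions made for this illustration.

```python
def parse_page_ranges(spec: str, page_count: int) -> list[int]:
    """Turn a string such as "1-3,5" into zero-based page indices.

    Hypothetical helper: the input syntax is an assumption, not App.py's own.
    """
    pages: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        start_text, sep, end_text = part.partition("-")
        start = int(start_text)
        end = int(end_text) if sep else start  # "5" means the single page 5
        if start < 1 or end > page_count or start > end:
            raise ValueError(f"invalid page range: {part!r}")
        # Users count pages from 1; PDF libraries index pages from 0.
        pages.extend(range(start - 1, end))
    return pages
```

The resulting indices could then be used to copy pages from a reader object into a writer object, for example with `PdfReader.pages` and `PdfWriter.add_page` in recent PyPDF2 releases.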
- Python 3.11+
- Tkinter for GUI
- PyPDF2, pdfplumber, pytesseract
- OpenCV, NumPy, pandas
- Pillow, reportlab, pdf2image, PyMuPDF
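A quick way to confirm that the interpreter and libraries listed above are available is a small check script. This is a hedged sketch rather than part of the project: the function name `find_missing` and the `REQUIRED_MODULES` table are made up for this example, and the import names (such as `cv2` for OpenCV, `PIL` for Pillow, and `fitz` for PyMuPDF) are the ones these packages commonly install under.

```python
import importlib.util
import sys

# Import names for the libraries listed above; several differ from the package name.
REQUIRED_MODULES = {
    "Tkinter": "tkinter",
    "PyPDF2": "PyPDF2",
    "pdfplumber": "pdfplumber",
    "pytesseract": "pytesseract",
    "OpenCV": "cv2",
    "NumPy": "numpy",
    "pandas": "pandas",
    "Pillow": "PIL",
    "reportlab": "reportlab",
    "pdf2image": "pdf2image",
    "PyMuPDF": "fitz",
}


def find_missing(modules: dict[str, str]) -> list[str]:
    """Return the display names whose import name cannot be located."""
    return [
        name
        for name, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    if sys.version_info < (3, 11):
        print("Python 3.11 or newer is required")
    missing = find_missing(REQUIRED_MODULES)
    print("Missing:", ", ".join(missing) if missing else "none")
```

Using `importlib.util.find_spec` only looks the modules up without importing them, so the check stays fast even for heavy packages.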
- Clone the repository (or download the App.py file):
    git clone <repository_url>
    cd <repository_directory>
- Install Python dependencies:
    pip install PyPDF2 pdfplumber pandas opencv-python numpy tabulate pdf2image reportlab Pillow PyMuPDF pytesseract
- Install Tesseract OCR Engine:
  - Windows: Download the installer from Tesseract-OCR GitHub. During installation, note the installation path (e.g., C:\Program Files\Tesseract-OCR).
  - macOS:
      brew install tesseract
  - Linux (Debian/Ubuntu):
      sudo apt-get install tesseract-ocr
- Configure Tesseract Path in the Script: Open App.py and update the pytesseract.pytesseract.tesseract_cmd variable to point to your Tesseract executable. For example, on Windows:
    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
  On Linux/macOS, it might be:
    pytesseract.pytesseract.tesseract_cmd = r'/usr/local/bin/tesseract'  # Or wherever tesseract is installed
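Instead of hard-coding one path per platform, the executable could be located at startup. The following is a minimal sketch under stated assumptions, not the approach App.py takes: `resolve_tesseract_cmd` and `DEFAULT_CANDIDATES` are hypothetical names, and the fallback paths are simply the two examples given above.

```python
import os
import shutil

# Fallback locations taken from the examples above; adjust for your system.
DEFAULT_CANDIDATES = (
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    "/usr/local/bin/tesseract",
)


def resolve_tesseract_cmd(candidates=DEFAULT_CANDIDATES,
                          which=shutil.which,
                          is_file=os.path.isfile):
    """Return a path to the Tesseract executable, or None if none is found."""
    found = which("tesseract")  # searches the directories on PATH first
    if found:
        return found
    for path in candidates:
        if is_file(path):
            return path
    return None
```

The result could then be assigned to `pytesseract.pytesseract.tesseract_cmd`, reporting an error in the status display when the function returns `None`. The `which` and `is_file` parameters exist only so the lookup can be exercised without a real installation.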
To run the application, simply execute the Python script:
    python App.py
📄 This project is licensed under the MIT License.
✅ You are free to:
- Use
- Modify
- Share (with attribution)
Made with 💙 by OMI-KALIX
For collaboration or deployment inquiries, contact via GitHub!