A modern web application that compares two documents and reports their similarity as a percentage, using NLP techniques such as TF-IDF vectorization and cosine similarity. Supports PDF, DOC, DOCX, and TXT files, with a glassmorphism UI design.
- Upload Two Documents - Support for PDF, DOC, DOCX, and TXT files
- AI-Powered Analysis - Advanced natural language processing using spaCy
- Similarity Scoring - Get accurate percentage similarity between documents
- Text Extraction - Automatic text extraction from various file formats
- Real-time Results - Instant similarity analysis and scoring
- Glassmorphism Design - Beautiful frosted glass effects
- Dark/Light Mode - Automatic theme detection with manual toggle
- Responsive Design - Works perfectly on all devices
- Smooth Animations - Modern transitions and hover effects
- Gradient Cards - Visual differentiation with multiple gradient styles
- User Registration & Login - Secure authentication system
- Profile Management - User dashboard and settings
- Session Management - Secure user sessions
- Django - Python web framework
- spaCy - Natural language processing
- NLTK - Text processing toolkit
- scikit-learn - Machine learning algorithms
- HTML5 - Semantic markup
- CSS3 - Modern styling with custom properties
- JavaScript - Interactive theme management
- Font Awesome - Beautiful icons
- TF-IDF Vectorization - Text similarity analysis
- Cosine Similarity - Document comparison algorithm
- Text Preprocessing - Tokenization and lemmatization
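
The three steps above (preprocessing, TF-IDF weighting, cosine similarity) can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the project lists scikit-learn, spaCy, and NLTK for these steps, while the sketch below uses only the standard library, a tiny hard-coded stopword list, and no lemmatization. The function names `tokenize`, `tfidf_vectors`, `cosine_similarity`, and `similarity_percentage` are hypothetical.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption for this sketch);
# the project lists NLTK and spaCy for real preprocessing.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "on", "for"}


def tokenize(text):
    """Lowercase the text, keep alphabetic runs, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using a smoothed IDF."""
    token_lists = [tokenize(d) for d in docs]
    n_docs = len(token_lists)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * vec_b.get(term, 0.0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_percentage(text_a, text_b):
    """Similarity of two texts as a percentage rounded to two decimals."""
    vec_a, vec_b = tfidf_vectors([text_a, text_b])
    return round(100 * cosine_similarity(vec_a, vec_b), 2)
```

Calling `similarity_percentage(text_a, text_b)` on two extracted document texts gives a score between 0 and 100. Texts with no shared terms score 0, and identical texts score 100.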
- Upload Documents - Select two files to compare (PDF, DOC, DOCX, or TXT)
- AI Analysis - Advanced algorithms process and analyze the text content
- Get Results - Receive similarity percentage and detailed analysis
- Modern UI - Enjoy the beautiful interface with theme switching
- Python 3.8+
- Django 4.0+
- spaCy with English model
- NLTK data packages
- PyPDF2 for PDF processing
- python-docx for Word document processing
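
Before installing, it can help to see which of these packages are already importable. The script below is a small sketch using only the standard library. The `REQUIRED_MODULES` mapping and the `check_environment` function are hypothetical names, and the mapping also includes scikit-learn from the tech stack. It only checks that each module can be found; it does not verify the Django version, the spaCy English model, or the NLTK data packages.

```python
import sys
from importlib.util import find_spec

# Package name -> importable module name (assumed mapping for this sketch).
REQUIRED_MODULES = {
    "Django": "django",
    "spaCy": "spacy",
    "NLTK": "nltk",
    "scikit-learn": "sklearn",
    "PyPDF2": "PyPDF2",
    "python-docx": "docx",
}


def check_environment():
    """Return {requirement: bool} for the Python version and each package."""
    report = {"Python 3.8+": sys.version_info >= (3, 8)}
    for package, module in REQUIRED_MODULES.items():
        # find_spec returns None when a top-level module is not installed.
        report[package] = find_spec(module) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{'OK     ' if ok else 'MISSING'}  {name}")
```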
# Clone the repository
git clone https://github.com/skp3214/document-similarity-checker.git
cd document-similarity-checker
# Create virtual environment
python -m venv venv
venv\Scripts\activate # Windows
# source venv/bin/activate # Linux/Mac
# Install dependencies
pip install -r requirements.txt
# Download NLTK data
python -c "import nltk; nltk.download('punkt'); nltk.download('stopwords'); nltk.download('wordnet')"
# Download spaCy model
python -m spacy download en_core_web_sm
# Navigate to project directory
cd Doc_Scanner_Matcher
# Run migrations
python manage.py migrate
# Create superuser (optional)
python manage.py createsuperuser
# Start development server
python manage.py runserver

Open your browser and go to: http://127.0.0.1:8000/
- Home Page - Landing page with modern design
- Upload Documents - Select two files to compare
- View Results - See similarity percentage and analysis
- Profile - Manage your account and settings
- Theme Toggle - Switch between light and dark modes
- PDF - Portable Document Format
- DOC/DOCX - Microsoft Word documents
- TXT - Plain text files
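
Extracting text from the formats listed above could look roughly like the sketch below. It is an assumption about how such a step might be written, not the project's actual code: `extract_text` is a hypothetical helper that dispatches on the file extension. It uses PyPDF2's `PdfReader` for PDFs and python-docx's `Document` for DOCX files, both imported lazily. Legacy `.doc` files are not handled here because python-docx reads only `.docx`, so they raise `ValueError` in this sketch.

```python
from pathlib import Path


def extract_text(path):
    """Return the plain text of a .txt, .pdf, or .docx file (hypothetical helper)."""
    path = Path(path)
    suffix = path.suffix.lower()

    if suffix == ".txt":
        # Replace undecodable bytes instead of failing on odd encodings.
        return path.read_text(encoding="utf-8", errors="replace")

    if suffix == ".pdf":
        from PyPDF2 import PdfReader  # imported lazily; only needed for PDFs
        reader = PdfReader(str(path))
        # extract_text() may return None for pages with no text layer.
        return "\n".join(page.extract_text() or "" for page in reader.pages)

    if suffix == ".docx":
        import docx  # provided by the python-docx package
        document = docx.Document(str(path))
        return "\n".join(paragraph.text for paragraph in document.paragraphs)

    raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
```

Scanned PDFs without a text layer will yield empty strings here; OCR is outside the scope of this sketch.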
- Text Extraction - Automatic content extraction
- Semantic Analysis - Understanding document meaning
- Similarity Scoring - Percentage-based comparison
- Content Matching - Advanced text comparison algorithms
- Glassmorphism - Frosted glass effects with backdrop blur
- Gradient Cards - Multiple gradient styles for visual appeal
- Smooth Animations - Hover effects and page transitions
- Responsive Layout - Optimized for all screen sizes
- Light Mode - Clean, bright interface
- Dark Mode - Easy on the eyes with modern aesthetics
- Auto Detection - Respects system preferences
- Manual Toggle - One-click theme switching
For questions, feedback, or contributions:
- Email: spsm1818@gmail.com
- GitHub: skp3214
- Use GitHub Issues for bug reports
- Feature requests and UI/UX suggestions are welcome!
- Pull requests for improvements are encouraged
- Modern UI Complete - Glassmorphism design with dark/light mode
- Fully Functional - Document comparison working perfectly
- Mobile Responsive - Optimized for all screen sizes
- Accessible - WCAG compliant design
- Production Ready - Optimized for deployment
- Multi-language Support (i18n)
- Advanced Analytics with comparison history
- Batch Processing for multiple document pairs
- Progressive Web App (PWA) features
- API Endpoints for third-party integrations
Built with ❤️ using Django, modern CSS, and AI-powered NLP


