A complete end-to-end machine learning project that performs sentiment analysis on Amazon Alexa product reviews using NLP and classification algorithms. Built with Python, this project features an interactive Flask web application that enables both single-text predictions and bulk CSV analysis of customer reviews. Watch project demo: 👇

vineet416/Amazon_Reviews_Sentiment_Analysis

Amazon Alexa Reviews Sentiment Analysis 🎤📊

A comprehensive machine learning project that performs sentiment analysis on Amazon Alexa product reviews. This project includes exploratory data analysis, multiple classification models, and a Flask web application for real-time predictions.

🎯 Overview

This project analyzes customer sentiment from Amazon Alexa product reviews using Natural Language Processing (NLP) and Machine Learning techniques. The system can classify reviews as either Positive or Negative, helping businesses understand customer feedback at scale.

The project includes:

  • πŸ“Š Comprehensive data exploration and visualization
  • 🧹 Text preprocessing with lemmatization and stopword removal
  • πŸ€– Multiple ML models (Naive Bayes, Random Forest, XGBoost)
  • 🌐 Interactive Flask web application
  • πŸ“ Batch prediction support via CSV upload
  • πŸ“ˆ Visual analytics and sentiment distribution

✨ Features

  • Single Text Prediction: Analyze sentiment of individual review texts
  • Bulk Prediction: Upload CSV files for batch sentiment analysis
  • Visual Analytics: Automatic pie chart generation showing sentiment distribution
  • Pre-trained Models: Ready-to-use XGBoost classifier with TF-IDF vectorization
  • REST API: JSON-based API for integration with other applications
  • Text Preprocessing: Advanced NLP pipeline with lemmatization and POS tagging
  • Responsive UI: Clean, user-friendly web interface

📁 Project Structure

Amazon Alexa Reviews Sentiment Analysis/
│
├── app.py                              # Flask application
├── requirements.txt                    # Python dependencies
├── README.md                           # Project documentation
│
├── Data/
│   ├── amazon_alexa.tsv               # Original dataset
│   ├── predictions.csv                # Sample predictions output
│   └── SentimentBulk.csv              # Bulk prediction sample
│
├── Models/
│   ├── xgboost_model.pkl              # Trained XGBoost classifier
│   └── tfidfVectorizer.pkl            # Fitted TF-IDF vectorizer
│
├── templates/
│   ├── landing.html                   # Landing page
│   └── index.html                     # Prediction interface
│
├── Data Exploration & Modeling.ipynb  # Complete EDA and modeling notebook
│
└── alexa/                             # Python virtual environment

📊 Dataset

The dataset contains Amazon Alexa product reviews with the following characteristics:

  • Source: Amazon customer reviews for Alexa devices - (Kaggle Dataset)
  • Format: Tab-separated values (TSV)
  • Features:
    • rating: Product rating (1-5 stars)
    • date: Review date
    • variation: Alexa device variant
    • verified_reviews: Customer review text
    • feedback: Sentiment label (1 = Positive, 0 = Negative)

🚀 Installation

Prerequisites

  • Python 3.12 or higher
  • pip package manager

Setup Steps

  1. Clone the repository
     git clone https://github.com/vineet416/Amazon_Reviews_Sentiment_Analysis.git
     cd Amazon_Reviews_Sentiment_Analysis
  2. Create a virtual environment (optional but recommended)
     conda create -p alexa python=3.12 -y
     conda activate alexa/
  3. Install dependencies
     pip install -r requirements.txt
  4. Download NLTK data
     python -c "import nltk; nltk.download('stopwords'); nltk.download('wordnet'); nltk.download('averaged_perceptron_tagger'); nltk.download('punkt')"

💻 Usage

Running the Web Application

  1. Start the Flask server
     python app.py
  2. Access the application
    • Open your browser and navigate to http://localhost:5000
    • You'll see the landing page with options to start predictions

Single Text Prediction

  1. Navigate to the prediction page
  2. Enter your review text in the input field
  3. Click "Predict Sentiment"
  4. View the result (Positive/Negative)

Bulk Prediction (CSV Upload)

  1. Prepare a CSV file with a column named Sentence containing reviews
  2. Upload the file through the web interface
  3. Download the predictions CSV file
  4. View the sentiment distribution pie chart
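
A bulk input file can be prepared with the standard `csv` module. The sketch below writes reviews under the required `Sentence` header; the helper name `write_bulk_csv` and the sample reviews are illustrative assumptions, and the app may accept additional columns that are not documented here.

```python
import csv
import io


def write_bulk_csv(reviews, out_file):
    """Write reviews to a CSV with the single 'Sentence' column described above."""
    writer = csv.writer(out_file)
    writer.writerow(["Sentence"])
    for review in reviews:
        # csv.writer quotes fields containing commas or quotes automatically.
        writer.writerow([review])


# Invented example reviews (not taken from the dataset).
reviews = [
    "I love my new Alexa device! It works perfectly.",
    'Sound is "okay", but setup was painful.',
]

buffer = io.StringIO()
write_bulk_csv(reviews, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, pass a file opened with `open(path, "w", newline="", encoding="utf-8")` in place of the in-memory buffer.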

API Usage

Endpoint: /predict

Method: POST

Request Body (JSON):

{
  "text": "I love my new Alexa device! It works perfectly."
}

Response:

{
  "result": "Positive"
}
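
A client for this endpoint might look like the following sketch, which uses only the standard library. It assumes the server runs at http://localhost:5000 and accepts a JSON body with a `Content-Type: application/json` header; the README does not state the expected headers, so treat that as an assumption. The function names are hypothetical.

```python
import json
import urllib.request


def build_predict_request(text, base_url="http://localhost:5000"):
    """Build a POST request for /predict carrying the review text as JSON."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


def predict_sentiment(text, base_url="http://localhost:5000"):
    """Send the request and return the 'result' field of the JSON response.

    Requires the Flask server to be running at base_url.
    """
    request = build_predict_request(text, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))["result"]


# Building the request does not touch the network, so it is safe to inspect.
request = build_predict_request("I love my new Alexa device! It works perfectly.")
print(request.get_method(), request.full_url)
```

With the server running, `predict_sentiment("...")` would return the string from the `result` field, such as `"Positive"`.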

📈 Model Performance

The project evaluates multiple machine learning algorithms:

| Model | Accuracy | Description |
| --- | --- | --- |
| XGBoost | ~94-96% | Best performing model (deployed) |
| Random Forest | ~93-95% | Strong ensemble method |
| SVM | ~91-93% | Support Vector Machine with RBF kernel |
| Logistic Regression | ~89-91% | Baseline linear model |

Key Metrics:

  • Precision: High precision for both classes
  • Recall: Balanced recall scores
  • F1-Score: Strong F1-scores indicating good overall performance

🛠️ Technologies Used

Core Technologies

  • Python 3.12: Primary programming language
  • Flask: Web framework for the application
  • scikit-learn: Machine learning library
  • XGBoost: Gradient boosting framework
  • NLTK: Natural Language Processing

Data Science Stack

  • Pandas: Data manipulation and analysis
  • NumPy: Numerical computing
  • Matplotlib: Data visualization
  • Seaborn: Statistical data visualization
  • WordCloud: Text visualization

NLP Preprocessing

  • TF-IDF Vectorization: Text feature extraction
  • Lemmatization: Word normalization
  • POS Tagging: Part-of-speech identification
  • Stopwords Removal: Noise reduction

🔌 API Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| / | GET | Landing page |
| /predict | GET | Prediction interface page |
| /predict | POST | Predict sentiment (text or file) |

📸 Screenshots

Landing Page

The home page provides an introduction to the sentiment analysis tool.
[Screenshot: Landing Page]

Prediction Interface

Users can input text or upload CSV files for sentiment analysis.
[Screenshot: Prediction Interface]

Results Display

Shows prediction results with visual analytics for bulk predictions.
[Screenshot: Results Display - Negative]
[Screenshot: Results Display - Positive & Batch Analysis]

📝 Key Insights

From the exploratory data analysis:

  1. Class Distribution: Most reviews are positive, reflecting high customer satisfaction
  2. Text Length: Review length correlates with sentiment intensity
  3. Rating Correlation: Strong correlation between star ratings and sentiment
  4. Common Words: Positive reviews mention "love", "great", "easy"; negative reviews mention "not", "work", "poor"
  5. Device Variations: Certain Alexa models receive more positive feedback

🀝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is open source and available under the MIT License.

👨‍💻 Author

Vineet Patel

🙏 Acknowledgments

  • Amazon for the Alexa reviews dataset
  • scikit-learn and XGBoost communities
  • Flask framework developers
  • NLTK contributors

📧 Contact

For questions or feedback, please open an issue in the repository.


⭐ If you found this project helpful, please give it a star!
