Sentiment Analysis using KNN

Introduction

This project classifies product reviews as positive or negative based on the sentiment expressed in the review text. Reviews are converted into numerical vectors with several text vectorization techniques and then classified with the K-Nearest Neighbors (KNN) algorithm.

Approach

The project follows these main steps:

Data Collection: Gather product reviews from online shopping websites. Each review consists of text and a corresponding rating indicating user sentiment.
Data Preprocessing:
- Remove HTML tags and punctuation from the review text.
- Filter out unnecessary elements and clean the text data.
- Validate the helpfulness ratio: the numerator must be less than or equal to the denominator.
- Deduplicate the data based on user ID, profile name, time, and text.
Text Vectorization:
- Use Bag of Words (BOW) and TF-IDF to convert text data into numerical vectors.
- Utilize Word2Vec and TF-IDF Weighted Word2Vec for embedding-based vectorization.
Classification:
- Apply the K-Nearest Neighbors (KNN) algorithm for classification.
- Evaluate the performance of the classification model using accuracy, precision, recall, and F1-score metrics.
Model Evaluation:
- Analyze the confusion matrix to understand the model's performance.
- Compute precision, recall, and F1-score for both positive and negative classes.
Results Analysis:
- Interpret the model's accuracy and performance metrics.
- Discuss strengths and weaknesses of the classification model.
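
The preprocessing step described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the notebook's actual code: the field names (`UserId`, `ProfileName`, `Time`, `Text`, `HelpfulnessNumerator`, `HelpfulnessDenominator`) are assumed to follow the Amazon Fine Food Reviews layout, the helper names are hypothetical, and dropping rows with an invalid helpfulness ratio is one possible way to enforce the check.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")             # HTML tags such as <br />
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")   # punctuation and other symbols


def clean_text(text):
    """Strip HTML tags and punctuation, lowercase, and collapse whitespace."""
    text = TAG_RE.sub(" ", text)
    text = html.unescape(text)              # decode entities such as &#39;
    text = NON_ALNUM_RE.sub(" ", text.lower())
    return " ".join(text.split())


def has_valid_helpfulness(row):
    """The helpfulness numerator must not exceed the denominator."""
    return row["HelpfulnessNumerator"] <= row["HelpfulnessDenominator"]


def preprocess(rows):
    """Drop invalid and duplicate reviews, then attach a cleaned text field."""
    seen = set()
    cleaned = []
    for row in rows:
        if not has_valid_helpfulness(row):
            continue
        # Duplicates are identified by user ID, profile name, time, and text.
        key = (row["UserId"], row["ProfileName"], row["Time"], row["Text"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({**row, "CleanText": clean_text(row["Text"])})
    return cleaned


# Made-up sample rows, for illustration only.
sample_rows = [
    {"UserId": "u1", "ProfileName": "Ann", "Time": 100,
     "Text": "Great <br />taste!!!",
     "HelpfulnessNumerator": 1, "HelpfulnessDenominator": 2},
    {"UserId": "u1", "ProfileName": "Ann", "Time": 100,
     "Text": "Great <br />taste!!!",
     "HelpfulnessNumerator": 1, "HelpfulnessDenominator": 2},   # duplicate
    {"UserId": "u2", "ProfileName": "Bob", "Time": 200,
     "Text": "Stale and bland.",
     "HelpfulnessNumerator": 3, "HelpfulnessDenominator": 1},   # invalid ratio
]

cleaned_rows = preprocess(sample_rows)
print([row["CleanText"] for row in cleaned_rows])  # ['great taste']
```

With a pandas DataFrame, the same checks would typically be a boolean filter followed by `drop_duplicates` on the four key columns.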
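
To make the evaluation step concrete, accuracy and the per-class precision, recall, and F1-score can all be derived from the four counts of a binary confusion matrix. The sketch below uses hypothetical function names and made-up counts; they are not results from this project.

```python
def class_metrics(tp, fp, fn):
    """Precision, recall, and F1 for one class, given its own TP/FP/FN counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def evaluate(tn, fp, fn, tp):
    """Summarize a binary confusion matrix where 'positive' is the positive class."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,
        "positive": class_metrics(tp=tp, fp=fp, fn=fn),
        # For the negative class the roles swap: true negatives count as its
        # hits, and each error type trades places.
        "negative": class_metrics(tp=tn, fp=fn, fn=fp),
    }


# Made-up counts for illustration: 150 positive and 50 negative reviews.
report = evaluate(tn=30, fp=20, fn=10, tp=140)
for name, value in report.items():
    print(name, value)
```

With these illustrative counts the accuracy is 0.85, yet recall on the negative class is only 0.6, which is why the per-class metrics are worth reading alongside accuracy when the classes are imbalanced.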

Installation

Install Python (version 3.x).
Install the required libraries by running the following command in the terminal or command prompt:
```
pip install -r requirements.txt
```

Usage

Run the Amazon FineFood Sentiment KNN.ipynb Jupyter notebook to see the complete workflow.
Customize the code for your specific use case or dataset.
Experiment with different text vectorization techniques and machine learning algorithms.
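
As a starting point for such experiments, the sketch below swaps between Bag of Words and TF-IDF in front of a KNN classifier using scikit-learn. It is a toy example, not the notebook's code: the review texts and labels are made up, and `n_neighbors=3` with cosine distance is an arbitrary choice for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Made-up, already-cleaned reviews: 1 = positive, 0 = negative.
train_texts = [
    "great taste love it",
    "delicious and fresh love it",
    "great product highly recommend",
    "terrible taste awful",
    "stale and awful waste of money",
    "terrible product do not recommend",
]
train_labels = [1, 1, 1, 0, 0, 0]
test_texts = ["love it great taste", "awful stale terrible"]

vectorizers = {
    "bow": CountVectorizer(),
    "tfidf": TfidfVectorizer(),
}

predictions = {}
for name, vectorizer in vectorizers.items():
    # Brute-force search is used because the vectorizers produce sparse matrices.
    knn = KNeighborsClassifier(n_neighbors=3, metric="cosine", algorithm="brute")
    model = make_pipeline(vectorizer, knn)
    model.fit(train_texts, train_labels)
    predictions[name] = model.predict(test_texts).tolist()

print(predictions)
```

Because each vectorizer sits inside its own pipeline, trying another representation or another classifier only requires changing one entry, and the vocabulary is always fitted on the training texts alone.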

Project Structure

Amazon FineFood Sentiment KNN.ipynb: Jupyter notebook containing source code and detailed explanations.
data/: Directory containing input data.
README.md: Project overview and usage guide.

Technology Used

Programming Language: Python
Main Libraries: pandas, numpy, scikit-learn, nltk, gensim, matplotlib, seaborn

Author

BewxSevez

License

This project is released under the MIT License.
