A Content-Based Movie Recommender System built using Machine Learning, NLP, and Streamlit.
This project recommends movies based on the similarity of their genres, keywords, cast, and crew, using a Bag of Words model and Cosine Similarity.
- Select a movie from the dropdown
- Click Show Recommendation
- Get top 5 similar movies with posters
- Converted JSON-like strings using `ast.literal_eval`
- Extracted genres, keywords, cast, and crew
- Selected top 3 actors
- Removed spaces from multi-word names
- Combined features into a single `tags` column
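
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual code: the sample row and its names are made up, the helper names (`names`, `directors`, `squash`, `build_tags`) are hypothetical, and keeping only crew members whose job is `Director` and lowercasing the result are assumptions, since the steps above do not say how crew entries are filtered.

```python
import ast


def names(raw, limit=None):
    """Parse a JSON-like string and return the 'name' of each entry."""
    items = ast.literal_eval(raw)
    picked = [item["name"] for item in items]
    return picked if limit is None else picked[:limit]


def directors(raw):
    """Assumption: keep only crew members whose job is 'Director'."""
    return [m["name"] for m in ast.literal_eval(raw) if m.get("job") == "Director"]


def squash(words):
    """Remove spaces so a multi-word name becomes a single token."""
    return [w.replace(" ", "") for w in words]


def build_tags(row):
    """Combine all features into one lowercase, space-separated string."""
    parts = (
        squash(names(row["genres"]))
        + squash(names(row["keywords"]))
        + squash(names(row["cast"], limit=3))  # top 3 actors only
        + squash(directors(row["crew"]))
    )
    return " ".join(parts).lower()


# Hypothetical row in the JSON-like string format described above.
row = {
    "genres": '[{"id": 1, "name": "Action"}, {"id": 2, "name": "Science Fiction"}]',
    "keywords": '[{"id": 10, "name": "space war"}]',
    "cast": '[{"name": "Alice Example"}, {"name": "Bob Sample"}, '
            '{"name": "Carol Demo"}, {"name": "Dave Extra"}]',
    "crew": '[{"name": "Dana Director", "job": "Director"}, '
            '{"name": "Eve Editor", "job": "Editor"}]',
}

tags = build_tags(row)
print(tags)
```

Removing spaces keeps a name such as "Science Fiction" from being split into two unrelated tokens when the tags are vectorized later.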
- Used `CountVectorizer`
- Removed English stopwords
- Limited features to the 5000 most frequent words
- Applied stemming using `PorterStemmer`
- Generated vectors for each movie
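
As a rough sketch of the vectorization steps above, the snippet below stems a few made-up tag strings and turns them into count vectors. The sample tags and the `stem_text` helper are hypothetical, and applying stemming before vectorizing is an assumption about the order of the steps.

```python
from nltk.stem.porter import PorterStemmer
from sklearn.feature_extraction.text import CountVectorizer

stemmer = PorterStemmer()


def stem_text(text):
    """Reduce each word to its stem so 'loving' and 'loved' share a token."""
    return " ".join(stemmer.stem(word) for word in text.split())


# Hypothetical tag strings, one per movie.
tags = [
    "action adventure space loving hero",
    "action space loved robots",
    "romance comedy wedding",
]
stemmed = [stem_text(t) for t in tags]

# Keep at most 5000 of the most frequent words and drop English stopwords.
vectorizer = CountVectorizer(max_features=5000, stop_words="english")
vectors = vectorizer.fit_transform(stemmed).toarray()

print(vectors.shape)  # (number of movies, vocabulary size)
```

Each row of `vectors` is the Bag of Words representation of one movie: a count of how often each vocabulary word appears in its tags.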
- Computed Cosine Similarity
- Built a similarity matrix to compare movies
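
The similarity step can be illustrated with a small sketch like the one below. The titles and count vectors are made up for the example, and the `recommend` helper is a hypothetical name showing one way to read the top matches out of the matrix, not necessarily how `app.py` does it.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical titles and their Bag of Words count vectors.
titles = ["Movie A", "Movie B", "Movie C", "Movie D"]
vectors = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
])

# Square matrix: similarity[i][j] compares movie i with movie j.
similarity = cosine_similarity(vectors)


def recommend(title, top_n=5):
    """Return up to top_n titles most similar to the given one."""
    idx = titles.index(title)
    ranked = sorted(enumerate(similarity[idx]), key=lambda pair: pair[1], reverse=True)
    # Skip the movie itself, which always has similarity 1 with itself.
    return [titles[i] for i, _ in ranked if i != idx][:top_n]


print(recommend("Movie A", top_n=2))
```

Because the matrix is computed once up front, each recommendation is only a row lookup and a sort.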
- Saved processed data using `pickle`
- Built interactive UI using Streamlit
- Integrated TMDB API to fetch movie posters
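
A minimal sketch of the persistence and poster lookup steps is shown below, using only the standard library. The helper names (`save_artifact`, `load_artifact`, `poster_url`, `fetch_poster`) are hypothetical, and the `w500` image size is an assumption; the TMDB movie-details endpoint and its `poster_path` field are used as documented by TMDB.

```python
import json
import pickle
import urllib.request

TMDB_MOVIE_URL = "https://api.themoviedb.org/3/movie/{movie_id}?api_key={api_key}"
TMDB_IMAGE_BASE = "https://image.tmdb.org/t/p/w500"  # assumed poster size


def save_artifact(obj, path):
    """Serialize the processed data (e.g. movie list or similarity matrix)."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_artifact(path):
    """Load a previously pickled object; only load files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


def poster_url(poster_path):
    """Build the full image URL from the 'poster_path' TMDB returns."""
    if not poster_path:
        return None
    return TMDB_IMAGE_BASE + poster_path


def fetch_poster(movie_id, api_key, timeout=10):
    """Query TMDB for one movie and return its poster URL, or None."""
    url = TMDB_MOVIE_URL.format(movie_id=movie_id, api_key=api_key)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = json.load(response)
    return poster_url(data.get("poster_path"))
```

In a Streamlit app, the returned URL could be passed to `st.image` next to each recommended title.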
- Python
- Pandas
- NumPy
- Scikit-learn
- NLTK
- Streamlit
- TMDB API
- Pickle
├── app.py
├── movie_list.pkl
├── similarity.pkl
├── requirements.txt
├── README.md
Clone the repository:
git clone https://github.com/your-username/movie-recommender.git
cd movie-recommender
Install dependencies:
pip install -r requirements.txt
Run the app:
streamlit run app.py
Replace the API key in app.py:
api_key = "YOUR_API_KEY"
Get a free API key from:
https://www.themoviedb.org/
This system uses:
- Bag of Words model
- Cosine Similarity
- Content-Based Filtering
It recommends movies based on similar content, not user behavior.
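
For reference, the cosine similarity between two count vectors $A$ and $B$ is the cosine of the angle between them. The worked numbers below use two made-up three-word vectors, not values from the project's data:

$$
\cos(A, B) = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
$$

$$
A = (1, 1, 0),\quad B = (1, 1, 1):\qquad
\cos(A, B) = \frac{1 \cdot 1 + 1 \cdot 1 + 0 \cdot 1}{\sqrt{2}\,\sqrt{3}} = \frac{2}{\sqrt{6}} \approx 0.816
$$

A value near 1 means the two movies share most of their tags, and a value of 0 means they share none.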
- Use TF-IDF instead of CountVectorizer
- Implement collaborative filtering
- Use FAISS for scalable similarity search
- Deploy to the cloud (Streamlit Cloud / AWS)
- Add movie overview and ratings