This project focuses on predicting customer churn in the telecom industry using machine learning models. By analyzing customer behavior and service usage data, we aim to identify customers at risk of leaving the service provider.
The dataset contains customer information such as demographic details, account length, service usage (international calls, voicemail, etc.), and whether they churned.
Target variable: Churn
Type: Binary classification (Yes/No)
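
Since the target is a Yes/No label, most scikit-learn metrics are easier to use once it is encoded as 1/0. The snippet below is a minimal sketch on a hypothetical toy frame; only the `Churn` column name and its Yes/No values come from the description above, and the `churn_flag` column is an illustrative name.

```python
import pandas as pd

# Hypothetical toy frame; the real dataset has many more columns and rows.
df = pd.DataFrame({"Churn": ["Yes", "No", "No", "Yes", "No"]})

# Map the Yes/No labels to 1/0 so classifiers and metrics can consume them.
df["churn_flag"] = df["Churn"].map({"Yes": 1, "No": 0})

# Share of churners; worth checking early because it affects metric choice.
churn_rate = df["churn_flag"].mean()
print(f"Churn rate: {churn_rate:.2f}")
```
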
- Data Preprocessing
- Feature Engineering
- Exploratory Data Analysis (EDA)
- Handling Categorical Variables
- Feature Scaling
- Model Training and Evaluation
- Logistic Regression for Churn Prediction
- Random Forest model for improved prediction
- Model Interpretation (coefficients and significance)
- Accuracy, Precision, Recall, F1 Score, ROC-AUC
- Confusion Matrix
- Insights and Business Recommendations
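
The steps above could be wired together roughly as follows. This is a sketch under stated assumptions, not the project's actual notebook code: the data is synthetic, and the column names (`account_length`, `customer_service_calls`, `international_plan`) are hypothetical stand-ins for the real schema.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data; column names are hypothetical, not the real schema.
rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "account_length": rng.integers(1, 200, n),
    "customer_service_calls": rng.integers(0, 8, n),
    "international_plan": rng.choice(["yes", "no"], n),
})
# Toy rule: more service calls and an international plan raise churn odds.
logit = (0.6 * X["customer_service_calls"]
         + 1.0 * (X["international_plan"] == "yes") - 3.0)
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Stratify so train and test keep a similar churn rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Scale numeric columns and one-hot encode the categorical one.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["account_length", "customer_service_calls"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["international_plan"]),
])
model = Pipeline([("prep", preprocess),
                  ("clf", LogisticRegression(max_iter=1000))])
model.fit(X_train, y_train)

# Hard predictions for threshold-based metrics, probabilities for ROC-AUC.
pred = model.predict(X_test)
proba = model.predict_proba(X_test)[:, 1]
metrics = {
    "accuracy": accuracy_score(y_test, pred),
    "precision": precision_score(y_test, pred, zero_division=0),
    "recall": recall_score(y_test, pred),
    "f1": f1_score(y_test, pred),
    "roc_auc": roc_auc_score(y_test, proba),
}
cm = confusion_matrix(y_test, pred)  # rows: actual, columns: predicted
print(metrics)
print(cm)
```

Keeping preprocessing inside the `Pipeline` means the scaler and encoder are fitted on the training split only, which avoids leaking test data into the model. Replacing the `clf` step with `sklearn.ensemble.RandomForestClassifier` would give a Random Forest variant using the same preprocessing.
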
- Python
- Pandas, NumPy
- Matplotlib, Seaborn
- Scikit-learn
- Jupyter Notebook
- Cleaned and explored telecom customer data to identify important patterns.
- Applied Logistic Regression for churn prediction.
- Evaluated the model using key classification metrics such as Recall and ROC-AUC, which are especially important for churn use cases.
- Provided insights for targeting at-risk customers with retention strategies.
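
One way to act on the recall emphasis above is to lower the decision threshold so that more at-risk customers are flagged for retention offers. The sketch below uses made-up labels and probabilities for ten customers; `recall_at` is a hypothetical helper, not part of the project code.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical true labels and predicted churn probabilities.
y_true = np.array([0, 0, 1, 0, 1, 0, 1, 0, 0, 1])
y_prob = np.array([0.10, 0.35, 0.42, 0.20, 0.81, 0.55, 0.32, 0.05, 0.48, 0.67])


def recall_at(threshold):
    """Return (recall, precision) when flagging customers at or above threshold."""
    flagged = (y_prob >= threshold).astype(int)
    return (recall_score(y_true, flagged),
            precision_score(y_true, flagged, zero_division=0))


for t in (0.5, 0.3):
    r, p = recall_at(t)
    print(f"threshold={t:.1f} recall={r:.2f} precision={p:.2f}")
```

On this toy data, lowering the threshold from 0.5 to 0.3 raises recall from 0.50 to 1.00 while precision drops from 0.67 to 0.57. Whether that trade is worthwhile depends on the cost of a retention offer relative to the cost of a lost customer.
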
- Importance of feature selection and interpretation in churn modeling.
- How to apply Logistic Regression in a real-world customer analytics scenario.
- Using classification metrics that align with business goals (e.g., prioritizing recall to avoid losing customers).
- Creating meaningful data visualizations to support decision-making.
- Interpreting model outputs to provide actionable business recommendations.
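
As an illustration of interpreting model outputs, Logistic Regression coefficients can be converted to odds ratios. This is a sketch on synthetic data with hypothetical feature names. Note that scikit-learn applies L2 regularization by default and does not report p-values, so a significance analysis would need a different tool such as statsmodels.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic data; feature names are hypothetical.
rng = np.random.default_rng(42)
n = 500
X = pd.DataFrame({
    "customer_service_calls": rng.integers(0, 8, n),
    "account_length": rng.integers(1, 200, n),
})
# Toy rule: only service calls drive churn in this example.
p = 1 / (1 + np.exp(-(0.8 * X["customer_service_calls"] - 3.0)))
y = (rng.random(n) < p).astype(int)

# Scale features so coefficients are comparable across columns.
X_scaled = StandardScaler().fit_transform(X)
clf = LogisticRegression(max_iter=1000).fit(X_scaled, y)

# exp(coefficient) is the multiplicative change in churn odds for a one
# standard deviation increase in the feature (because inputs are scaled).
odds_ratios = pd.Series(np.exp(clf.coef_[0]), index=X.columns)
odds_ratios = odds_ratios.sort_values(ascending=False)
print(odds_ratios)
```

An odds ratio well above 1 marks a feature associated with higher churn odds, a value near 1 marks a feature with little effect, and a value below 1 marks a feature associated with lower churn odds.
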