This notebook was developed for Pinktober, a Micro Club internal datathon held in October during Breast Cancer Awareness Month.
The goal was to build a robust machine learning model that classifies breast cancer cases from anonymized patient data.
- Data Exploration: Statistical summaries and visualizations
- Preprocessing: Feature selection, normalization with MinMaxScaler
- Modeling: Tried Logistic Regression, KNN, Decision Trees, and Random Forests
- Evaluation: Confusion matrix, classification report, and metrics (Precision, Recall, F1)
- Accuracy: 0.98246
- Precision: 0.97872
- Recall: 0.97872
- F1-score: 0.97872
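
The workflow listed above (MinMaxScaler normalization, four candidate classifiers, then confusion matrix and classification report) can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the datathon data is not included here, so scikit-learn's built-in breast cancer dataset stands in for it, and the split ratio, `random_state`, and model hyperparameters are all assumptions. The scores it prints will therefore not match the figures reported above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (assumption): the anonymized datathon dataset is not available here.
X, y = load_breast_cancer(return_X_y=True)

# Hold out 20% for evaluation; the ratio and seed are illustrative choices.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# The four model families named above, with assumed default-like settings.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(random_state=42),
}

scores = {}
for name, clf in models.items():
    # Fitting the scaler inside the pipeline keeps test data out of the
    # min/max estimates used for normalization.
    pipe = make_pipeline(MinMaxScaler(), clf)
    pipe.fit(X_train, y_train)
    y_pred = pipe.predict(X_test)

    scores[name] = f1_score(y_test, y_pred)
    print(f"== {name} ==")
    print(confusion_matrix(y_test, y_pred))
    print(classification_report(y_test, y_pred, digits=5))
```

Wrapping the scaler and the classifier in one pipeline is a design choice made for this sketch: it ensures the scaler is fitted on the training split only, which avoids leaking test-set statistics into preprocessing.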
Python, Pandas, Seaborn, Scikit-learn, Matplotlib
Created and submitted during a 48-hour ML datathon organized by Micro Club.