CS506 Midterm – Fraud Detection

Project Overview

This project was built for the CS506 Spring 2024 Midterm, which was hosted on Kaggle as a private competition. The task was to detect fraudulent credit card transactions using machine learning. The dataset was highly imbalanced (~0.4% fraud), which required careful feature engineering and model selection.

Kaggle Competition: CS506 Midterm 2024
Achievement: Placed in Top 20 among all participants.

Key Features & Approach

Exploratory Data Analysis (EDA)
- Identified extreme class imbalance.
- Visualized geographical clusters of fraud using geopandas.
- Found strong correlation of fraud with high transaction amounts.
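
To make the imbalance concrete, here is a small worked example. The labels are made up for illustration (1,000 transactions with 4 frauds, matching the ~0.4% rate above); they are not taken from the competition data. It shows why plain accuracy is a misleading yardstick at this fraud rate.

```python
# Hypothetical labels: 1,000 transactions, 4 of them fraudulent (0.4%).
labels = [1] * 4 + [0] * 996

# A trivial "model" that never flags any transaction as fraud.
predictions = [0] * len(labels)

fraud_rate = sum(labels) / len(labels)
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Recall: share of actual frauds that the model caught.
true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
recall = true_positives / sum(labels)

print(f"fraud rate: {fraud_rate:.1%}")  # 0.4%
print(f"accuracy:   {accuracy:.1%}")    # 99.6%
print(f"recall:     {recall:.1%}")      # 0.0%
```

The do-nothing model scores 99.6% accuracy while catching no fraud at all, so a metric that focuses on the fraud class (recall, precision, F1) is usually more informative here.
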
Feature Engineering
- Added age and distance (Haversine) features.
- Created average recent spend and fraudulent_day indicators.
- Applied k-means clustering to user and merchant locations.
- Performed label encoding and feature pruning using correlation analysis.
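
The age and Haversine distance features could be computed roughly as follows. This is a minimal sketch using only the standard library; the function names and the idea of measuring the distance between the cardholder and the merchant are assumptions for illustration, not the exact code or column layout used in the notebooks.

```python
import math
from datetime import date

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def age_in_years(date_of_birth, on_date):
    """Whole years elapsed between date_of_birth and on_date."""
    had_birthday = (on_date.month, on_date.day) >= (date_of_birth.month, date_of_birth.day)
    return on_date.year - date_of_birth.year - (0 if had_birthday else 1)


# Hypothetical record: cardholder location, merchant location, birth date.
distance = haversine_km(42.35, -71.06, 40.71, -74.01)
age = age_in_years(date(1990, 6, 15), date(2024, 3, 1))
print(round(distance), age)
```

On a full dataset the same formula would normally be vectorized (for example with NumPy) rather than called row by row.
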
Modeling & Evaluation
- Tested Decision Tree, XGBoost, and KNN classifiers.
- Decision Tree performed best of the three; it was robust on the imbalanced data and has the added benefit of being interpretable.
- Used GridSearchCV for hyperparameter tuning.
- Verified model stability across multiple validation splits.
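
The tuning step can be sketched with scikit-learn as below. Everything specific here is an assumption made for illustration: the data is synthetic (generated with make_classification), the parameter grid is hypothetical, and F1 is used as the scoring metric because the competition metric is not stated in this README.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the competition data: roughly 2% positives.
X, y = make_classification(
    n_samples=3000, n_features=10, n_informative=5,
    weights=[0.98, 0.02], random_state=0,
)

# Hypothetical search space, not the grid used in the project.
param_grid = {
    "max_depth": [3, 5, 8],
    "min_samples_leaf": [1, 5, 20],
    "class_weight": [None, "balanced"],
}

# Stratified folds keep the fraud rate similar in every validation split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid=param_grid,
    scoring="f1",
    cv=cv,
)
search.fit(X, y)

# The spread of the best configuration's score across folds is a rough stability check.
best_std = search.cv_results_["std_test_score"][search.best_index_]
print(search.best_params_)
print(round(search.best_score_, 3), round(best_std, 3))
```

A small standard deviation across folds suggests the chosen configuration is not overly sensitive to which rows land in the validation split.
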
Results
- Achieved a Top 20 rank on the Kaggle leaderboard.
- Consistent performance across unseen validation sets.

Project Files

explore.ipynb – Exploratory Data Analysis and visualizations.
starter_code.ipynb – Initial setup and model experimentation.
U48519832_Midterm_Report.pdf – Detailed report with methodology and findings.
README.md – This file summarizing the project.

Insights & Learnings

Handling highly imbalanced datasets is challenging and requires thoughtful feature engineering.
Geographical features can add predictive power if transformed meaningfully.
Decision Trees provided interpretable and robust performance for this task.

Contact

Author: Mohit Sai Gutha
Email

Repository: Mohitsai/credit-card-fraud-detection-kaggle-competition