Robust Hybrid Model for Credit Card Fraud Detection.

This project demonstrates an automated credit card fraud detection system built with machine learning. Detecting fraudulent transactions is critical for financial institutions to prevent monetary loss and protect customers. Transaction data is preprocessed, features are engineered, and several classification models are trained and compared to identify potential fraud. The project was developed as part of my data science portfolio, focusing on real-time fraud detection in financial systems.

Publication

This project has been published in the conference IDC-IoT 2024 (Intelligent Data Communication Technologies and Internet of Things) under the title: "Robust Hybrid Machine Learning Model for Financial Fraud Detection in Credit Card Transactions". The paper presents the methodology, experimental results, and insights on hybrid machine learning approaches for accurate fraud detection.

Citation

If you use this project in your research or work, please cite the paper as follows:

D. Jahnavi, M. A, S. Pulata, S. Sami, B. Vakamullu and B. Mohan G, "Robust Hybrid Machine Learning Model for Financial Fraud Detection in Credit Card Transactions," 2024 2nd International Conference on Intelligent Data Communication Technologies and Internet of Things (IDCIoT), Bengaluru, India, 2024, pp. 680-686, doi: 10.1109/IDCIoT59759.2024.10467340.

Project Overview

The objective of this project is to automatically detect fraudulent credit card transactions. Using machine learning classification models such as Logistic Regression, Decision Tree, Random Forest, KNN, and hybrid approaches, the system analyzes transactional data to classify whether a transaction is legitimate or fraudulent. The workflow includes data preprocessing, feature engineering, model training, evaluation, and prediction, enabling accurate and efficient fraud detection.

Why I Chose This Project

Credit card fraud is a major concern for financial institutions and customers alike. I chose this project because it addresses a real-world problem where timely detection of fraud can prevent financial losses. It also allowed me to gain hands-on experience with imbalanced datasets, feature engineering, model evaluation metrics, and ensemble learning. This project strengthened my skills in data preprocessing, machine learning, and predictive analytics.

Problem This Project Solves

Financial fraud can cause significant monetary losses and erode customer trust. Manual monitoring of credit card transactions is inefficient and error-prone. This project provides an automated solution that detects fraudulent transactions in real time, enabling financial institutions to act quickly, prevent losses, and keep banking secure for their customers.

Dataset

The dataset used consists of labeled credit card transactions, categorized as fraud or non-fraud. It contains features such as transaction amount, timestamp, customer demographics, and transaction metadata.

Dataset Link: Download Credit Card Fraud Dataset from Kaggle

The data preprocessing steps include dropping irrelevant columns, encoding categorical variables, normalizing features, and handling class imbalance through sampling. These steps prepare the dataset for accurate model training and evaluation.

Flow of the Project

The workflow of this project is designed to transform raw transaction data into actionable predictions. The steps include:

Load Dataset
Transaction data is loaded from CSV files.

Data Preprocessing

Dropping irrelevant columns

Encoding categorical variables

Extracting features from dates (e.g., DOB year/month)

Handling missing values and normalizing data
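The preprocessing steps listed above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`trans_num`, `dob`, `category`, `amt`, `is_fraud`) and the choice of median imputation and standard scaling are assumptions made for the example.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the preprocessing steps to a raw transaction frame.

    All column names used here are illustrative assumptions.
    """
    df = df.copy()

    # Drop identifier-like columns that carry no predictive signal.
    df = df.drop(columns=["trans_num"], errors="ignore")

    # Extract year and month from the date of birth, then discard the raw date.
    dob = pd.to_datetime(df.pop("dob"))
    df["dob_year"] = dob.dt.year
    df["dob_month"] = dob.dt.month

    # Handle missing values: fill missing amounts with the column median.
    df["amt"] = df["amt"].fillna(df["amt"].median())

    # Encode the categorical variable as integer codes.
    df["category"] = df["category"].astype("category").cat.codes

    # Normalize numeric features to zero mean and unit variance.
    numeric_cols = ["amt", "dob_year", "dob_month"]
    df[numeric_cols] = StandardScaler().fit_transform(df[numeric_cols])
    return df


# Tiny hand-made frame standing in for the real CSV.
raw = pd.DataFrame({
    "trans_num": ["a1", "b2", "c3", "d4"],
    "dob": ["1980-05-17", "1992-11-02", "1975-01-30", "2000-07-09"],
    "category": ["grocery", "travel", "grocery", "online"],
    "amt": [12.5, None, 230.0, 47.9],
    "is_fraud": [0, 0, 1, 0],
})
clean = preprocess(raw)
print(clean)
```

For brevity the scaler here is fitted on the whole frame. In a cross-validated setup it is usually safer to fit it on the training folds only, so that test-fold statistics do not leak into training.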

Dataset Sampling
To handle class imbalance, a small fraction of non-fraud transactions and a larger fraction of fraud transactions are sampled for training.
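One way to implement this kind of class-wise sampling with pandas is sketched below. The label column name `is_fraud` and the sampling fractions (10% of legitimate rows, 90% of fraud rows) are hypothetical values chosen for illustration; the README does not state the fractions actually used.

```python
import pandas as pd


def sample_by_class(df, label_col="is_fraud", legit_frac=0.1, fraud_frac=0.9, seed=42):
    """Sample each class at its own rate, then shuffle the combined rows.

    The default fractions are illustrative assumptions.
    """
    legit = df[df[label_col] == 0].sample(frac=legit_frac, random_state=seed)
    fraud = df[df[label_col] == 1].sample(frac=fraud_frac, random_state=seed)
    combined = pd.concat([legit, fraud])
    # Shuffle so the two classes are interleaved, and reset the index.
    return combined.sample(frac=1.0, random_state=seed).reset_index(drop=True)


# Toy data: 1000 legitimate rows and 20 fraudulent rows.
toy = pd.DataFrame({
    "amt": range(1020),
    "is_fraud": [0] * 1000 + [1] * 20,
})
balanced = sample_by_class(toy)
print(balanced["is_fraud"].value_counts())
```

Sampling the majority class at a low rate keeps training fast and stops the classifier from being dominated by legitimate transactions, at the cost of discarding some legitimate examples.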

Splitting the Dataset
The data is split into training and testing sets using K-Fold cross-validation.
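A K-Fold split with scikit-learn might look like the sketch below. The fold count of 5 and the synthetic arrays are assumptions for the example; the README does not state how many folds were used.

```python
import numpy as np
from sklearn.model_selection import KFold

# Toy feature matrix and labels standing in for the preprocessed data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = rng.integers(0, 2, size=100)

# n_splits=5 is an assumed value, not taken from the project.
kfold = KFold(n_splits=5, shuffle=True, random_state=42)

fold_sizes = []
for fold, (train_idx, test_idx) in enumerate(kfold.split(X), start=1):
    # Each fold serves as the test set exactly once.
    X_train, X_test = X[train_idx], X[test_idx]
    y_train, y_test = y[train_idx], y[test_idx]
    fold_sizes.append((len(train_idx), len(test_idx)))
    print(f"fold {fold}: {len(train_idx)} train rows, {len(test_idx)} test rows")
```

With heavily imbalanced labels, `StratifiedKFold` from the same module is a common alternative because it keeps the fraud ratio roughly equal across folds.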

Model Building
Classification models used:

Logistic Regression

Post Lasso Logistic Regression

Decision Tree

Random Forest

K-Nearest Neighbors (KNN)

Hybrid Model (Logistic Regression + Decision Tree)
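The README does not describe how the hybrid model combines Logistic Regression and Decision Tree. One plausible approach, shown here purely as an assumption, is soft voting, which averages the predicted probabilities of the two models. The synthetic data and the hyperparameter values are also illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced data (about 10% positives) standing in for transactions.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# ASSUMPTION: soft voting averages the two models' predicted probabilities.
# The combination rule used in the paper may differ.
hybrid = VotingClassifier(
    estimators=[
        ("logreg", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
        ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
    ],
    voting="soft",
)
hybrid.fit(X_train, y_train)
print("hybrid test accuracy:", round(hybrid.score(X_test, y_test), 3))
```

Soft voting lets the linear model smooth out the tree's hard decision boundaries while the tree contributes non-linear splits. Stacking, where a meta-model learns from both models' outputs, would be another reasonable reading of "hybrid".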

Model Training
Each model is trained on the training data, with hyperparameters optimized for best performance.
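Hyperparameter optimization of this kind is often done with a cross-validated grid search. The sketch below tunes a Decision Tree; the parameter grid, the 5-fold setting, and the choice of recall as the scoring metric are assumptions for illustration, since the README does not list the values that were searched.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced data standing in for the sampled training set.
X, y = make_classification(
    n_samples=1000, n_features=8, weights=[0.9, 0.1], random_state=0
)

# Hypothetical grid; the values searched in the project are not stated.
param_grid = {"max_depth": [3, 5, 10], "min_samples_leaf": [1, 5, 20]}

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    scoring="recall",  # favor catching fraud over raw accuracy
    cv=5,
)
search.fit(X, y)
print("best params:", search.best_params_)
print("best cross-validated recall:", round(search.best_score_, 3))
```

Scoring on recall (sensitivity) rather than accuracy matters here because a model that labels every transaction as legitimate would still score high accuracy on imbalanced data.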

Evaluation
Performance metrics including accuracy, sensitivity, specificity, G-mean, and confusion matrices are calculated.
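For reference, with fraud as the positive class, sensitivity is TP / (TP + FN), specificity is TN / (TN + FP), and G-mean is the square root of their product. The dependency-free sketch below computes these from true and predicted labels; the function name and the toy labels are made up for the example.

```python
from math import sqrt


def fraud_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, specificity, and G-mean for 0/1 labels,
    where 1 marks a fraudulent transaction."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard against division by zero when a class is absent.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {
        "confusion_matrix": [[tn, fp], [fn, tp]],
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": sqrt(sensitivity * specificity),
    }


# Toy labels: 5 fraudulent and 10 legitimate transactions.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = fraud_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(name, value)
```

G-mean is useful on imbalanced data because it drops toward zero if either class is handled poorly, whereas accuracy can stay high even when most fraud is missed.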

Prediction
Trained models predict fraudulent transactions on test data, allowing identification of high-risk transactions for further investigation.
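Flagging high-risk transactions typically means scoring each test transaction with a fraud probability and sending those above a cut-off to review. The sketch below illustrates this with a Random Forest on synthetic data; the 0.5 threshold and the variable names are assumptions, not values from the project.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data standing in for the transaction features.
X, y = make_classification(
    n_samples=1500, n_features=10, weights=[0.9, 0.1], random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1
)

model = RandomForestClassifier(n_estimators=100, random_state=1)
model.fit(X_train, y_train)

# Probability of the fraud class (label 1) for each test transaction.
fraud_prob = model.predict_proba(X_test)[:, 1]

# ASSUMPTION: 0.5 is an illustrative cut-off; lowering it flags more
# transactions and trades false positives for fewer missed frauds.
threshold = 0.5
flagged = np.flatnonzero(fraud_prob >= threshold)

# Review queue: flagged test-set row positions, highest risk first.
review_queue = flagged[np.argsort(fraud_prob[flagged])[::-1]]
print(f"{len(flagged)} of {len(X_test)} transactions flagged for review")
```

Ranking by probability rather than using only the hard 0/1 prediction lets an investigation team work through the riskiest transactions first.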

Files in This Repository

credit_card_fraud_detection.ipynb – Jupyter Notebook containing the complete implementation

Research Paper – Published and presented at the prestigious IDC-IoT 2024 conference (Intelligent Data Communication Technologies and Internet of Things)

requirements.txt – List of Python dependencies

README.md – Project documentation

Tech Stack Used and Why

Python: Core language for data analysis and model development

NumPy & Pandas: Numerical computations and data manipulation

Matplotlib & Seaborn: Visualization of data distributions, confusion matrices, and model performance

Scikit-learn: Machine learning models, cross-validation, and evaluation metrics

Imbalanced-learn: Handling imbalanced datasets

These tools provide an end-to-end ecosystem for data preprocessing, model development, evaluation, and visualization.

Usage Instructions

1. Clone the repository
git clone https://github.com/JAHNAVIDINGARI/CREDIT-CARD-FRAUD-DETECTION.git

2. Navigate to the project directory
cd CREDIT-CARD-FRAUD-DETECTION

3. Install dependencies
pip install -r requirements.txt

4. Run the notebook
Open credit_card_fraud_detection.ipynb, update the dataset path if required, and execute all cells to train and evaluate the models.

Results and Insights

The models trained on the credit card dataset achieved strong performance in identifying fraudulent transactions. Key observations include:
Decision Tree achieved high sensitivity (~80%) and G-mean (~0.89), indicating reliable fraud detection.
Random Forest reduced false positives while maintaining strong accuracy (~97%).
Logistic Regression and Post Lasso models provided baseline comparisons with slightly lower sensitivity.
Hybrid models combining Logistic Regression and Decision Tree improved overall predictive performance.
Confusion matrices demonstrated clear separation between fraud and non-fraud transactions.

These results suggest that the system is robust enough for real-world financial applications and can help institutions minimize fraudulent activity.

Authors

Jahnavi Dingari
Sandeep Pulata
Sasank Sami
Mona A
Bharadwaj V
Bharati Mohan G

Contact

For queries, collaboration, or further discussion regarding this project, please reach out via LinkedIn or email:
LinkedIn: https://www.linkedin.com/in/jahnavi-dingari
Email: jahnavidingari04@gmail.com
