This project implements an end-to-end fraud detection system using supervised machine learning. The objective is to identify potentially fraudulent financial transactions based on transactional behaviour and balance dynamics. The project covers data exploration, feature engineering, model training, and deployment through an interactive Streamlit application.
The dataset consists of transactional records with attributes related to:
- Transaction type
- Transaction amount
- Sender and receiver balances before and after the transaction
- Binary fraud label indicating fraudulent activity
The data exhibits strong class imbalance, which is typical in real-world fraud detection problems.
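To make the imbalance concrete, the sketch below computes the fraud rate and the common "balanced" class-weight heuristic (`n_samples / (n_classes * class_count)`) in plain Python. The label counts are illustrative only and are not taken from the project's dataset; `class_summary` is a hypothetical helper name.

```python
from collections import Counter


def class_summary(labels):
    """Return (fraud_rate, balanced_class_weights) for a list of 0/1 labels."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    fraud_rate = counts[1] / n_samples
    # "Balanced" heuristic: the rarer a class, the larger its weight.
    weights = {
        cls: n_samples / (n_classes * count) for cls, count in counts.items()
    }
    return fraud_rate, weights


# Illustrative labels only: 990 legitimate and 10 fraudulent transactions.
labels = [0] * 990 + [1] * 10
fraud_rate, weights = class_summary(labels)
print(f"fraud rate: {fraud_rate:.2%}")
print(f"class weights: {weights}")
```

With labels this skewed, a model that always predicts "not fraud" would still be about 99% accurate, which is why accuracy alone is a poor metric here.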
- Data cleaning and exploratory analysis
- Feature selection based on transactional and balance behaviour
- Categorical encoding of transaction types
- Train-test split with stratification
- Supervised classification model trained to identify fraudulent transactions
- Model persistence using serialised artefacts (.pkl) for deployment
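The steps above could be wired together roughly as follows. This is a minimal sketch under several assumptions: the column names, the transaction type values, the synthetic data, and the file name `fraud_model.pkl` are all hypothetical, and `LogisticRegression` is only a stand-in because the classifier actually used is not named here.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column names; the real dataset may use different ones.
CATEGORICAL = ["type"]
NUMERIC = [
    "amount",
    "old_balance_sender",
    "new_balance_sender",
    "old_balance_receiver",
    "new_balance_receiver",
]
TARGET = "is_fraud"


def make_synthetic_transactions(n=2000, seed=0):
    """Generate placeholder data so the sketch runs without the real dataset."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "type": rng.choice(["PAYMENT", "TRANSFER", "CASH_OUT"], size=n),
        "amount": rng.uniform(1, 10_000, size=n),
        "old_balance_sender": rng.uniform(0, 50_000, size=n),
        "old_balance_receiver": rng.uniform(0, 50_000, size=n),
    })
    df["new_balance_sender"] = (df["old_balance_sender"] - df["amount"]).clip(lower=0)
    df["new_balance_receiver"] = df["old_balance_receiver"] + df["amount"]
    # Roughly 5% positive labels to mimic class imbalance.
    df[TARGET] = (rng.random(n) < 0.05).astype(int)
    return df


def build_pipeline():
    """One-hot encode the transaction type, scale numeric columns, then classify."""
    preprocess = ColumnTransformer([
        ("type", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
        ("num", StandardScaler(), NUMERIC),
    ])
    classifier = LogisticRegression(class_weight="balanced", max_iter=1000)
    return Pipeline([("preprocess", preprocess), ("clf", classifier)])


def train_and_persist(df, model_path):
    """Stratified split, fit, and save the fitted pipeline as a .pkl file."""
    X, y = df[CATEGORICAL + NUMERIC], df[TARGET]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42
    )
    model = build_pipeline()
    model.fit(X_train, y_train)
    joblib.dump(model, model_path)
    return model, X_test, y_test


df = make_synthetic_transactions()
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "fraud_model.pkl"
    model, X_test, y_test = train_and_persist(df, path)
    reloaded = joblib.load(path)
    same_predictions = bool((reloaded.predict(X_test) == model.predict(X_test)).all())

print("test rows:", len(X_test))
print("reloaded model matches:", same_predictions)
```

Saving the whole pipeline, rather than the classifier alone, keeps the encoding step bundled with the model so the deployed application can pass in raw feature values.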
Key input features used by the model:
- Transaction type
- Transaction amount
- Sender balance before and after the transaction
- Receiver balance before and after the transaction
The model learns behavioural inconsistencies commonly associated with fraudulent transactions, such as abnormal balance movements and high-risk transaction types.
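As an illustration of what an "abnormal balance movement" can look like, the sketch below checks whether the reported balances are arithmetically consistent with the transaction amount. Whether the project derives such values explicitly is not stated; `balance_errors` and the example figures are hypothetical.

```python
def balance_errors(amount, sender_before, sender_after,
                   receiver_before, receiver_after):
    """Return how far each side's balance change deviates from the amount.

    Both values are zero when the balances are fully consistent.
    """
    sender_error = sender_before - amount - sender_after
    receiver_error = receiver_before + amount - receiver_after
    return sender_error, receiver_error


# Consistent transfer: 100 leaves the sender and arrives at the receiver.
consistent = balance_errors(100, 500, 400, 0, 100)

# Suspicious pattern: the sender's account is emptied,
# but the receiver's balance never changes.
suspicious = balance_errors(10_000, 10_000, 0, 0, 0)

print("consistent:", consistent)
print("suspicious:", suspicious)
```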
A Streamlit-based web application is built to:
- Accept user inputs for transaction features
- Run real-time fraud predictions using the trained model
- Display classification results in an intuitive interface
- Fraud vs non-fraud distribution analysis
- Feature behaviour analysis for high-risk transactions
- Model validation using held-out test data
- Focus on minimising false negatives while maintaining reasonable precision
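The trade-off in the last point can be illustrated with a small, dependency-free sketch. Lowering the decision threshold catches more fraud (fewer false negatives, higher recall) at the cost of more false alarms (lower precision). The labels and scores below are made up for illustration and are not results from this project.

```python
def confusion_counts(y_true, scores, threshold):
    """Count TP, FP, FN, TN when scores >= threshold are predicted as fraud."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(y_true, scores, threshold):
    """Return (precision, recall), using 0.0 when a denominator is zero."""
    tp, fp, fn, _ = confusion_counts(y_true, scores, threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up held-out labels (1 = fraud) and model scores.
y_true = [0, 0, 0, 0, 1, 0, 1, 0, 1, 0]
scores = [0.1, 0.2, 0.05, 0.6, 0.9, 0.3, 0.4, 0.15, 0.7, 0.35]

for threshold in (0.5, 0.35):
    precision, recall = precision_recall(y_true, scores, threshold)
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
```

In this toy example, moving the threshold from 0.5 to 0.35 removes the one missed fraud case while adding one extra false alarm.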
- Install dependencies:
pip install -r requirements.txt