🛡️ No Phishing Zone

A Machine Learning–Based Spam and Phishing Email Detection System

Author: Janice Sakina Akiza Nizigama
University: Aalborg University Copenhagen
Supervisor: Yan Kyaw Tun
Project Period: Spring 2025
Repository: JSanizi/no-phishing-zone

📖 Overview

No Phishing Zone is a research-based engineering project exploring how various machine learning algorithms can be trained and compared to detect spam and phishing emails effectively.
The project implements a complete spam-filtering pipeline, from dataset preprocessing and vectorization to model training, tuning, and evaluation, culminating in a functional spam filter application that simulates email classification.

The study compares seven models:

Autoencoder (Anomaly Detection)
Naive Bayes
Logistic Regression
Random Forest
AdaBoost
Gradient Boost
K-Nearest Neighbors

After comprehensive testing, Random Forest and Logistic Regression achieved the highest performance, with weighted F1-scores of 0.86 and 0.80, respectively.

🧠 Project Architecture

System Components

Data Preprocessing & Cleaning
- Loads datasets (SpamAssassin.csv, CEAS_08.csv)
- Cleans text and removes noise
- Converts text into TF-IDF vectors
Model Training & Hyperparameter Tuning
- Trains seven ML models using scikit-learn
- Performs parameter optimization with GridSearchCV
- Saves the best-performing models for deployment
Spam Filter Application
- Connects to an email inbox (via IMAP)
- Sends test emails to simulate spam delivery
- Classifies incoming emails using trained models
- Displays confusion matrices and evaluation reports

Model Architecture Example – Autoencoder

Trained only on non-spam samples
Detects spam by measuring reconstruction error thresholds

⚙️ Installation

Requirements

Python 3.10 or newer
pip package manager

Install dependencies

pip install -r requirements.txt

🚀 Usage

Step-by-step on how to run your scripts:

# 1️⃣ Preprocess the datasets
python training_and_tunning_models/data_preprocessing.py

# 2️⃣ Train and tune models
python training_and_tunning_models/train_and_parametertuning.py

# 3️⃣ Run the spam filter application
python application/ac_filter.py

📊 Evaluation Metrics

Models are evaluated with Accuracy, Precision, Recall, Weighted F1-score, and a Confusion Matrix.

Model	Weighted F1	Precision	Recall	Accuracy
Random Forest	0.86	0.87	0.84	0.86
Logistic Regression	0.80	0.82	0.78	0.81
AdaBoost	0.78	0.76	0.79	0.78
Gradient Boost	0.77	0.74	0.78	0.77
Naive Bayes	0.75	0.73	0.76	0.75
KNN	0.71	0.69	0.70	0.71
Autoencoder	0.68	0.66	0.70	0.68

🧩 Repository Structure

no-phishing-zone/
├── application/
│   ├── ac_filter.py
│   ├── connect_to_email.py
│   └── sending_mail.py
│
├── tests/
│   ├── test_integration.py
│   └── test_unit.py
│
├── training_and_tunning_models/
│   ├── ac_parameter_tuning.py
│   ├── autoencoder.py
│   ├── data_preprocessing.py
│   ├── parametertuning_table.py
│   ├── split_data.py
│   └── train_and_parametertuning.py
│
├── spam_filter.py
├── start_parametertuning.py
├── .gitattributes
├── .gitignore
└── README.md

📬 Contact

Janice Sakina Akiza Nizigama

🔗 LinkedIn: https://www.linkedin.com/in/janice-nizigama

💻 GitHub: https://github.com/JSanizi

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🛡️ No Phishing Zone

📖 Overview

🧠 Project Architecture

System Components

Model Architecture Example – Autoencoder

⚙️ Installation

Requirements

Install dependencies

🚀 Usage

📊 Evaluation Metrics

🧩 Repository Structure

📬 Contact

🧾 License

About

Uh oh!

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
application		application
tests		tests
training_and_tunning_models		training_and_tunning_models
.gitattributes		.gitattributes
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt
spam_filter.py		spam_filter.py
start_parametertuning.py		start_parametertuning.py

JSanizi/no-phishing-zone

Folders and files

Latest commit

History

Repository files navigation

🛡️ No Phishing Zone

📖 Overview

🧠 Project Architecture

System Components

Model Architecture Example – Autoencoder

⚙️ Installation

Requirements

Install dependencies

🚀 Usage

📊 Evaluation Metrics

🧩 Repository Structure

📬 Contact

🧾 License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages