This repository contains academic work completed for the Machine Learning course during my MSc in Computer Science at the University of New Brunswick.
It includes five programming projects using various machine learning algorithms in Python, as well as two detailed reports:
- 📄 Combined Programming Project Report
- 📄 Final Survey-Based Report – Phishing Detection Using Machine Learning
This project implements the AdaBoost algorithm with a custom ID3 decision tree as the base learner. The model is trained and evaluated on the Letter Recognition dataset from the UCI Machine Learning Repository.
- ID3 Decision Tree (custom implementation)
- AdaBoost Ensemble Learning
- UCI Letter Recognition Dataset
- 16 integer features
- Target: Capital letters A–Z
- Improved accuracy with AdaBoost compared to standalone ID3
- Performance analysis included in AdaBoost_ID3_Letter_Recognition_Evaluated.ipynb
- AdaBoost_ID3_Letter_Recognition.ipynb – Model training
- AdaBoost_ID3_Letter_Recognition_Evaluated.ipynb – Evaluation
- letter-recognition.data, .names, .data.Z – Dataset files
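As a rough illustration of the boosting idea behind Project 1, here is a minimal sketch of multi-class AdaBoost. It is not the notebook's code. It assumes the SAMME weight update (the notebooks may use a different variant) and swaps the custom ID3 tree for a one-level decision stump to stay short. All function names (`fit_stump`, `adaboost_fit`, `adaboost_predict`) are hypothetical.

```python
import numpy as np

def fit_stump(X, y, w, n_classes):
    """Weighted one-level tree (stand-in for ID3): best (feature, threshold) split."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= t
            # Each side predicts its weighted-majority class.
            cl = np.bincount(y[left], weights=w[left], minlength=n_classes).argmax()
            cr = np.bincount(y[~left], weights=w[~left], minlength=n_classes).argmax()
            err = w[np.where(left, cl, cr) != y].sum()
            if best is None or err < best[0]:
                best = (err, j, t, cl, cr)
    return best[1:]

def stump_predict(stump, X):
    j, t, cl, cr = stump
    return np.where(X[:, j] <= t, cl, cr)

def adaboost_fit(X, y, n_classes, n_rounds=10):
    """SAMME boosting loop (assumed variant): reweight samples after every round."""
    w = np.full(len(y), 1.0 / len(y))
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w, n_classes)
        miss = stump_predict(stump, X) != y
        err = w[miss].sum()
        if err >= 1.0 - 1.0 / n_classes:  # no better than random guessing
            break
        err = max(err, 1e-10)             # avoid division by zero on a perfect fit
        alpha = np.log((1.0 - err) / err) + np.log(n_classes - 1)
        ensemble.append((alpha, stump))
        w *= np.exp(alpha * miss)         # up-weight the misclassified samples
        w /= w.sum()
    return ensemble

def adaboost_predict(ensemble, X, n_classes):
    """Each weak learner casts a vote of weight alpha for its predicted class."""
    votes = np.zeros((len(X), n_classes))
    for alpha, stump in ensemble:
        votes[np.arange(len(X)), stump_predict(stump, X)] += alpha
    return votes.argmax(axis=1)
```

For the letter dataset, `X` would hold the 16 integer features and `y` the letters encoded as integers 0–25, with `n_classes=26`.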
📂 Project 2 – Breast Cancer Diagnosis Using ANN (Breast Cancer Dataset)
🔍 Overview
This project builds a custom Artificial Neural Network (ANN) from scratch in Python to classify breast cancer cases using the Breast Cancer Wisconsin dataset.
📌 Key Features
- Manual ANN implementation (forward/backward pass)
- Evaluation with accuracy and a confusion matrix
- Dataset preprocessing and binary classification
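To make the "forward/backward pass" above concrete, here is a minimal NumPy sketch of a one-hidden-layer network for binary classification. It is an illustration under assumptions, not the project's implementation. The layer sizes, sigmoid activations, cross-entropy loss, learning rate, and all function names are chosen for the example only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_in, n_hidden, seed=0):
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.5, (n_in, n_hidden)), "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, (n_hidden, 1)), "b2": np.zeros(1),
    }

def forward(p, X):
    """Forward pass: input -> hidden (sigmoid) -> output probability (sigmoid)."""
    h = sigmoid(X @ p["W1"] + p["b1"])
    out = sigmoid(h @ p["W2"] + p["b2"])
    return h, out

def loss(p, X, y):
    """Mean binary cross-entropy; y has shape (n_samples, 1)."""
    _, out = forward(p, X)
    out = np.clip(out, 1e-12, 1.0 - 1e-12)
    return float(-np.mean(y * np.log(out) + (1.0 - y) * np.log(1.0 - out)))

def backward(p, X, y):
    """Backward pass: gradients of the mean cross-entropy w.r.t. every parameter."""
    h, out = forward(p, X)
    d_out = (out - y) / len(X)                   # gradient at the output pre-activation
    d_h = (d_out @ p["W2"].T) * h * (1.0 - h)    # chain rule through the hidden sigmoid
    return {"W2": h.T @ d_out, "b2": d_out.sum(axis=0),
            "W1": X.T @ d_h, "b1": d_h.sum(axis=0)}

def train(p, X, y, lr=0.5, epochs=500):
    """Plain full-batch gradient descent."""
    for _ in range(epochs):
        grads = backward(p, X, y)
        for name in p:
            p[name] -= lr * grads[name]
    return p
```

Predicted labels can be obtained by thresholding the output of `forward` at 0.5 and then compared with the true labels to build the confusion matrix.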
🔸 This project uses the Naive Bayes algorithm to classify car acceptability (unacc, acc, good, vgood) using the UCI Car Evaluation dataset.
🔸 It includes training, evaluation, and performance analysis using a confusion matrix and accuracy.
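For readers unfamiliar with Naive Bayes on categorical attributes, the sketch below shows one common formulation: class priors multiplied by per-attribute likelihoods, computed in log space with Laplace smoothing. The smoothing constant `alpha` and the function names are assumptions for illustration. The project's notebook may handle counts and unseen values differently.

```python
import math
from collections import Counter, defaultdict

def fit_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (attribute index, class) -> value counts
    feature_values = defaultdict(set)     # attribute index -> distinct values seen
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(i, c)][v] += 1
            feature_values[i].add(v)
    return class_counts, value_counts, feature_values

def predict_naive_bayes(model, row, alpha=1.0):
    """Return the class with the highest log-posterior (Laplace-smoothed)."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)     # log prior
        for i, v in enumerate(row):
            k = len(feature_values[i])    # number of distinct values of attribute i
            score += math.log((value_counts[(i, c)][v] + alpha) / (n_c + alpha * k))
        if score > best_score:
            best_class, best_score = c, score
    return best_class
```

Each row of the Car Evaluation data would be passed as a tuple of its categorical attribute values, with the acceptability class (unacc, acc, good, vgood) as the label.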
📂 Project 4 – MNIST Digit Classification Using ANN
This project implements an Artificial Neural Network (ANN) from scratch in Python to classify handwritten digits from the MNIST dataset.
📌 Key Highlights
- Built a multi-layer ANN without any ML libraries.
- Preprocessed raw .idx MNIST files directly.
- Achieved high classification accuracy on test data.
📂 Files Included
- MNIST_ANN_Classification.ipynb: Main notebook
- train-images.idx3-ubyte, train-labels.idx1-ubyte: Training data
- t10k-images.idx3-ubyte, t10k-labels.idx1-ubyte: Test data
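Reading the raw .idx files directly mostly comes down to parsing a small big-endian header. The sketch below shows one way to do this with the standard library. The function name `read_idx` is hypothetical and the notebook may parse the files differently. It relies on the IDX layout: two zero bytes, a data-type code (0x08 for unsigned bytes), the number of dimensions, then one 4-byte size per dimension.

```python
import struct

def read_idx(path):
    """Parse an unsigned-byte IDX file and return (shape, raw_bytes)."""
    with open(path, "rb") as f:
        # Magic number: two zero bytes, a type code, and the dimension count.
        zero, dtype_code, ndim = struct.unpack(">HBB", f.read(4))
        if zero != 0 or dtype_code != 0x08:
            raise ValueError("expected an unsigned-byte IDX file")
        # One big-endian 32-bit size per dimension.
        shape = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
        data = f.read()
    expected = 1
    for dim in shape:
        expected *= dim
    if len(data) != expected:
        raise ValueError("file size does not match the header")
    return shape, data
```

For the image files the shape has three entries (count, rows, columns), and for the label files it has a single entry (count). Pixel values are bytes in the range 0–255 and are typically scaled to [0, 1] before training.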
This project applies a Convolutional Neural Network (CNN) to the MNIST dataset to classify handwritten digits. The architecture was built and tuned from scratch to improve accuracy on both the training and test data.
- Developed a CNN model using Keras
- Achieved over 98% accuracy on test data
- Compared different optimizers and layer configurations
- Python
- TensorFlow / Keras
- NumPy, Matplotlib
jupyter notebook mnist_cnn_detailed_experiments.ipynb
- **Title:** Phishing Detection Using Machine Learning
- **Authors:** Raju Deb & Sadman Sakib Choudhury
- **Abstract:** This report surveys modern phishing detection strategies using ML techniques such as ensemble learning, NLP, and deep neural networks. It critically compares models, datasets, limitations, and future directions in adversarial resilience, scalability, and privacy.
- [📘 Read the Full Report](Survey_Report_Phishing_Detection_ML.pdf)