
Detects anomalies using the Isolation Forest algorithm, with clear visual comparison between original data and anomaly-marked data in an unsupervised learning setup.


btboilerplate/Anomaly-detection-using-IsolationForest


🚨 Anomaly Detection using IsolationForest


An unsupervised machine learning project that detects anomalies and outliers using the Isolation Forest algorithm, with visual comparison between original data and anomaly-marked data.


📌 Project Overview

This project demonstrates anomaly detection using Isolation Forest, an ensemble-based algorithm designed to isolate anomalies instead of profiling normal data points. Anomalies are detected based on how easily they are separated from the rest of the data.
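As a rough illustration of the idea above, the following minimal sketch fits scikit-learn's `IsolationForest` on synthetic data. The data, the planted outlier coordinates, and the variable names are assumptions made for this example; they are not taken from the project notebook or from healthcase.csv.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Hypothetical 2-D data: one dense cluster plus three far-away points.
normal = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
outliers = np.array([[8.0, 8.0], [-9.0, 7.5], [10.0, -8.0]])
X = np.vstack([normal, outliers])

# fit_predict returns +1 for inliers and -1 for anomalies.
model = IsolationForest(n_estimators=100, random_state=0)
labels = model.fit_predict(X)

print("points flagged as anomalies:", int((labels == -1).sum()))
print("labels of the three planted points:", labels[-3:])
```

Points far from the cluster tend to be cut off by the first few random splits, which is why the model labels them -1.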

The project includes:

  • Original dataset visualization
  • Dataset used for training
  • Final output with anomalies highlighted

📁 Project Structure

  • isolation_tree.ipynb — Main project notebook implementing Isolation Forest
  • healthcase.csv — Dataset used for anomaly detection
  • main_data.png — Visualization of the original dataset
  • anomalies_marked.png — Data with detected anomalies highlighted
  • README.md — Project documentation

⚙️ Technologies Used

  • Python
  • NumPy
  • Pandas
  • Matplotlib
  • scikit-learn
  • Jupyter Notebook

🧠 Machine Learning Model

  • Algorithm: Isolation Forest
  • Learning Type: Unsupervised Learning
  • Model Type: Tree-based ensemble
  • Use Case: Anomaly and Outlier Detection

📊 Visual Results

Original Data

Visualization of the dataset before applying anomaly detection.

[Figure: main_data.png]


Anomalies Highlighted

Detected anomalies are marked distinctly to show deviation from normal patterns.

[Figure: anomalies_marked.png]
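A side-by-side comparison like the one described in this section could be produced along the following lines. This is only a sketch on synthetic data: the output file name `anomalies_sketch.png`, the colors, and the markers are assumptions, and the notebook's actual plotting code may differ.

```python
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line inside Jupyter
import matplotlib.pyplot as plt
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(3)
# Hypothetical data: a Gaussian cluster plus a few uniformly scattered points.
X = np.vstack([rng.normal(size=(200, 2)), rng.uniform(-6, 6, size=(10, 2))])

labels = IsolationForest(random_state=3).fit_predict(X)
is_anomaly = labels == -1

fig, (ax_raw, ax_marked) = plt.subplots(1, 2, figsize=(10, 4), sharex=True, sharey=True)

# Left panel: the data as-is.
ax_raw.scatter(X[:, 0], X[:, 1], s=12, color="tab:blue")
ax_raw.set_title("Original data")

# Right panel: same points, with flagged anomalies drawn as red crosses.
ax_marked.scatter(X[~is_anomaly, 0], X[~is_anomaly, 1], s=12, color="tab:blue", label="normal")
ax_marked.scatter(X[is_anomaly, 0], X[is_anomaly, 1], s=30, color="tab:red", marker="x", label="anomaly")
ax_marked.set_title("Anomalies highlighted")
ax_marked.legend()

out_path = Path("anomalies_sketch.png")  # hypothetical output name
fig.savefig(out_path, dpi=120)
plt.close(fig)
```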


▶️ How to Run

  1. Clone the repository:
     git clone https://github.com/btboilerplate/Anomaly-detection-using-IsolationForest.git
  2. Install the required libraries:
     pip install numpy pandas matplotlib scikit-learn
  3. Open isolation_tree.ipynb and run all cells sequentially

🧪 Key Observations

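  • Isolation Forest flags outliers in data with linear or non-linear structure, because it relies on random axis-aligned splits rather than a fitted model of the normal points
  • Anomalies are isolated after fewer random splits, so they sit at shallower depths in the trees
  • Works without labeled data
  • Scales well to larger datasets, because each tree is built on a small random subsample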
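The early-isolation observation can be checked with `score_samples`, which returns lower values for points that are isolated in fewer splits. The sketch below uses synthetic data and an arbitrarily chosen far point; both are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)

# Hypothetical data: 300 clustered points and one point placed far away.
cluster = rng.normal(size=(300, 2))
far_point = np.array([[12.0, 12.0]])
X = np.vstack([cluster, far_point])

# max_samples caps the subsample each tree is built on.
forest = IsolationForest(n_estimators=200, max_samples=256, random_state=1).fit(X)

# Lower score = shorter average path length = more abnormal.
scores = forest.score_samples(X)

print("score of the far point:", scores[-1])
print("median score within the cluster:", np.median(scores[:-1]))
```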

🚀 Future Improvements

  • Tune the contamination parameter to control the share of points flagged as anomalies
  • Compare with LOF and DBSCAN
  • Apply to real-world anomaly detection datasets
  • Visualize anomaly scores
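A possible starting point for the first two items is sketched below: it varies `contamination` and compares the resulting labels with scikit-learn's `LocalOutlierFactor`. The synthetic data, the contamination values, and `n_neighbors=20` are assumptions chosen for illustration, not tuned settings from this project.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(2)
# Hypothetical data: 500 clustered points plus 25 uniformly scattered ones.
X = np.vstack([rng.normal(size=(500, 2)), rng.uniform(-8, 8, size=(25, 2))])

# contamination sets the expected fraction of anomalies, and with it the
# score threshold below which a point is labeled -1.
flagged = {}
for c in (0.01, 0.05, 0.10):
    labels = IsolationForest(contamination=c, random_state=2).fit_predict(X)
    flagged[c] = int((labels == -1).sum())
    print(f"contamination={c:.2f} -> {flagged[c]} points flagged")

# Compare with Local Outlier Factor at the same contamination level.
iso_labels = IsolationForest(contamination=0.05, random_state=2).fit_predict(X)
lof_labels = LocalOutlierFactor(n_neighbors=20, contamination=0.05).fit_predict(X)
agreement = float(np.mean(iso_labels == lof_labels))
print(f"fraction of points labeled identically by both methods: {agreement:.3f}")
```

Agreement between the two detectors is only a sanity check; without ground-truth labels it does not show which method is more accurate.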
