End-to-end Cancer Diagnosis using Machine Learning and Flask

Overview

A complete MLOps pipeline for cancer diagnosis that demonstrates data ingestion, transformation, model training & evaluation, and a Flask-based web UI for inference.

Dataset: This project uses the Cancer Prediction Dataset from Kaggle.

Features

Reproducible Pipeline: Modular scripts for ingestion, transformation, and training.
Robustness: Custom exception handling and logging for every step.
Web Interface: Clean Flask UI for real-time predictions.
Artifact Management: Systematically saves preprocessors and models for deployment.
Notebook Experiments: Comprehensive EDA and model experiments.
AWS deployment-ready: Includes an Elastic Beanstalk configuration (.ebextensions/).
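The custom exception handling and logging listed above could be wired together roughly as follows. This is a minimal sketch using only the standard library, not the code in src/exception.py or src/logger.py: the names `PipelineError` and `safe_divide`, the logger name, and the log format are all assumptions made for illustration.

```python
import logging
import sys

logging.basicConfig(
    level=logging.INFO,
    format="[%(asctime)s] %(name)s - %(levelname)s - %(message)s",
)
logger = logging.getLogger("pipeline")  # hypothetical logger name


class PipelineError(Exception):
    """Hypothetical exception that records where the original error occurred."""

    def __init__(self, error: Exception):
        # sys.exc_info() is only populated while an exception is being handled.
        _, _, tb = sys.exc_info()
        if tb is not None:
            # Walk to the innermost frame, i.e. where the error was raised.
            while tb.tb_next is not None:
                tb = tb.tb_next
            self.file_name = tb.tb_frame.f_code.co_filename
            self.line_number = tb.tb_lineno
        else:
            self.file_name, self.line_number = "<unknown>", 0
        super().__init__(
            f"{type(error).__name__} in {self.file_name} "
            f"at line {self.line_number}: {error}"
        )


def safe_divide(a: float, b: float) -> float:
    """Toy pipeline step used only to demonstrate the wrapper."""
    try:
        return a / b
    except Exception as exc:
        logger.error("Step failed: %s", exc)
        # Chain the original exception so the full traceback is preserved.
        raise PipelineError(exc) from exc
```

Wrapping each step this way keeps the original error available through `__cause__` while giving every failure a uniform message with file and line information.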

Preview (Flask Web UI)

Web.UI.demo.mp4

TODO

Add step-by-step instructions for Elastic Beanstalk deployment to the README.

How to run this project

Prerequisites:

conda
Git

Run:

# 1. Clone repository
git clone https://github.com/mrkomoruyi/Cancer-Diagnosis-MLOps-Pipeline.git
cd Cancer-Diagnosis-MLOps-Pipeline

# 2. Create the environment and install dependencies
conda create -p venv python=3.12 -y
conda activate ./venv
pip install -r requirements.txt

# 3. Run the training script
python src/pipeline/train_pipeline.py

# 4. Run the web app
python application.py

Open http://localhost:5000/predict in your browser.
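If the page does not load, a quick reachability check can tell you whether the app is actually listening. The sketch below assumes the app is served at the URL given above; the function name `is_app_up` is hypothetical and not part of the repository.

```python
import urllib.error
import urllib.request


def is_app_up(url: str = "http://localhost:5000/predict", timeout: float = 3.0) -> bool:
    """Return True if a server answers at `url`, whatever the HTTP status."""
    # Bypass any configured HTTP proxy, since the target is a local server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, but with a 4xx/5xx status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: nothing is listening.
        return False


if __name__ == "__main__":
    print("app reachable:", is_app_up())
```

A `False` result usually means step 4 is not running or the app is bound to a different port.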

Project structure (high level)

notebook/: contains the raw dataset, exploratory data analysis (EDA), and experimentation notebooks.
artifacts/: stores generated outputs such as raw/processed data files, the trained model.pkl, and preprocessor objects.
src/: the core source code for the project:
- components/: modular scripts for Data Ingestion, Transformation, and Model Training.
- pipeline/: orchestration scripts for the Training and Prediction pipelines.
- utils.py, logger.py, exception.py: common utility functions, custom logging, and exception handling logic.
templates/: HTML files (index.html, home.html) for the Flask web interface.
application.py: the main entry point for the Flask web application.
setup.py & requirements.txt: configuration for project dependencies and package installation.
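To illustrate how objects such as model.pkl might be written to and read from artifacts/, here is a minimal sketch based on the standard `pickle` module. The helper names `save_object` and `load_object` are assumptions; the actual functions in src/utils.py may differ in name and serialization library.

```python
import pickle
from pathlib import Path
from typing import Any


def save_object(file_path: str, obj: Any) -> None:
    """Serialize `obj` to `file_path`, creating parent directories if needed."""
    path = Path(file_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(obj, f)


def load_object(file_path: str) -> Any:
    """Load a pickled object. Only use this on files you trust."""
    with Path(file_path).open("rb") as f:
        return pickle.load(f)
```

Saving the preprocessor alongside the model lets the prediction pipeline apply exactly the same transformations at inference time as were used during training.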

Contributing

If you find a bug, please submit an issue using the Issues tab.

If you want to submit a Pull Request, open an issue first and reference the issue in the pull request.

License

Distributed under the MIT License. See LICENSE for more information.
