Welcome!
This repository brings together case studies and practical solutions developed to solve real-world problems using Data Science, Machine Learning, and Artificial Intelligence.
Each project is based on a real scenario, focusing on delivering actionable insights and measurable results.
Each project folder contains:
- README.md → Problem description, context, objectives, and methodology.
├── LICENSE            <- Open-source license if one is chosen
├── Makefile           <- Makefile with convenience commands like `make data` or `make train`
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── docs               <- A default mkdocs project; see www.mkdocs.org for details
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks.
│   ├── 01_data_understanding.ipynb
│   ├── 02_data_preparation.ipynb
│   ├── 03_modeling.ipynb
│   └── 04_evaluation.ipynb
│
├── pyproject.toml     <- Project configuration file with package metadata for
│                         cookie_cutter_framework and configuration for tools like black
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
├── setup.cfg          <- Configuration file for flake8
│
└── src                <- Source code for use in this project.
    │
    ├── __init__.py    <- Makes cookie_cutter_framework a Python module
    │
    ├── config.py      <- Store useful variables and configuration
    │
    ├── dataset.py     <- Scripts to download or generate data
    │
    ├── features.py    <- Code to create features for modeling
    │
    ├── modeling
    │
    └── plots.py       <- Code to create visualizations
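
The layout above lends itself to a small path helper, the kind of thing `src/config.py` could hold. The sketch below is only an illustration: `project_paths` and `missing_dirs` are hypothetical names, not functions shipped in this repository, and the directory names are taken directly from the tree.

```python
from pathlib import Path
from typing import Dict, List


def project_paths(root: Path) -> Dict[str, Path]:
    """Map short names to the standard directories of one project folder.

    Hypothetical helper: the directory names mirror the tree above.
    """
    data = root / "data"
    reports = root / "reports"
    return {
        "raw": data / "raw",              # original, immutable data dump
        "interim": data / "interim",      # intermediate, transformed data
        "processed": data / "processed",  # final data sets for modeling
        "external": data / "external",    # third-party data
        "models": root / "models",        # serialized models and predictions
        "reports": reports,               # generated analysis
        "figures": reports / "figures",   # graphics used in reporting
    }


def missing_dirs(root: Path) -> List[str]:
    """Return the short names of expected directories that do not exist."""
    return [name for name, path in project_paths(root).items() if not path.is_dir()]
```

Calling `missing_dirs(Path("."))` from inside a project folder lists which of the expected directories are absent, which can serve as a quick check before running `make data` or the notebooks.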
- Projects based on real business and industry challenges.
- Use of structured methodologies such as CRISP-DM and DataOps.
- Focus on best practices: version control, reproducibility, and clear documentation.
- Diverse applications: forecasting, classification, anomaly detection, optimization, and exploratory analysis.
- For shorter, experimental projects, see the `cases_study` folder.
This is a living repository and will be continuously updated with new projects covering various sectors:
- 🏭 Industry & Manufacturing
- 🛢️ Oil & Gas
- 🏦 Finance
- 🛒 Retail & E-commerce
- 🏥 Healthcare
💡 Tip: Read each project's `README.md` to understand the problem, the approaches tested, and the results obtained.