Dataset used: Wine Quality Dataset
Notebooks: Visualizations | Models
This project predicts wine quality using a fine-tuned LightGBM model. It covers data preprocessing, model training, and evaluation, with experiment tracking and model management handled by MLflow.
- Data loading and preprocessing
- Model training using LightGBM
- Model evaluation with MAE and MSE metrics
- Experiment tracking and model management using MLflow
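The two evaluation metrics listed above can be illustrated with a minimal, standard-library-only sketch. The function names and the sample values here are assumptions made for illustration. The project itself may compute these metrics differently (for example, through a library), and the numbers are made up, not results from this repository.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between true and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average squared difference; penalizes large errors more than MAE does."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical quality scores for illustration only (not project results).
y_true = [5, 6, 7, 5]
y_pred = [5.5, 6.0, 6.0, 5.5]

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
print(f"MAE={mae:.3f} MSE={mse:.3f}")
```

Lower values are better for both metrics. Because MSE squares each error, a single prediction that is off by a full quality point weighs more heavily in MSE than in MAE.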
- data_loader.py: Functions for loading and preprocessing the data.
- model.py: Function for defining the LightGBM model.
- train.py: Function for training the model and logging experiments with MLflow.
- config.py: Configuration file with hyperparameters and settings.
- main.py: Main script for running the entire pipeline.
Insights
- The most frequent wine quality ratings are 5, 6, and 7.
- The distribution of quality ratings is not symmetrical; it leans slightly to the right, meaning there are more ratings on the higher end of the quality scale than on the lower end.
Insights
- The graph suggests that alcohol content is a significant factor in wine quality: wines with higher alcohol content generally have higher quality ratings.
Insights
- Among the models compared, the LightGBM model has the lowest MAE and MSE values, making it the best performer for predicting wine quality on these metrics.
This project is licensed under the Apache License, Version 2.0 - see the LICENSE file for details.


