MLflow Machine Learning Lifecycle System - MNIST Dataset
This repository contains the setup and implementation for managing the full machine learning lifecycle with MLflow, from training models and logging experiments to storing artifacts, serving models, and handling inference requests.
System Overview
The system is designed to streamline a machine learning workflow that includes:
- Experiment tracking
- Model registration and storage
- Model serving for inference
The components of the system include:
- Client Environments: Where models are developed and initial tests are conducted.
- MLflow Tracking Server: Central server for logging and querying experiment data.
- Artifact Store: Storage for model artifacts.
- Metadata Database: Database to store experiment and model metadata.
- MLflow Model Registry: Service for model versioning and stage management.
- Model Serving: Component that deploys models for inference.
Prerequisites
Before setting up the system, ensure the following prerequisites are met:
- Python 3.6 or newer
- Docker
- Kubernetes (optional, for scalable deployment)
- MLflow
See the architecture file for a flow diagram of the whole system.
Setup and Installation
- MLflow Tracking Server
To set up the MLflow tracking server:
mlflow server --backend-store-uri sqlite:///mlflow.db --default-artifact-root ./artifacts --host 0.0.0.0
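With the server running, client code logs to it by setting the tracking URI. A minimal sketch, assuming the server is reachable on its default port 5000 on localhost:

```python
import mlflow

# Point the MLflow client at the tracking server started above.
mlflow.set_tracking_uri("http://localhost:5000")
print(mlflow.get_tracking_uri())  # sanity check
```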
- Model Training
Use the provided Python scripts to log experiments to the MLflow server:
python train.py
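The actual training logic lives in the script above. As a rough sketch of what such a script does, here is a minimal hypothetical version that trains a scikit-learn classifier on an MNIST-like digits dataset and logs parameters, metrics, and the model; the experiment name "mnist" and the use of scikit-learn are assumptions, not necessarily what train.py uses:

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

mlflow.set_tracking_uri("http://localhost:5000")  # tracking server from above
mlflow.set_experiment("mnist")                    # hypothetical experiment name

# Small MNIST-like digits dataset; stands in for a full MNIST loader.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run():
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    mlflow.log_param("max_iter", 1000)
    mlflow.log_metric("accuracy", accuracy)
    # Stores the model in the artifact store configured on the server.
    mlflow.sklearn.log_model(model, "model")
```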
- Model Serving
Deploy the model using a Docker container:
docker build -t mlflow-model-serving .
docker run -p 9201:9201 mlflow-model-serving
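The repository's Dockerfile defines the actual serving image. As a hypothetical sketch of the kind of app such a container might run, here is a minimal Flask server that loads a registered model and exposes the /predict endpoint used below; the Flask framework, the registry URI models:/mnist/1, and the payload shape are all assumptions:

```python
import mlflow.pyfunc
import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)

mlflow.set_tracking_uri("http://localhost:5000")  # assumes server is reachable
# Hypothetical registry URI; assumes a model registered as "mnist", version 1.
model = mlflow.pyfunc.load_model("models:/mnist/1")

@app.route("/predict", methods=["POST"])
def predict():
    # Expects {"image": [...]} as in the curl example; the shape is assumed.
    image = np.asarray(request.get_json()["image"], dtype=np.float64).reshape(1, -1)
    prediction = model.predict(image)
    return jsonify({"prediction": prediction.tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=9201)  # matches the docker run port mapping
```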
Usage
To interact with the system, use the following commands:
- Logging Experiments: Run the train.py script.
- Viewing Experiments: Access the MLflow UI at http://localhost:5000.
- Model Inference: Send POST requests to the model serving endpoint, for example with curl (a Python version follows below):
curl -X POST -H "Content-Type: application/json" -d '{"image": [...image data... ]}' http://localhost:9201/predict
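For scripted clients, the same request can be sent from Python. A sketch assuming the requests library and a flattened 28x28 image as the payload:

```python
import requests

# Placeholder payload: a flattened 28x28 MNIST image of zeros; the real
# payload shape depends on how the serving endpoint was built.
image = [0.0] * 784
response = requests.post(
    "http://localhost:9201/predict",
    json={"image": image},
    timeout=10,
)
print(response.status_code, response.json())
```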