Efficient inventory management relies on accurate demand forecasts and smart ordering policies. In this project, ML regression models predict demand, and a linear programming approach with PuLP determines the optimal ordering strategy.
Demand-Forecasting-and-Inventory-Optimization/
│
├── data/
│ ├── raw/ # original dataset (Walmart)
│ └── processed/ # cleaned dataset and feature engineered
│
├── notebooks and models/
│ ├── 01_EDA.ipynb # data cleaning and feature engineering
│ ├── 02_Forecasting.ipynb # demand forecasting using ML regression models
│ └── 03_Optimization.ipynb # inventory optimization with PuLP
│
├── results/
│ ├── figures/ # plots (EDA, forecast vs actual, etc.)
│ └── outputs/ # optimization CSV outputs, video, GIF
│
├── LICENSE
├── README.md
├── dashboard.py # interactive Dash dashboard
└── requirements.txt
The full dataset comes from the Kaggle competition Walmart Recruiting - Store Sales Forecasting.
Due to file size restrictions, raw data, processed data, and models are not stored in this repository.
Place the downloaded dataset in this folder before running the notebooks: data/raw
Install the required dependencies:
pip install -r requirements.txt- Place raw data in
data/raw/ - Run the notebooks in order (01 → 02 → 03)
- The optimized ordering policy is saved under
data/processed/
-
Forecast accuracy (RMSE): I trained several linear and tree-based regression models. The best-performing model was a GradientBoostingRegressor, which I further fine-tuned via hyperparameter optimization. On the test set, the model achieved an RMSE of 2,795.82 units (~17% deviation from the average actual demand). This level of accuracy supports reliable inventory decisions.
-
Inventory optimization results: The model minimizes total cost by balancing purchasing, holding, and shortage penalties. I applied Operations Research principles to analyze the optimization problem using linear programming in three steps:
- Business problem: define the operational objective and constraints (e.g., budget, warehouse capacity, service levels).
- Mathematical model: formulate the parameters, decision variables, and constraints in a structured LP framework.
- Computational model: use the demand forecasts from notebook
02_Forecasting.ipynbto determine the optimal order quantities in each period to minimize total cost.
This plot shows order quantities (Q), inventory levels (I), shortages (S), and forecasted demand over a rolling 24-week horizon, illustrating the output of the optimization model.
To provide a practical tool for business decision-making, an interactive dashboard was created using Dash. The dashboard allows users to explore different inventory scenarios by adjusting key parameters such as purchase cost, holding rate, shortage penalty, lead time, budget, and maximum inventory capacity.
- Dashboard file:
dashboard.py
Example output: the animated GIF above shows a preview of the dashboard in action

