Orchestrated an agentic reinforcement-learning system leveraging OpenAI Gym environments with PPO/DQN agents, optimizing multi-warehouse replenishment to achieve 98.9% service levels and 48% cost reduction

anumohan10/OptiStock-AI

Agentic Inventory Optimizer

Multi-Agent Reinforcement Learning for Multi-Warehouse Inventory Management

📄 Document & Video

📂 Full Project Report & Demo Video


📌 Overview

This project implements an Agentic Workflow System for optimizing multi-warehouse inventory operations using Reinforcement Learning (RL).
It integrates:

  • Value-Based Learning → Deep Q-Network (DQN)
  • Policy Gradient Methods → Proximal Policy Optimization (PPO)
  • Exploration Strategies → Upper Confidence Bound (UCB) policy selection
  • Custom Agentic Tools → Cost simulation, decision explanation, and warehouse Q&A
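The Upper Confidence Bound policy selection listed above can be illustrated with a standard UCB1 bandit that chooses which trained policy to run next. This is a minimal sketch, not the project's selector: the class name `UCBPolicySelector`, the exploration weight `c`, and the reward values in the usage loop are all assumptions made for illustration.

```python
import math


class UCBPolicySelector:
    """UCB1 bandit over a fixed set of named policies (hypothetical helper)."""

    def __init__(self, policy_names, c=2.0):
        self.names = list(policy_names)
        self.c = c  # exploration weight (assumed value, standard UCB1 uses 2)
        self.counts = {name: 0 for name in self.names}
        self.mean_reward = {name: 0.0 for name in self.names}

    def select(self):
        # Try every policy once before applying the UCB formula.
        for name in self.names:
            if self.counts[name] == 0:
                return name
        total = sum(self.counts.values())

        def score(name):
            bonus = math.sqrt(self.c * math.log(total) / self.counts[name])
            return self.mean_reward[name] + bonus

        return max(self.names, key=score)

    def update(self, name, reward):
        # Incremental mean of the rewards observed for this policy.
        self.counts[name] += 1
        n = self.counts[name]
        self.mean_reward[name] += (reward - self.mean_reward[name]) / n


if __name__ == "__main__":
    # Made-up rewards: pretend "ppo" always earns more than "dqn".
    fake_reward = {"dqn": 0.0, "ppo": 1.0}
    selector = UCBPolicySelector(["dqn", "ppo"])
    for _ in range(200):
        chosen = selector.select()
        selector.update(chosen, fake_reward[chosen])
    print(selector.counts)
```

The exploration bonus shrinks as a policy is tried more often, so the selector keeps sampling the weaker policy occasionally while spending most rounds on the one with the higher observed reward.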

The goal is to minimize total inventory costs (holding + stockout + ordering) while maintaining high service levels under uncertain demand.
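The cost objective can be made concrete with a small per-period calculation. The sketch below assumes a linear holding cost, a linear stockout cost, and a fixed cost per order, and it measures service level as fill rate (units served divided by units demanded). The unit costs and the names `CostParams`, `step_cost`, and `service_level` are illustrative assumptions, not values or functions taken from this repository.

```python
from dataclasses import dataclass


@dataclass
class CostParams:
    # Illustrative unit costs only; the project's actual values are not shown here.
    holding_per_unit: float = 1.0
    stockout_per_unit: float = 10.0
    fixed_order_cost: float = 50.0


def step_cost(on_hand, demand, order_qty, params):
    """Return (total_cost, units_served) for one warehouse and one period."""
    served = min(on_hand, demand)
    unmet = demand - served
    leftover = on_hand - served
    holding = params.holding_per_unit * leftover
    stockout = params.stockout_per_unit * unmet
    ordering = params.fixed_order_cost if order_qty > 0 else 0.0
    return holding + stockout + ordering, served


def service_level(served_total, demand_total):
    """Fill rate: fraction of total demand met from stock."""
    return 1.0 if demand_total == 0 else served_total / demand_total


if __name__ == "__main__":
    cost, served = step_cost(on_hand=100, demand=120, order_qty=50, params=CostParams())
    print(cost, served, service_level(served, 120))
```

Under these assumed costs, a stockout is ten times more expensive per unit than holding, which is the kind of asymmetry that pushes a learned policy toward higher service levels.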


🚀 Features

  • Two RL approaches: DQN & PPO
  • Multi-agent orchestration with policy selection
  • Fallback mechanism for safety
  • Custom tools for:
    • Cost simulation
    • Decision explanation
    • Warehouse Q&A
  • Streamlit dashboard for visualization
  • Baseline comparisons with heuristic policies
  • Exportable reports with learning curves & performance breakdowns
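One way to picture the fallback mechanism alongside a heuristic baseline is a wrapper that accepts the learned policy's order only when it is valid and otherwise uses an order-up-to rule. This is a sketch under assumptions: the README does not say which heuristic the project uses, and `base_stock_order`, `safe_order`, `target_level`, and `max_order` are hypothetical names and values.

```python
def base_stock_order(inventory, target_level, max_order):
    """Order-up-to heuristic: refill toward target_level, capped at max_order."""
    return max(0, min(max_order, target_level - inventory))


def safe_order(rl_policy, inventory, target_level=100, max_order=60):
    """Return (order_qty, source), falling back to the heuristic on bad output."""
    try:
        qty = rl_policy(inventory)
        if isinstance(qty, (int, float)) and 0 <= qty <= max_order:
            return int(qty), "rl"
    except Exception:
        pass  # Any failure in the learned policy triggers the fallback.
    return base_stock_order(inventory, target_level, max_order), "fallback"


if __name__ == "__main__":
    print(safe_order(lambda inv: 30, inventory=50))   # valid RL action
    print(safe_order(lambda inv: -5, inventory=50))   # out of range, heuristic used
```

Returning the source label next to the quantity makes it easy to count how often the fallback fires during evaluation.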

📂 Repository Structure

agentic-inventory-optimizer/
├── agents/                  # RL agents, forecaster, orchestrator, policy selector
├── custom_tools/             # Cost simulation, dashboard export, decision explainer, QA
├── env/                      # Inventory environment & wrappers
├── demo/                     # Streamlit UI
├── rl/                       # Training & evaluation scripts
├── results/                  # Models, evaluation JSONs, and visualizations
├── tests/                    # Unit tests for agents & tools
└── README.md                 # This file

⚙️ Installation

  1. Clone the repo
git clone https://github.com/anumohan10/agentic-inventory-optimizer.git
cd agentic-inventory-optimizer
  2. Create a virtual environment & install dependencies
python -m venv venv
source venv/bin/activate   # Linux/Mac
venv\Scripts\activate      # Windows
pip install -r requirements.txt

▶️ Usage

1. Train an RL Agent

DQN

python -m rl.train_rl_agent --algo dqn --episodes 10000 --target_service 0.92 --below_target_mult 12.0 --seed 42

PPO

python -m rl.train_rl_agent --algo ppo --episodes 8000 --target_service 0.92 --below_target_mult 8.0 --seed 0
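The flag names `--target_service` and `--below_target_mult` suggest a reward that penalizes episodes whose service level falls below a target. The sketch below shows one plausible reading of that idea only; the real semantics are defined by the training script, which is not reproduced here. The function `shaped_reward` and the `shortfall_scale` parameter are assumptions.

```python
def shaped_reward(cost, service, target_service=0.92,
                  below_target_mult=12.0, shortfall_scale=100.0):
    """Negative cost, with an extra penalty when service is below target.

    Hypothetical shaping: the penalty grows linearly with the service
    shortfall and is scaled by below_target_mult.
    """
    shortfall = max(0.0, target_service - service)
    penalty = below_target_mult * shortfall_scale * shortfall
    return -(cost + penalty)


if __name__ == "__main__":
    print(shaped_reward(cost=100.0, service=0.95))  # at or above target: no penalty
    print(shaped_reward(cost=100.0, service=0.82))  # below target: penalized
```

Under this reading, a larger multiplier (12.0 for DQN versus 8.0 for PPO in the commands above) would make missing the service target more costly relative to inventory cost.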

2. Evaluate a Model

python -m rl.evaluate_agent --algo ppo --episodes 100 --model_path results/models/ppo_best_98_service.zip

3. Run the Dashboard

streamlit run demo/app.py

📊 Results

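| Policy    | Total Cost     | Service Level | Notes        |
|-----------|----------------|---------------|--------------|
| PPO       | $6,504         | 98.85%        | 🥇 Best      |
| DQN       | $6,897         | 98.33%        | 🥈 Excellent |
| Heuristic | ~$7,500–8,500  | 85–92%        | 📊 Baseline  |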
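To put the results in relative terms, the snippet below computes each RL policy's cost reduction against the two ends of the approximate heuristic cost range. It uses only the numbers reported in the results above; the helper name `pct_reduction` is an assumption.

```python
def pct_reduction(baseline, cost):
    """Percentage cost reduction of `cost` relative to `baseline`."""
    return 100.0 * (baseline - cost) / baseline


# Approximate heuristic cost range and RL costs, as reported in the results.
HEURISTIC_LOW, HEURISTIC_HIGH = 7500, 8500
RL_COSTS = {"PPO": 6504, "DQN": 6897}

if __name__ == "__main__":
    for name, cost in RL_COSTS.items():
        low = pct_reduction(HEURISTIC_LOW, cost)
        high = pct_reduction(HEURISTIC_HIGH, cost)
        print(f"{name}: {low:.1f}% to {high:.1f}% below the heuristic range")
```

This gives roughly 13.3% to 23.5% for PPO and 8.0% to 18.9% for DQN. These figures are relative to the heuristic range only; the baseline behind the 39% reduction quoted under Key Achievements is not stated in this README.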

🏗 Architecture Diagram

Architecture Diagram


📌 Key Achievements

  • DQN improved service level from 69.7% to 98.33%
  • Cost reduction of 39% for DQN after tuning
  • PPO achieved the best cost-service trade-off of the policies tested ($6,504 total cost at 98.85% service level)
  • Agentic orchestration with policy switching and a fallback for safety

📈 Future Improvements

  • Multi-agent RL (one agent per warehouse)
  • Continuous action spaces
  • Integration with real demand forecasting models
  • Transfer learning between warehouses
  • Testing with real-world supply chain datasets

📜 License

MIT License © 2025 Anusree Mohanan
