Real-world Markov Process Simulation using Divvy Bike Data (GBFS), demonstrating SAS-to-Python migration for operational analytics and forecasting.

rajatd23/divvy-markov-realdata

Real-World Markov Process Simulation (Divvy GBFS → Python)

This project demonstrates an end-to-end real-world Markov process simulation built entirely in Python, using live operational data from the Divvy (Chicago) bike-share system.
The goal is to show how SAS-based Markov simulations and reporting workflows can be migrated to Python while preserving analytical rigor, reproducibility, and decision-making value.


🔍 What This Project Does

  1. Collects real-world data from Divvy's public GBFS (General Bikeshare Feed Specification) API
  2. Converts numerical station availability data into discrete states
  3. Learns transition probabilities from historical data
  4. Builds a Markov transition matrix from real observations
  5. Simulates future system behavior using the learned model
  6. Exports SAS-like analytical outputs (CSV tables, plots, reports)

This mirrors how Markov models are used in utilities, transportation, logistics, finance, and reliability engineering.
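As a rough illustration of step 1, the sketch below flattens one GBFS `station_status` payload into per-station rows. The endpoint URL, the function names, and the output column names are assumptions made for this example and may not match what `src/collect_divvy.py` does; the field names (`last_updated`, `data.stations`, `num_bikes_available`, `num_docks_available`) follow GBFS versions where `last_updated` is a POSIX timestamp.

```python
import json
from datetime import datetime, timezone
from urllib.request import urlopen

# Assumed endpoint: confirm the current URL in the feed's gbfs.json discovery file.
STATION_STATUS_URL = "https://gbfs.divvybikes.com/gbfs/en/station_status.json"


def fetch_station_status(url=STATION_STATUS_URL, timeout=10):
    """Download one station_status feed and return it as a dict (needs network)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def snapshot_rows(feed):
    """Flatten a station_status payload into one row per station."""
    # Assumes last_updated is a POSIX timestamp in seconds.
    ts = datetime.fromtimestamp(feed["last_updated"], tz=timezone.utc).isoformat()
    rows = []
    for station in feed["data"]["stations"]:
        rows.append({
            "snapshot_time": ts,
            "station_id": station["station_id"],
            "bikes": station.get("num_bikes_available", 0),
            "docks": station.get("num_docks_available", 0),
        })
    return rows


# Hand-written sample payload (not real Divvy data) so the sketch runs offline.
sample_feed = {
    "last_updated": 1700000000,
    "ttl": 60,
    "data": {"stations": [
        {"station_id": "A1", "num_bikes_available": 0, "num_docks_available": 15},
        {"station_id": "B2", "num_bikes_available": 9, "num_docks_available": 6},
    ]},
}
rows = snapshot_rows(sample_feed)
```

Each call yields one snapshot; appending the rows from repeated calls to a file gives the station histories that the later steps need.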


🌍 Real-World Motivation

Many organizations still rely on SAS to:

  • model system behavior over time
  • simulate risk and failure
  • support operational and financial forecasting

This project shows how the same logic can be:

  • implemented in Python
  • automated
  • validated using real operational data
  • exported to modern analytics tools (Power BI, Excel, dashboards)

🧠 Conceptual Overview

Each Divvy bike station is treated as an entity.

At each time snapshot, the station is classified into one of four states based on bike availability:

| State  | Meaning               |
|--------|-----------------------|
| EMPTY  | No bikes available    |
| LOW    | Low availability      |
| MEDIUM | Moderate availability |
| HIGH   | High availability     |
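The README does not state the cut-offs between LOW, MEDIUM, and HIGH, so the following is only a sketch of how such a classification might look. It assumes the state depends on the fill ratio bikes / (bikes + docks); the function name `classify_station` and the thresholds 0.25 and 0.60 are hypothetical values chosen for illustration.

```python
def classify_station(bikes, docks, low_max=0.25, medium_max=0.60):
    """Map a station snapshot to EMPTY / LOW / MEDIUM / HIGH.

    The thresholds are illustrative assumptions, not the project's values.
    """
    if bikes <= 0:
        return "EMPTY"
    # bikes > 0 here, so the denominator is always positive.
    fill = bikes / (bikes + docks)
    if fill <= low_max:
        return "LOW"
    if fill <= medium_max:
        return "MEDIUM"
    return "HIGH"


# A few hand-picked snapshots given as (bikes, docks).
examples = [(0, 10), (2, 13), (6, 9), (12, 3)]
states = [classify_station(b, d) for b, d in examples]
```

A ratio-based rule keeps large and small stations comparable; absolute bike counts would be an equally reasonable choice if capacity is not of interest.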

By observing how stations move between these states over time, we:

  • learn transition probabilities
  • construct a Markov model
  • simulate future system behavior
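One way to learn the transition probabilities is to count consecutive state pairs per station and normalize each row of counts. The sketch below assumes each station's history is already a time-ordered list of state labels; the helper names and the self-loop fallback for states that were never observed are assumptions of this example, not necessarily the project's behavior.

```python
from collections import Counter

STATES = ["EMPTY", "LOW", "MEDIUM", "HIGH"]


def count_transitions(histories):
    """Count (previous, next) state pairs across all stations."""
    counts = Counter()
    for sequence in histories.values():
        for prev, nxt in zip(sequence, sequence[1:]):
            counts[(prev, nxt)] += 1
    return counts


def transition_matrix(counts, states=STATES):
    """Row-normalize transition counts into probabilities."""
    matrix = {}
    for a in states:
        row_total = sum(counts[(a, b)] for b in states)
        if row_total == 0:
            # Assumption: a state never seen as a source stays where it is.
            matrix[a] = {b: 1.0 if b == a else 0.0 for b in states}
        else:
            matrix[a] = {b: counts[(a, b)] / row_total for b in states}
    return matrix


# Toy histories (not real Divvy data), one time-ordered list per station.
histories = {
    "A1": ["LOW", "LOW", "MEDIUM", "HIGH"],
    "B2": ["EMPTY", "LOW", "LOW", "MEDIUM"],
}
counts = count_transitions(histories)
P = transition_matrix(counts)
```

In this toy data LOW is followed twice by LOW and twice by MEDIUM, so the LOW row comes out as 0.5 and 0.5. Transitions are counted within each station only, never across the boundary between two stations' histories.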

📊 Outputs Generated

After running the pipeline, the following outputs are created:

  • transition_counts_real.csv
    → Frequency of state-to-state transitions (PROC FREQ equivalent)

  • transition_probs_real.csv
    → Learned Markov transition matrix

  • simulated_occupancy.csv
    → State distribution over time (simulation result)

  • simulated_occupancy.png
    → Visualization of state evolution

  • report.md
    → Auto-generated analytical summary

These outputs are ready for dashboards, forecasting, and decision support.
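To make the occupancy output concrete, here is a minimal Monte Carlo sketch that advances a set of stations through a transition matrix and records the share of stations in each state at every step. The matrix values, the function names, and the CSV column layout are illustrative assumptions; the real `simulated_occupancy.csv` may be structured differently.

```python
import csv
import io
import random

STATES = ["EMPTY", "LOW", "MEDIUM", "HIGH"]

# Illustrative matrix only; it was not learned from Divvy data.
P = {
    "EMPTY":  {"EMPTY": 0.6, "LOW": 0.4, "MEDIUM": 0.0, "HIGH": 0.0},
    "LOW":    {"EMPTY": 0.1, "LOW": 0.6, "MEDIUM": 0.3, "HIGH": 0.0},
    "MEDIUM": {"EMPTY": 0.0, "LOW": 0.2, "MEDIUM": 0.6, "HIGH": 0.2},
    "HIGH":   {"EMPTY": 0.0, "LOW": 0.0, "MEDIUM": 0.3, "HIGH": 0.7},
}


def simulate_occupancy(matrix, start_states, steps, seed=0):
    """Simulate every station forward and record state shares per step."""
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    current = list(start_states)
    rows = []
    for t in range(steps + 1):
        n = len(current)
        rows.append({"step": t, **{s: current.count(s) / n for s in STATES}})
        if t == steps:
            break
        # Draw each station's next state from the row of its current state.
        current = [
            rng.choices(STATES, weights=[matrix[s][nxt] for nxt in STATES])[0]
            for s in current
        ]
    return rows


def to_csv(rows):
    """Serialize occupancy rows to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["step"] + STATES)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


occupancy = simulate_occupancy(P, ["LOW"] * 50 + ["HIGH"] * 50, steps=12)
csv_text = to_csv(occupancy)
```

Writing `csv_text` to a file produces a table with one row per time step, which is a convenient shape for Power BI or Excel line charts.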


▶️ How to Run

1. Install dependencies

pip install -r requirements.txt

2. Collect real-world data (run several times, spaced apart, so each station has multiple snapshots to form transitions)

python src/collect_divvy.py

3. Run the end-to-end pipeline

python src/run_end_to_end.py
