End-to-end analysis of NYC Yellow Taxi trip data using PySpark, covering data cleaning, EDA, feature engineering, demand patterns, and fare prediction modeling.
Updated Jan 4, 2026 - Jupyter Notebook
Real-world Markov Process Simulation using Divvy Bike Data (GBFS), demonstrating SAS-to-Python migration for operational analytics and forecasting.
FMCSA carrier data extraction tool.
Difference-in-Differences analysis of bus-lane policies and ridership trends in Israel.
Predictive model that forecasts hourly traffic volumes at different road junctions from historical traffic data.
Power BI dashboard analysing rail operations: ticket sales, delays, revenue, and customer behaviour.
Apache Airflow ETL pipeline that consolidates multi-format toll road traffic data (CSV, TSV, fixed-width) into a unified, transformed dataset using BashOperators and scheduled workflows.
🚖 Analyze NYC Yellow Taxi trip data for fare prediction and demand insights using PySpark, enabling efficient data processing and accurate modeling.