brown-ccv/oscar-gpu-storage-forecasting

OSCAR GPU & Storage Capacity Forecasting

R-based analysis for forecasting GPU demand and storage capacity needs using historical SLURM job data and polynomial regression.

Repository Structure

oscar-gpu-storage-forecasting/
├── README.md
├── src/
│   └── gpu_needs_forecast.R
└── data/
    ├── slurm_gpu_daily_counts_2021-2023.csv
    └── slurm_gpu_daily_counts_2023-2025.csv

Data

SLURM GPU Job Data

The CSV files in data/ contain daily GPU job metrics extracted from SLURM:

Column                 Description
---------------------  ---------------------------------------
job_date               Date of the jobs
gpu_jobs_count         Number of GPU jobs submitted
total_gpus_requested   Total GPUs requested across all jobs
avg_gpus_per_job       Average GPUs requested per job
max_gpus_requested     Maximum GPUs requested in a single job
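
As a rough illustration of how these columns might be loaded, the sketch below reads one or more CSV exports and stacks them into a single data frame. The helper names read_gpu_counts and combine_gpu_counts are hypothetical and do not come from the analysis script. The sketch also assumes job_date is stored as YYYY-MM-DD and that the two files may overlap in 2023, so repeated dates are dropped.

```r
# Hypothetical helpers (not taken from src/gpu_needs_forecast.R).
# Assumption: job_date is ISO formatted (YYYY-MM-DD); change `format` if not.
read_gpu_counts <- function(path) {
  df <- read.csv(path, stringsAsFactors = FALSE)
  df$job_date <- as.Date(df$job_date, format = "%Y-%m-%d")
  df
}

combine_gpu_counts <- function(paths) {
  # Stack all files row-wise into one data frame.
  combined <- do.call(rbind, lapply(paths, read_gpu_counts))
  # Both file names mention 2023, so drop repeated dates defensively
  # (keeps the first occurrence of each date).
  combined <- combined[!duplicated(combined$job_date), ]
  # Return rows in chronological order.
  combined[order(combined$job_date), ]
}
```

Keeping the first occurrence of a duplicated date is an arbitrary choice here; if the two exports disagree for the same day, that would need to be resolved by hand.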

Storage Data

Storage capacity data (2020-2026) is hardcoded in the analysis script (src/gpu_needs_forecast.R) rather than read from a file.

Analysis Overview

GPU Demand Forecasting

  • Aggregates daily job counts to yearly median values
  • Fits a quadratic polynomial regression model
  • Generates predictions for future years with confidence intervals
  • Calculates GPU capacity ratios to inform hardware planning
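
The first three steps above can be sketched in base R as follows. This is a minimal sketch, not the script's actual code: the daily data is synthetic, the choice of gpu_jobs_count as the aggregated column is an assumption, and base R stands in for the dplyr and lubridate calls the script may use.

```r
# Synthetic stand-in for the daily SLURM data; values are illustrative only.
set.seed(42)
dates <- seq(as.Date("2021-01-01"), as.Date("2024-12-31"), by = "day")
year_num <- as.integer(format(dates, "%Y"))
gpu_daily <- data.frame(
  job_date = dates,
  gpu_jobs_count = rpois(length(dates), lambda = 200 + 60 * (year_num - 2021)^2)
)

# Step 1: collapse daily counts to one median per calendar year.
gpu_daily$year <- as.integer(format(gpu_daily$job_date, "%Y"))
yearly <- aggregate(gpu_jobs_count ~ year, data = gpu_daily, FUN = median)

# Step 2: fit a quadratic polynomial in year.
fit <- lm(gpu_jobs_count ~ poly(year, 2), data = yearly)

# Step 3: predict future years with 95% confidence intervals.
future <- data.frame(year = 2025:2027)
forecast <- cbind(
  future,
  predict(fit, newdata = future, interval = "confidence", level = 0.95)
)
print(forecast)
```

With only a handful of yearly medians, a quadratic fit leaves very few residual degrees of freedom, so the intervals widen quickly as the forecast moves away from the observed years.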

Storage Capacity Forecasting

  • Compares linear and polynomial regression models
  • Evaluates model fit and selects the best model
  • Produces predictions with uncertainty estimates
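
One way the comparison above could look is sketched below. The storage figures are invented placeholders (the real values live in the analysis script), and the selection criteria shown here (adjusted R-squared, AIC, and an F-test between nested models) are assumptions, since the criteria the script uses are not stated.

```r
# Placeholder storage figures: units and values are invented for illustration.
storage <- data.frame(
  year = 2020:2026,
  capacity = c(1.0, 1.4, 1.7, 2.6, 3.3, 4.6, 5.7)
)

# Candidate models: straight line vs. quadratic polynomial in year.
linear_fit <- lm(capacity ~ year, data = storage)
quad_fit <- lm(capacity ~ poly(year, 2), data = storage)

# Side-by-side fit metrics (higher adjusted R-squared and lower AIC are better).
comparison <- data.frame(
  model = c("linear", "quadratic"),
  adj_r2 = c(summary(linear_fit)$adj.r.squared, summary(quad_fit)$adj.r.squared),
  aic = c(AIC(linear_fit), AIC(quad_fit))
)
print(comparison)

# The linear model is nested in the quadratic one, so an F-test also applies.
print(anova(linear_fit, quad_fit))

# Pick the lower-AIC model and predict with 95% prediction intervals.
best_fit <- if (AIC(quad_fit) < AIC(linear_fit)) quad_fit else linear_fit
future <- data.frame(year = 2027:2028)
projection <- predict(best_fit, newdata = future, interval = "prediction")
print(projection)
```

Prediction intervals are used here because they describe the uncertainty of a future observation; interval = "confidence" would instead bound the fitted trend itself.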

Visualizations

The analysis generates:

  • GPU demand plot: Historical yearly medians with regression line and future predictions
  • Storage capacity plot: Historical storage growth with trend line and projections
  • Residual diagnostics: Model comparison plots for evaluating fit quality

Dependencies

install.packages(c("ggplot2", "dplyr", "lubridate", "gridExtra"))

Usage

  1. Open src/gpu_needs_forecast.R in R or RStudio

  2. Update file paths: the script contains hardcoded paths that must be changed to match your local setup. Edit the read.csv() calls near the top of the script so they point to the CSV files in your data/ directory.

  3. Run the script to generate forecasts and visualizations
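
For step 2, one option is to replace the hardcoded paths with paths built relative to the repository root, as in the sketch below. The variable names data_dir, gpu_files, gpu_2021_2023, and gpu_2023_2025 are hypothetical, and the sketch assumes the R working directory is the repository root.

```r
# Hypothetical replacement for hardcoded absolute paths.
# Assumption: the working directory is the repository root.
data_dir <- "data"
gpu_files <- file.path(data_dir, c(
  "slurm_gpu_daily_counts_2021-2023.csv",
  "slurm_gpu_daily_counts_2023-2025.csv"
))

# Report missing inputs up front instead of failing inside read.csv().
missing_files <- gpu_files[!file.exists(gpu_files)]
if (length(missing_files) > 0) {
  message("Missing input files: ", paste(missing_files, collapse = ", "))
} else {
  gpu_2021_2023 <- read.csv(gpu_files[1])
  gpu_2023_2025 <- read.csv(gpu_files[2])
}
```

In RStudio, the working directory can be checked with getwd() and changed with setwd() before running the script.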

Outputs

Running the analysis produces:

  • Model summary statistics printed to the console
  • GPU demand forecast visualization
  • Storage capacity forecast visualization
  • Model comparison metrics and residual plots
