Description
What are the limitations of the current standard?
Since GFSv16, the global-workflow has been moving from linking within the DATA directory to copying all needed files to and from COM (see bugzilla#1301 and NOAA-EMC/global-workflow#712 for examples). As a result, many ensemble and data assimilation jobs now run much longer because of the time it takes to copy 80 ensemble members' worth of files.
What changes are being proposed to the current standard?
A temporary COM space on the DATAROOT disk is proposed. The ensemble member files for a particular cycle would be staged there, and each individual job would then link to that temporary space from within its DATA directory. This confines the links to a single disk, rather than spanning disks from DATAROOT to COM, while also considerably reducing job run time.
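The staging pattern described above could be sketched roughly as follows. This is only an illustration of the idea, not code from the global-workflow: the function names `stage_to_temp_com` and `link_into_data` and the directory layout are hypothetical, and a real implementation would need to follow the workflow's own COM and DATA conventions.

```python
import shutil
from pathlib import Path


def stage_to_temp_com(com_dir: Path, temp_com_dir: Path) -> list[Path]:
    """Copy every file under com_dir into temp_com_dir once, keeping the layout.

    Hypothetical helper: temp_com_dir stands in for the proposed temporary
    COM space on the DATAROOT disk.
    """
    staged = []
    for src in sorted(p for p in com_dir.rglob("*") if p.is_file()):
        dest = temp_com_dir / src.relative_to(com_dir)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # the only copy; done once per cycle
        staged.append(dest)
    return staged


def link_into_data(temp_com_dir: Path, data_dir: Path) -> list[Path]:
    """Symlink each staged file into a job's DATA directory instead of copying.

    Hypothetical helper: data_dir stands in for one job's DATA directory,
    assumed to live on the same disk as temp_com_dir.
    """
    links = []
    for src in sorted(p for p in temp_com_dir.rglob("*") if p.is_file()):
        link = data_dir / src.relative_to(temp_com_dir)
        link.parent.mkdir(parents=True, exist_ok=True)
        if link.is_symlink() or link.exists():
            link.unlink()  # replace a stale link or file from an earlier run
        link.symlink_to(src.resolve())
        links.append(link)
    return links
```

In this sketch the copy out of COM happens once per cycle, and every job that needs the member files afterwards only creates links, which is where the time savings would come from.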
How do these changes improve the current standard?
It gives the workflows more flexibility in how they stage data and makes several jobs more efficient by replacing repeated copies with links.
What are potential impacts by changing this standard?
Systems with high data volume, including GFS and RRFS, would benefit from this change.
How will this standard be enforced?
As part of the SPA implementation code review after code handoff.
Additional context or notes (optional)
@JacobCarley-NOAA @DavidHuber-NOAA @JessicaMeixner-NOAA @RuiyuSun - Please add details that I may have missed