Conversation

@chunhuazhou
Collaborator

DESCRIPTION OF CHANGES:

The ensemble prep_ic is taking too much time for the real-time runs on Ursa (as long as ~10 min per member in some cases), and it even causes dead prep_ic jobs after finishing some members. This PR adds the capability to run prep_ic in parallel across ensemble members, which significantly reduces the run time to less than 5 min for all members. This has been tested in retro mode and is now in the real-time run.
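For reference, a minimal sketch of this approach, assuming one copy command is written per member into a command file and then throttled with xargs (the variable and path names here are illustrative, not necessarily the exact ones used in this PR):

#!/bin/bash
# Sketch only: build one command file with one prep_ic copy command per
# ensemble member, then let xargs keep up to NTASKS copies running at once.
CMDFILE="${DATA}/cmdfile_prep_ic"     # illustrative name
: > "${CMDFILE}"
for imem in $(seq 1 "${ENS_SIZE}"); do
  memdir=$(printf "mem%03d" "${imem}")
  echo "cp -p ${source_ic_dir}/${memdir}/init.nc ${umbrella_dir}/${memdir}/init.nc" >> "${CMDFILE}"
done
xargs -I {} -P "${NTASKS}" sh -c '{}' < "${CMDFILE}"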

@chunhuazhou chunhuazhou requested a review from hu5970 January 21, 2026 18:12
datadep_prod = f'''\n <datadep age="00:05:00"><cyclestr offset="-{cyc_interval}:00:00">&COMROOT;/&NET;/&rrfs_ver;/&RUN;.@Y@m@d/@H/fcst/&WGF;/</cyclestr><cyclestr>mpasout.@Y-@m-@d_@H.00.00.nc</cyclestr></datadep>'''

datadep_spinup = f'''\n <taskdep task="fcst_spinup" cycle_offset="-1:00:00"/>'''
if spinup_mode == 0: # no parallel spinup cycles
Contributor
Do you know what this is for?

Contributor

@guoqing-noaa guoqing-noaa Jan 22, 2026
@hu5970 We use one prep_ic task for the following spin-up situations:

cold, spinup_mode = 0    # regular prod cycles, i.e., no spin-up cycles in an experiment

cold, spinup_mode = 1    # spin-up, cold start
warm, spinup_mode = -1   # spin-up, continue cycling

warm, spinup_mode = -1, prod_switch   # prod switching from spinup
warm, spinup_mode = -1, regular       # prod parallel to spinup, continue cycling

Check this slide for more details:
https://docs.google.com/presentation/d/1HPx2LzX8Hf9Imztl4OpyXdyTTWoN4hBOId4AP7bhyYM/edit?slide=id.g332010f5a43_4_374#slide=id.g332010f5a43_4_374

Contributor

spinup_mode needs more discussion. We learned from RRFSv1 that the cycle mode could be very complex, and we need a good parameter to control all the possible cycle tasks: det spin-up, det prod, enkf prod, enkf spin-up, blending, etc. Let's think about whether we can define a cycle_mode (0-99) to help.

Contributor

This workflow is different from RRFSv1, as it will always support coldstart-only forecast experiments and non-spinup experiments. I think we are good for now. We can definitely talk more when we run into any issues.

#
echo "===== CMDFILE ====="
cat "$CMDFILE"
xargs -I {} -P "${SLURM_NTASKS}" sh -c '{}' < "${CMDFILE}"
Contributor

SLURM_NTASKS is a SLURM-specific variable. Is there a better way to set the number of parallel cores?

Contributor

@guoqing-noaa guoqing-noaa Jan 22, 2026

@chunhuazhou NTASKS is expected to be an environment variable defined by the job card.
In the Rocoto workflow, it is already defined in launch.sh:

export NTASKS=${SLURM_NTASKS}
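A hedged sketch of a scheduler-agnostic fallback for NTASKS (the PBS branch is an assumption, not from launch.sh or this PR):

# Sketch: derive NTASKS from whichever scheduler populated the environment.
if [[ -n "${SLURM_NTASKS:-}" ]]; then
  export NTASKS="${SLURM_NTASKS}"
elif [[ -n "${PBS_NODEFILE:-}" ]]; then
  export NTASKS=$(wc -l < "${PBS_NODEFILE}")   # PBS: typically one line per allocated task slot
else
  export NTASKS=1                              # serial fallback
fi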

Contributor

Also, if we want to move forward with running the serial copies in parallel, we would want to be generic, not binding all the logic to SLURM only. We already have a generic rank_run tool in rrfs-workflow that can be used. I can help with this.

But at this moment, I think 30 copies may work well if we use --exclusive or request sufficient memory for each task.
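For illustration, the kind of job-card settings this suggests (hypothetical values, not taken from this PR):

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --exclusive       # option 1: take the whole node for the parallel copies
##SBATCH --mem=64G        # option 2: or request an explicit memory amount instead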

Contributor

Should use "NTASKS" here.

Contributor

rank_run needs an extra library. Using xargs should be good enough for now.

Contributor

@hu5970 What do you mean rank_run needs an extra library?

We should not develop a script that works only with SLURM.

Contributor

xargs is a Linux command, not tied to SLURM. It works on WCOSS2. Also, WCOSS2 has its own way to run parallel commands within one node.

Contributor

Ah, thanks for the reminder. I did not pay much attention to this part. I had thought it was similar to this:

srun --multi-prog "${CMDFILE}".multi

With that said, xargs cannot distribute tasks across multiple nodes. For example, when we run NA3km or global-15-3km or more ensemble members with few cores, we may need 2 or more nodes to copy files in parallel, because the files are much larger and each copy needs more memory.

rank_run is a simple replacement for NCO's CFP on non-NCO machines. It does real parallelism, can use multiple nodes, and has no other library dependencies.
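For comparison, a hedged SLURM-only sketch that would let the same command file span all allocated nodes (assumes a recent SLURM where srun supports --exact; rank_run remains the scheduler-agnostic option):

# Sketch: run each line of CMDFILE as its own 1-task job step; SLURM may place
# the steps on any allocated node, and -P limits how many run concurrently.
xargs -I {} -P "${NTASKS}" srun --exact -N1 -n1 sh -c '{}' < "${CMDFILE}"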


exit 0
done
#
Contributor

This runs after the surface cycle, right? If so, the surface cycle will fail.

@guoqing-noaa
Contributor

guoqing-noaa commented Jan 22, 2026

@chunhuazhou Currently, the ensemble workflow will launch 30 prep_ic tasks, right? If so, I think it should be as fast as the changes proposed in this PR.

I guess the current issue (including the dead tasks) is that the copy process needs sufficient memory to complete successfully, but prep_ic does not request sufficient memory or exclusive access on Ursa (Ursa is very aggressive about sharing a node among different tasks as much as possible).

@chunhuazhou chunhuazhou marked this pull request as draft January 22, 2026 14:36
@hu5970
Contributor

hu5970 commented Jan 22, 2026

Using one node to copy the files is a waste of resources. We need to avoid such a setup.

@guoqing-noaa
Contributor

guoqing-noaa commented Jan 22, 2026

Using one node to copy the files is a waste of resources. We need to avoid such a setup.

@hu5970 We don't have to use one node to do the copy. We can just request enough memory, as the changes in this PR do, and then let SLURM/PBS manage how to allocate resources efficiently.

Also, I think we initially planned to let prep_ic do more tasks, such as surface updating/soilSurgery, etc. So using one node is not that bad in practice, especially if it runs fast.

@guoqing-noaa
Contributor

There is one drawback to "manually" doing 30 parallel copies on 1 or 2 nodes. Different HPCs have different core and memory counts per node. 30 parallel "manual" copies may work on Ursa (192 cores) but may NOT work on Hera (40 cores).

@hu5970
Contributor

hu5970 commented Jan 22, 2026

There is one drawback to "manually" doing 30 parallel copies on 1 or 2 nodes. Different HPCs have different core and memory counts per node. 30 parallel "manual" copies may work on Ursa (192 cores) but may NOT work on Hera (40 cores).

We do not need to worry about this. xargs runs commands in parallel up to the number given after -P, and then runs the remaining commands as the first set finishes. So just set the number after -P to the number of cores per node.
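A toy demo of that throttling behavior (illustrative commands only, not the actual prep_ic command file):

# With -P 3, xargs keeps at most 3 jobs running at once and starts the
# remaining ones as earlier jobs finish.
seq 1 8 | xargs -I {} -P 3 sh -c 'echo "start member {}"; sleep 1; echo "done member {}"'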

@hu5970
Contributor

hu5970 commented Jan 22, 2026

Using one node to copy the files is a waste of resources. We need to avoid such a setup.

@hu5970 We don't have to use one node to do the copy. We can just request enough memory, as the changes in this PR do, and then let SLURM/PBS manage how to allocate resources efficiently.

Also, I think we initially planned to let prep_ic do more tasks, such as surface updating/soilSurgery, etc. So using one node is not that bad in practice, especially if it runs fast.

Those surface update and soil surgery steps are all small single-core programs, and most of them actually run for the deterministic cycle only.

export CMDFILE="${DATA}/script_prep_ic_0.sh"
fi

mkdir -p "$(dirname "$CMDFILE")"
Contributor

${DATA} should always be available in the ex-script; no need to mkdir -p here.

fi

exit 0
done
Contributor

We need to add two spaces to indent lines 42-171 correctly.

for memdir in "${mem_list[@]}"; do
# Determine path
if [[ ${#memdir} -gt 1 ]]; then
umbrella_prep_ic_data="${UMBRELLA_PREP_IC_DATA}${memdir}"
Contributor

Suggest changing umbrella_prep_ic_data to umbrella_prep_ic_mem to better distinguish it from UMBRELLA_PREP_IC_DATA

if [[ "${ENS_SIZE:-0}" -gt 2 ]]; then
mapfile -t mem_list < <(printf "/mem%03d\n" $(seq 1 "$ENS_SIZE"))
else
mem_list=("/") # if deterministic
Contributor

@guoqing-noaa guoqing-noaa Jan 22, 2026

This will create a double-slash situation in line 64:
thisfile=${COMINrrfs}/${RUN}.${PDY}/${cyc}/ic/${WGF}${memdir}/init.nc
It will generate something like ..../ic/det//init.nc, which should be avoided per the NCO standard.

if [[ ${#memdir} -gt 1 ]]; then
umbrella_prep_ic_data="${UMBRELLA_PREP_IC_DATA}${memdir}"
mkdir -p "${COMOUT}/prep_ic/${WGF}${memdir}"
pid=$((10#${memdir: -2}-1))
Contributor

I would think it is more straightforward to generate a list of member numbers first and add the "/" when it is needed.
Let me try a case and post my example here.

# Determine path
if [[ ${#memdir} -gt 1 ]]; then
umbrella_prep_ic_data="${UMBRELLA_PREP_IC_DATA}${memdir}"
mkdir -p "${COMOUT}/prep_ic/${WGF}${memdir}"
Contributor

Why do we create COMOUT directories for each member's prep_ic? Will we save data there?

: > "$CMDFILE"

# Create directory safely
mkdir -p "${umbrella_prep_ic_data}"
Contributor

@guoqing-noaa guoqing-noaa Jan 22, 2026

@chunhuazhou FYI, here is an alternate example for lines 34-57:

if (( "${ENS_SIZE:-0}" > 1 )); then
  mapfile -t mem_list < <(printf "%03d\n" $(seq 1 "$ENS_SIZE"))
else
  mem_list=("000") # if determinitic
fi

for index in "${mem_list[@]}"; do
  # Determine path
  if (( 10#${index} == 0 )); then
    memdir=""
    umbrella_prep_ic_mem="${UMBRELLA_PREP_IC_DATA}"
    export CMDFILE="${DATA}/script_prep_ic_0.sh"
  else
    memdir="/mem${index}"
    umbrella_prep_ic_mem="${UMBRELLA_PREP_IC_DATA}${memdir}"
    mkdir -p "${umbrella_prep_ic_mem}"
    pid=$((10#${index}-1))
    export CMDFILE="${DATA}/script_prep_ic_${pid}.sh"
  fi
  echo $CMDFILE, $memdir
done

@@ -81,6 +109,7 @@ fi
#
Contributor

The surface cycle has no relation to the background copy. Better to separate them into two sections?
