Bug report using gcc and impi on NOAA hera system #123

@thomas-robinson

While running e4s-cl init, I received an error saying that it was an e4s-cl bug and that I should report the contents of the debug file on GitHub. The contents of that file are pasted below:

$ cat /home/Thomas.Robinson/.local/e4s_cl/logs/debug_log
 [Debug root:483] 
########################################################################################################################################################
E4S CONTAINER LAUNCHER LOGGING INITIALIZED

Timestamp         : 2024-09-03 13:17:23.794833
Hostname          : hfe05
Platform          : Linux-4.18.0-477.27.1.el8_8.88ciq_lts.0.1.x86_64-x86_64-with-glibc2.28
Version           : 1.0.5.dev1+g35e5e6a
Python Version    : 3.12.4
Working Directory : /scratch2/GFDL/e4s/Thomas.Robinson/containers
Terminal Size     : 152x32
Frozen            : False
Log ID            : 0ad97a938a1a609713e75e9db3edb9d99134fa24084f2412fc43ef7cb0037359
########################################################################################################################################################

[Debug e4s_cl.cli.commands.__main__:77] e4s-cl args: Namespace(command='init', options=['--profile', 'gfdl2024.01', '--launcher', 'srun', '--backend', 'singularity', '--image', '/scratch2/GFDL/e4s/Thomas.Robinson/containers/gfdlsoftware_2024.01-gcc13.sif'], dry_run=None)
[Debug e4s_cl.cli.commands.init:77] e4s-cl init args: Namespace(profile_name='gfdl2024.01', launcher='/apps/slurm/default/bin/srun', backend='singularity', image='/scratch2/GFDL/e4s/Thomas.Robinson/containers/gfdlsoftware_2024.01-gcc13.sif', cmd=[])
[Debug e4s_cl.cf.storage.local_file:50] '/home/Thomas.Robinson/.local/e4s_cl/user.json' opened read-write
[Debug e4s_cl.cf.storage.local_file:170] Initialized user database '/home/Thomas.Robinson/.local/e4s_cl/user.json'
[+] Tracing MPI execution using:
[+] '/apps/slurm/default/bin/srun /scratch2/GFDL/e4s/bin/conda/bin/e4s-cl-mpi-tester'
[Debug e4s_cl.cli.commands.profile.detect:77] e4s-cl profile detect args: Namespace(profile_name=None, cmd=['/apps/slurm/default/bin/srun', '/scratch2/GFDL/e4s/bin/conda/bin/e4s-cl-mpi-tester'])
[Debug e4s_cl.util:211] Running with parent status: ['/apps/slurm/default/bin/srun', '/scratch2/GFDL/e4s/bin/conda/bin/python', '/scratch2/GFDL/e4s/bin/bin/e4s-cl', 'profile', 'detect', '/scratch2/GFDL/e4s/bin/conda/bin/e4s-cl-mpi-tester']
Failed to determine necessary libraries: program exited with code 1
[+] Attach <PtraceProcess #2397590> to debugger
[+] Set <PtraceProcess #2397590> options to 1
[+] Created profile gfdl2024.01
[Debug root:483] 
########################################################################################################################################################
E4S CONTAINER LAUNCHER LOGGING INITIALIZED

Timestamp         : 2024-09-03 13:18:38.549017
Hostname          : h11c53
Platform          : Linux-4.18.0-477.27.1.el8_8.88ciq_lts.0.1.x86_64-x86_64-with-glibc2.28
Version           : 1.0.5.dev1+g35e5e6a
Python Version    : 3.12.4
Working Directory : /scratch2/GFDL/e4s/Thomas.Robinson/containers
Terminal Size     : 152x32
Frozen            : False
Log ID            : befd6bd2404fe811dc2d9f4e42d7d451d4c2ba762934fc55b94067328a6687f0
########################################################################################################################################################

[Debug e4s_cl.cli.commands.__main__:77] e4s-cl args: Namespace(command='init', options=['--profile', 'gfdl2024.01', '--launcher', 'srun', '--backend', 'singularity', '--image', '/scratch2/GFDL/e4s/Thomas.Robinson/containers/gfdlsoftware_2024.01-gcc13.sif'], dry_run=None)
[Debug e4s_cl.cli.commands.init:77] e4s-cl init args: Namespace(profile_name='gfdl2024.01', launcher='/apps/slurm/default/bin/srun', backend='singularity', image='/scratch2/GFDL/e4s/Thomas.Robinson/containers/gfdlsoftware_2024.01-gcc13.sif', cmd=[])
[Debug e4s_cl.cf.storage.local_file:50] '/home/Thomas.Robinson/.local/e4s_cl/user.json' opened read-write
[Debug e4s_cl.cf.storage.local_file:170] Initialized user database '/home/Thomas.Robinson/.local/e4s_cl/user.json'
[+] Tracing MPI execution using:
[+] '/apps/slurm/default/bin/srun /scratch2/GFDL/e4s/bin/conda/bin/e4s-cl-mpi-tester'
[Debug e4s_cl.cli.commands.profile.detect:77] e4s-cl profile detect args: Namespace(profile_name=None, cmd=['/apps/slurm/default/bin/srun', '/scratch2/GFDL/e4s/bin/conda/bin/e4s-cl-mpi-tester'])
[Debug e4s_cl.util:211] Running with parent status: ['/apps/slurm/default/bin/srun', '/scratch2/GFDL/e4s/bin/conda/bin/python', '/scratch2/GFDL/e4s/bin/bin/e4s-cl', 'profile', 'detect', '/scratch2/GFDL/e4s/bin/conda/bin/e4s-cl-mpi-tester']
[+] Attach <PtraceProcess #1470539> to debugger
[+] Set <PtraceProcess #1470539> options to 1
[+] Created profile gfdl2024.01
[Debug root:483] 
########################################################################################################################################################
E4S CONTAINER LAUNCHER LOGGING INITIALIZED

Timestamp         : 2024-09-03 13:20:20.583304
Hostname          : h11c53
Platform          : Linux-4.18.0-477.27.1.el8_8.88ciq_lts.0.1.x86_64-x86_64-with-glibc2.28
Version           : 1.0.5.dev1+g35e5e6a
Python Version    : 3.12.4
Working Directory : /scratch2/GFDL/e4s/Thomas.Robinson/containers
Terminal Size     : 152x32
Frozen            : False
Log ID            : 0762531179c2b4d0051837a22b8316642f05373133dfabc70dca9d4f1093cea8
########################################################################################################################################################

[Debug e4s_cl.cli.commands.__main__:77] e4s-cl args: Namespace(command='init', options=['--profile', 'gfdl2024.01', '--launcher', 'srun', '--backend', 'singularity', '--image', '/scratch2/GFDL/e4s/Thomas.Robinson/containers/gfdlsoftware_2024.01-gcc13.sif'], dry_run=None)
[Debug e4s_cl.cli.commands.init:77] e4s-cl init args: Namespace(profile_name='gfdl2024.01', launcher='/apps/slurm/default/bin/srun', backend='singularity', image='/scratch2/GFDL/e4s/Thomas.Robinson/containers/gfdlsoftware_2024.01-gcc13.sif', cmd=[])
[Debug e4s_cl.cf.storage.local_file:50] '/home/Thomas.Robinson/.local/e4s_cl/user.json' opened read-write
[Debug e4s_cl.cf.storage.local_file:170] Initialized user database '/home/Thomas.Robinson/.local/e4s_cl/user.json'
[+] Tracing MPI execution using:
[+] '/apps/slurm/default/bin/srun /scratch2/GFDL/e4s/bin/conda/bin/e4s-cl-mpi-tester'
[Debug e4s_cl.cli.commands.profile.detect:77] e4s-cl profile detect args: Namespace(profile_name=None, cmd=['/apps/slurm/default/bin/srun', '/scratch2/GFDL/e4s/bin/conda/bin/e4s-cl-mpi-tester'])
[Debug e4s_cl.util:211] Running with parent status: ['/apps/slurm/default/bin/srun', '/scratch2/GFDL/e4s/bin/conda/bin/python', '/scratch2/GFDL/e4s/bin/bin/e4s-cl', 'profile', 'detect', '/scratch2/GFDL/e4s/bin/conda/bin/e4s-cl-mpi-tester']
Failed to determine necessary libraries: program exited with code 156
[+] Attach <PtraceProcess #1470679> to debugger
[+] Set <PtraceProcess #1470679> options to 1
[+] Created profile gfdl2024.01

Here are the modules I have loaded:

$ module list
Currently Loaded Modules:
  1) gnu/9.2.0   2) impi/2020

My container is using gcc 13 and mpich installed with spack.
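
The debug log above contains three separate runs, and only two of them include a "Failed to determine necessary libraries" line. To compare them at a glance, here is a minimal sketch that pulls the timestamp, hostname, and reported exit code out of each run. It is not part of e4s-cl: the function name `summarize_runs` and the regular expressions are assumptions based only on the log text pasted in this report, and `SAMPLE` is an excerpt of those same lines.

```python
import re

# Patterns inferred from the pasted debug log; they are assumptions about
# that text, not a documented or stable e4s-cl log format.
HEADER = "E4S CONTAINER LAUNCHER LOGGING INITIALIZED"
FIELD = re.compile(r"^(Timestamp|Hostname)\s*:\s*(.+?)\s*$")
FAILURE = re.compile(r"program exited with code (\d+)")


def summarize_runs(log_text):
    """Return one dict per logged run.

    Each dict has 'timestamp', 'hostname', and 'exit_code'. 'exit_code' stays
    None when the run has no "program exited with code N" line.
    """
    runs = []
    for line in log_text.splitlines():
        if HEADER in line:
            # A header line starts a new run.
            runs.append({"timestamp": None, "hostname": None, "exit_code": None})
            continue
        if not runs:
            continue  # ignore anything before the first header
        field = FIELD.match(line)
        if field:
            runs[-1][field.group(1).lower()] = field.group(2)
            continue
        failure = FAILURE.search(line)
        if failure:
            runs[-1]["exit_code"] = int(failure.group(1))
    return runs


# Excerpt of the log lines quoted in this report.
SAMPLE = """\
E4S CONTAINER LAUNCHER LOGGING INITIALIZED
Timestamp         : 2024-09-03 13:17:23.794833
Hostname          : hfe05
Failed to determine necessary libraries: program exited with code 1
E4S CONTAINER LAUNCHER LOGGING INITIALIZED
Timestamp         : 2024-09-03 13:18:38.549017
Hostname          : h11c53
E4S CONTAINER LAUNCHER LOGGING INITIALIZED
Timestamp         : 2024-09-03 13:20:20.583304
Hostname          : h11c53
Failed to determine necessary libraries: program exited with code 156
"""

for run in summarize_runs(SAMPLE):
    print(run)
```

To run it against the full file, read /home/Thomas.Robinson/.local/e4s_cl/logs/debug_log and pass its text to `summarize_runs` in place of `SAMPLE`.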
