We generally recommend ssh-based, rather than mpirun-based, options when launching Chapel programs using conduits other than mpi (like ofi or ibv). The reason is that MPI can consume resources in ways that either hurt Chapel performance or lock GASNet out of the network entirely. For example, we've had a few users on Omnipath networks hit the error:
*** FATAL ERROR (proc 0): in gasnetc_ofi_init() at /third-party/gasnet/gasnet-src/ofi-conduit/gasnet_ofi.c:1336: fi_endpoint for rdma failed: -22(Invalid argument)
Again, if using the ssh-spawner is an option, that is often the most straightforward path forward to avoid MPI overheads. However, if it is not an option for some reason, it is preferable to have mpirun utilize TCP/IP to avoid contention for key network resources.
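As a rough sketch of the ssh-based route, the environment below asks GASNet to spawn over ssh instead of MPI. The variable names are the ones described in GASNet's ssh-spawner and conduit READMEs as far as I know, but verify them against your GASNet version; the host names are hypothetical placeholders.

```shell
# Ask the ofi conduit to use the ssh spawner rather than MPI.
# (For the ibv conduit, the analogous variable is GASNET_IBV_SPAWNER.)
export GASNET_OFI_SPAWNER=ssh

# Nodes to launch on. "node01" and "node02" are hypothetical host names;
# substitute hosts you can reach with passwordless ssh.
export GASNET_SSH_SERVERS="node01 node02"
```

With these set, the Chapel program is launched as usual (for example `./myProgram -nl 2`, where `myProgram` is a placeholder name).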
In the Build-time Configuration section of GASNet's documentation in https://bitbucket.org/berkeleylab/gasnet/src/master/other/mpi-spawner/README, the GASNet developers list options that can be used at configuration or execution time to request that MPI do this. For example, when using OpenMPI, two options are to:
set OMPI_MCA_btl=tcp,self in the environment (good for a quick check, annoying to have to do every time)
pass --mca btl tcp,self to mpirun, for example by setting it as part of MPIRUN_CMD at GASNet configuration time (see third-party/gasnet/gasnet-src/mpi-conduit/README for more about setting MPIRUN_CMD)
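The two options above might look like this in practice. This is a sketch, not a tested recipe: the `%N` (process count) and `%C` (command) placeholders in `MPIRUN_CMD` are an assumption based on the template format described in GASNet's mpi-spawner README, so check that README for the exact form your version expects.

```shell
# Option 1: set the Open MPI MCA parameter through the environment.
# Quick to try, but has to be set in every session.
export OMPI_MCA_btl=tcp,self

# Option 2: put the flag into the mpirun command template itself.
# Assumption: %N and %C are the placeholders GASNet substitutes;
# the same value can be supplied when configuring GASNet.
export MPIRUN_CMD='mpirun --mca btl tcp,self -np %N %C'
```

Only one of the two is needed. Option 2 is the more durable choice when set at configuration time, since every launch then picks it up without extra setup.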
This issue is here to:
capture a TODO item to update our GASNet-related documentation, particularly where the mpi spawner is mentioned, to cover this concern, the workarounds, and possibly even the error message (to assist with Google searching)
capture the information itself (should someone be searching GitHub issues for the error message)