[Bug]: CHPL_LAUNCHER=mpirun4ofi does not work on MacOS with mpirun (Open MPI) 5.0.6 #26660

Open
jabraham17 opened this issue Feb 5, 2025 · 0 comments

Summary of Problem

While testing the OFI comm layer by running oversubscribed on my Mac, I ran into a problem with the mpirun4ofi launcher.

I set the following environment variables, then tried to run hello world:

  export CHPL_COMM=ofi
  export CHPL_COMM_OFI_OOB=mpi
  export CHPL_LAUNCHER=mpirun4ofi
  export MPI_DIR=/opt/homebrew/opt/open-mpi/
  export CHPL_RT_OVERSUBSCRIBED=yes

Running the program fails because the launcher passes -map-by to mpirun twice:

./a.out -nl 1 -v
FI_SOCKETS_PE_WAITTIME=0 HUGETLB_MORECORE=no mpirun -np 1 -map-by ppr:1:node -map-by node:oversubscribe -bind-to none -mca mpi_yield_when_idle 1 a.out_real -nl 1 -v
--------------------------------------------------------------------------
ERROR: The "map-by" command line option was listed more than once on the command line.
Only one instance of this option is permitted.
Please correct your command line.
--------------------------------------------------------------------------

I was able to work around this by manually launching the _real executable with a single -map-by option:

NUMLOCALES=1
FI_SOCKETS_PE_WAITTIME=0 HUGETLB_MORECORE=no mpirun -map-by ppr:$NUMLOCALES:node:oversubscribe -bind-to none -mca mpi_yield_when_idle 1 a.out_real -nl $NUMLOCALES -v

I think we can adjust the mpirun4ofi launcher to generate this form of the command so that it works with Open MPI on macOS, but I don't know what effect that would have on other platforms (or other versions of MPI).
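
As a rough illustration of the idea (not the launcher's actual code), the sketch below folds the oversubscribe modifier into a single -map-by value, following the manual workaround above. The function name `build_map_by` and its arguments are hypothetical; the only option syntax assumed is the `ppr:N:node:oversubscribe` form that worked with Open MPI 5.0.6 in this report. The command is echoed rather than executed so the result can be inspected first.

```shell
# Hypothetical helper: build one -map-by value instead of passing the option twice.
#   $1 = processes per node
#   $2 = "yes" to append the oversubscribe modifier
build_map_by() {
  ppn=$1
  oversub=$2
  value="ppr:${ppn}:node"
  if [ "$oversub" = "yes" ]; then
    # Append the modifier to the same value rather than adding a second -map-by.
    value="${value}:oversubscribe"
  fi
  printf '%s\n' "$value"
}

NUMLOCALES=1
MAP_BY=$(build_map_by "$NUMLOCALES" "${CHPL_RT_OVERSUBSCRIBED:-no}")

# Print the command that would be run; drop the echo to actually launch.
echo FI_SOCKETS_PE_WAITTIME=0 HUGETLB_MORECORE=no \
  mpirun -map-by "$MAP_BY" -bind-to none -mca mpi_yield_when_idle 1 \
  a.out_real -nl "$NUMLOCALES" -v
```

Whether other MPI implementations or older Open MPI releases accept the combined form is exactly the open question above, so a real fix would likely need to check the mpirun version before choosing between the two spellings.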

Configuration Information

  • Output of chpl --version: 2.4 pre release
  • Output of $CHPL_HOME/util/printchplenv --anonymize:
CHPL_TARGET_PLATFORM: darwin
CHPL_TARGET_COMPILER: llvm
CHPL_TARGET_ARCH: arm64
CHPL_TARGET_CPU: native *
CHPL_LOCALE_MODEL: flat
CHPL_COMM: ofi *
  CHPL_LIBFABRIC: bundled
  CHPL_COMM_OFI_OOB: mpi *
CHPL_TASKS: qthreads
CHPL_LAUNCHER: mpirun4ofi *
CHPL_TIMERS: generic
CHPL_UNWIND: none
CHPL_TARGET_MEM: jemalloc *
CHPL_ATOMICS: cstdlib
  CHPL_NETWORK_ATOMICS: ofi
CHPL_GMP: system *
CHPL_HWLOC: system *
CHPL_RE2: bundled *
CHPL_LLVM: system
CHPL_AUX_FILESYS: none
  • Back-end compiler and version, e.g. gcc --version or clang --version: LLVM 19, clang 14