This post describes how to run the quantum chemistry package ORCA on clusters that run SLURM in native mode, where mpirun is not available.
The problem
ORCA's parallel implementation relies on a particular OpenMPI version (2.0.2), which it invokes through a system call to mpirun. Some clusters do not support mpirun because they run SLURM in native mode: the queueing system not only allocates resources to jobs but also creates the MPI environment, a task that mpirun would otherwise handle. On such a cluster, the error message you might get reads
--------------------------------------------------------------------------
mpirun is not a supported launcher on Cray XC using Native SLURM.
srun must be used to launch jobs on these systems.
--------------------------------------------------------------------------
[file orca_main/gtoint.cpp, line 137]: ORCA finished by error termination in ORCA_GTOInt
Just as the message says, you should use srun (part of SLURM) instead. Alas, the calls to mpirun are hardcoded in ORCA. The solution is a wrapper script named mpirun that calls srun in its place, similar to the suggestions here (requires registration). Because the two launchers take different options, the wrapper also has to convert the arguments: mpirun's -np (number of processes) becomes srun's -n.
The solution
- Create a file named mpirun in a separate folder, e.g. ~/bin, with the following contents:
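#!/bin/sh
# Log the arguments ORCA passes to mpirun.
echo "MPIWRAPPER IN" "${@}"
# Log the converted arguments: a leading -np becomes srun's -n.
echo "MPIWRAPPER OUT" $(echo "${@}" | sed 's/^-np/-n/')
# Replace this process with srun, which launches the MPI tasks.
exec srun --mpi=openmpi $(echo "${@}" | sed 's/^-np/-n/')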
- Make it executable with chmod +x /path/to/file/mpirun.
- In your job script (!), prepend the directory that contains the wrapper (not the wrapper file itself) to PATH:
export PATH="/path/to/file:$PATH"
- Start ORCA without srun in your job script. Otherwise, your job will hang on the first parallelisation branch.
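The sed expression in the wrapper rewrites -np only when it sits at the very start of the argument list. If it could ever appear elsewhere, a per-argument loop would be one alternative. The following is a minimal sketch of that idea, not a tested replacement for the wrapper: translate_args is a hypothetical helper name, and the command line in the example is made up. It only prints the converted arguments and does not call srun.

```shell
#!/bin/sh
# Sketch only: translate_args is a hypothetical helper, not part of ORCA or SLURM.
# It prints its arguments one per line, replacing every standalone -np with -n.
translate_args() {
    for arg in "$@"; do
        case "$arg" in
            -np) printf '%s\n' "-n" ;;
            *)   printf '%s\n' "$arg" ;;
        esac
    done
}

# Example with a made-up command line; the program name is a placeholder.
translate_args -np 4 ./some_mpi_program input.tmp
```

Matching whole arguments rather than a text pattern also leaves options that merely begin with -np untouched.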
For reference, here is an example job script:
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
export CRAY_CUDA_MPS=1
module purge
ulimit -s unlimited
# fix hardcoded mpirun calls
export PATH=/users/USERNAME/bin/:$PATH
# openmpi dependencies of orca
export LD_LIBRARY_PATH=/scratch/openmpi/build/lib:$LD_LIBRARY_PATH
/scratch/USERNAME/orca_4_0_1_2_linux_x86-64_openmpi202/orca run.inp > run.log
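If PATH is set incorrectly, ORCA will still find the system mpirun and fail with the error shown earlier. It may therefore be worth checking which mpirun the shell resolves before the ORCA line in the job script. Below is a minimal sketch of such a check, assuming the wrapper was saved in ~/bin; wrapper_is_active and WRAPPER_DIR are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if "mpirun" resolves to the wrapper in $1.
wrapper_is_active() {
    [ "$(command -v mpirun 2>/dev/null)" = "$1/mpirun" ]
}

# Assumption: the wrapper lives in ~/bin (no trailing slash on the directory).
WRAPPER_DIR="$HOME/bin"
PATH="$WRAPPER_DIR:$PATH"
export PATH

if wrapper_is_active "$WRAPPER_DIR"; then
    echo "OK: mpirun resolves to the wrapper"
else
    echo "WARNING: mpirun resolves to '$(command -v mpirun 2>/dev/null)'" >&2
fi
```

The comparison is a plain string match, so the directory has to be written exactly as it appears in PATH.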