Basic Usage

Compile and run programs

Compiling OpenMP code for OmpCluster requires using the specific OpenMP offloading target x86_64-pc-linux-gnu, which tells the compiler to compile the OpenMP target regions for a device. For example, the mat-mul example can be compiled using the following command:

clang -fopenmp -fopenmp-targets=x86_64-pc-linux-gnu mat-mul.cpp -o mat-mul
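
For reference, here is a minimal sketch of what an offloaded region in a program like mat-mul.cpp could look like. It is illustrative only: the naive multiplication and the variable names are assumptions, not the actual mat-mul source.

// Illustrative sketch only: a naive matrix multiplication offloaded
// with an OpenMP target region (not the actual mat-mul.cpp source).
#include <vector>

int main() {
  const int N = 256;
  std::vector<double> a(N * N, 1.0), b(N * N, 2.0), c(N * N, 0.0);
  double *pa = a.data(), *pb = b.data(), *pc = c.data();

  // The code inside this region is compiled for the offloading target
  // (x86_64-pc-linux-gnu) and executed by the runtime on a worker.
  #pragma omp target map(to: pa[0:N*N], pb[0:N*N]) map(tofrom: pc[0:N*N])
  for (int i = 0; i < N; ++i)
    for (int j = 0; j < N; ++j)
      for (int k = 0; k < N; ++k)
        pc[i * N + j] += pa[i * N + k] * pb[k * N + j];

  return 0;
}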

You can then run the newly created program. However, unlike classical OpenMP programs, it must be executed with the mpirun or mpiexec tools, just like any other MPI program, in order to use the OmpCluster distributed task runtime:

mpirun -np 3 ./mat-mul

In this case, the runtime will automatically create 3 MPI processes (one head and two workers): the head process offloads the OpenMP target regions to the workers, which execute them according to the currently implemented scheduling strategy.
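
The number of workers follows directly from the number of MPI processes you request: one process acts as the head and the remaining ones as workers. For example, to run the same program with four workers:

# 1 head process + 4 workers
mpirun -np 5 ./mat-mul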

The runtime also supports offloading to remote MPI processes (located on other computers or containers), which can be configured using the -host or -hostfile flags of mpirun (note that these flags may differ between MPI implementations). However, as with any MPI program, the user needs to copy the binary to all computers/containers before executing it (using the pdcp command or an NFS directory, for example).
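
For instance, with Open MPI a run across three machines could look like the following; the hostnames are placeholders for your own machines, and the exact flag may differ with other MPI implementations:

# hostfile: one line per machine, each providing one MPI slot
node01 slots=1
node02 slots=1
node03 slots=1

mpirun -np 3 -hostfile ./hostfile ./mat-mul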

As you might have noticed, it is somewhat hard to follow what the OmpCluster runtime is doing while executing the application. You can enable informational messages from the runtime by setting the LIBOMPTARGET_INFO environment variable when running the program:

LIBOMPTARGET_INFO=-1 mpirun -np 3 ./mat-mul

Container

To easily experiment with the OmpCluster tools, we provide a set of Docker images containing a pre-compiled Clang/LLVM with all the OpenMP and MPI libraries necessary to compile and run any OmpCluster program.

All our images are based on Ubuntu 20.04. However, different configurations are available with different versions of CUDA and either MPICH or OpenMPI. Choose the Docker image tag according to your preferred configuration, or latest to use the default configuration.

You can execute your applications with the OmpCluster runtime on any computer that has Docker installed, using the following commands:

docker run -v /path/to/my_program/:/root/my_program -it ompcluster/runtime:latest /bin/bash
cd /root/my_program/

The -v flag is used to share a folder between the host operating system and the container. You can find more information on how to use Docker in the official Get Started guide.
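
Inside the container, you can then compile and run your application exactly as described above, for example (assuming the shared folder contains a source file named my_program.cpp):

# Compile and run inside the container
clang -fopenmp -fopenmp-targets=x86_64-pc-linux-gnu my_program.cpp -o my_program
mpirun -np 3 ./my_program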

You can also use Singularity, for example with the following commands:

singularity pull docker://ompcluster/runtime:latest
singularity shell ./runtime_latest.sif
cd /path/to/my_program/

See here for more information about Singularity. Note that some cluster environments may provide the newer version of Singularity, which is called Apptainer.
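
If your cluster provides Apptainer instead of Singularity, the equivalent commands should look like the following (the apptainer CLI mirrors the singularity one):

apptainer pull docker://ompcluster/runtime:latest
apptainer shell ./runtime_latest.sif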

Cluster job manager execution

The OmpCluster runtime can be used with a cluster job manager such as Slurm. After compiling your code using a container, you can launch the job like any MPI program. For example, using Slurm:

srun -N 3 --mpi=pmi2 singularity exec ./runtime_latest.sif ./my_program

Every job manager supports many configurations; refer, for example, to the Slurm documentation for the available options.
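
As an illustration, a minimal Slurm batch script equivalent to the srun command above could look like this; the job name and resource requests are placeholders to adapt to your cluster:

#!/bin/bash
#SBATCH --job-name=my_program
#SBATCH --nodes=3
#SBATCH --ntasks-per-node=1

# Launch 3 MPI processes (1 head + 2 workers) inside the container
srun --mpi=pmi2 singularity exec ./runtime_latest.sif ./my_program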

Existing images and configurations

The container images that we provide follow the naming convention ompcluster/<image_name>:<tag>.

Several images are available on our Docker Hub repository; the main ones are listed below:

  • hpcbase: the base image for all other containers. It contains the MPI implementation, CUDA, the Mellanox drivers, etc.

  • runtime: this image contains the pre-built Clang and OmpCluster runtime based on the stable releases.

  • runtime-dev: this image contains the pre-built Clang and OmpCluster runtime built from the Git repository. This version should be considered unstable and should not be used in production.

  • Application-specific images (awave-dev, beso-dev, plasma-dev, etc.): these images are based on the runtime image but contain additional libraries and tools required to develop specific applications.
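
To fetch one of these images explicitly, pull it by name and tag, for example (assuming the latest tag exists for the image you pick):

docker pull ompcluster/runtime:latest
docker pull ompcluster/runtime-dev:latest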