OMPC PLASMA

This library is an extension of the PLASMA library for distributed memory systems.

Building

To use OMPC PLASMA, we provide a Docker image, ompcluster/plasma-dev:latest, containing a pre-compiled Clang/LLVM with all the OpenMP and MPI libraries needed to compile and run OMPC PLASMA.

You can run OMPC PLASMA on any computer that has Docker or Singularity installed.
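As a minimal sketch of fetching the image, the helpers below pull it with whichever container tool is installed. The function names `sif_name` and `pull_image` are hypothetical and not part of OMPC PLASMA. The sketch assumes Singularity's default naming for `docker://` references (image name, underscore, tag, `.sif`), which is how `ompcluster/plasma-dev:latest` would become the `plasma-dev_latest.sif` file used in the SLURM example.

```shell
#!/bin/bash
# Image reference from the Building section.
IMAGE="ompcluster/plasma-dev:latest"

# Print the file name that `singularity pull` is assumed to pick by default
# for a docker:// reference: last path component, ':' replaced by '_'.
sif_name() {
    local ref="${1##*/}"                        # drop any registry/namespace prefix
    [[ "$ref" == *:* ]] || ref="${ref}:latest"  # untagged references default to "latest"
    echo "${ref/:/_}.sif"
}

# Hypothetical helper: fetch the image with whichever tool is available.
pull_image() {
    if command -v singularity >/dev/null 2>&1; then
        singularity pull "docker://${IMAGE}"
    elif command -v docker >/dev/null 2>&1; then
        docker pull "${IMAGE}"
    else
        echo "error: neither singularity nor docker was found" >&2
        return 1
    fi
}
```

Calling `pull_image` performs the download, and `sif_name "$IMAGE"` prints the name of the file to pass to `singularity exec`.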

To build OMPC PLASMA, run the following commands:

git clone https://gitlab.com/ompcluster/plasma.git
cd plasma/
mkdir build
cd build
export CC=clang 
export CXX=clang++
export OpenBLAS_ROOT=/usr/local/include/openblas/
cmake ..
make -j$(nproc)

Usage

OMPC PLASMA is run through the plasmatest executable, which takes command-line parameters. To list the available parameters, run:

./plasmatest --help

In general, OMPC PLASMA should be executed with the following parameters:

./plasmatest routine --dim=$dim --nrhs=$dim --nb=$nb --test=$test 

These parameters represent:

  • routine: The linear algebra routine to run. OMPC PLASMA currently supports four routines: spotrf, sgemm, ssyrk and strsm.

  • $dim: The matrix size.

  • $nb: The block size. This number should be a divisor of $dim.

  • $test(y|n): Determines whether the results are verified.

Other parameters are available, depending on the routine being executed.
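Because the block size should divide the matrix size, it can help to check the values before launching a run. The following is a minimal sketch, not part of OMPC PLASMA: `valid_block_size` and `plasmatest_cmd` are hypothetical helper names, and the wrapper only prints the command line shown above instead of executing it.

```shell
#!/bin/bash
# Succeed only when both values are positive integers and nb divides dim.
valid_block_size() {
    local dim="$1" nb="$2"
    [[ "$dim" =~ ^[1-9][0-9]*$ && "$nb" =~ ^[1-9][0-9]*$ ]] || return 1
    (( dim % nb == 0 ))
}

# Hypothetical wrapper: print the plasmatest command line for a routine,
# or fail when the block size is not a divisor of the matrix size.
plasmatest_cmd() {
    local routine="$1" dim="$2" nb="$3" test="${4:-y}"
    if ! valid_block_size "$dim" "$nb"; then
        echo "error: nb=$nb is not a divisor of dim=$dim" >&2
        return 1
    fi
    echo "./plasmatest $routine --dim=$dim --nrhs=$dim --nb=$nb --test=$test"
}
```

For instance, `plasmatest_cmd spotrf 1024 256 y` prints a command with `--dim=1024 --nb=256`, while `plasmatest_cmd spotrf 1000 256` reports an error because 256 does not divide 1000.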

Example

Here is an example of how to run OMPC PLASMA on a cluster using SLURM:

#!/bin/bash
#SBATCH --job-name=plasma-job
#SBATCH --output=plasma-output.txt
#SBATCH --nodes 3

module purge
module load mpich/4.0.2-ucx

##### OMPC settings
export OMPCLUSTER_NUM_EXEC_EVENT_HANDLERS=4
export LIBOMP_NUM_HIDDEN_HELPER_THREADS=8
export OMPCLUSTER_HEFT_COMM_COEF=0.00000000008
export OMPCLUSTER_HEFT_COMP_COST=20000000000

##### OpenMP settings
export OMP_NUM_THREADS=4
export OPENBLAS_NUM_THREADS=1 

srun --mpi=pmi2 -n 3 singularity exec plasma-dev_latest.sif plasma/build/plasmatest spotrf --dim=1024 --nrhs=1024 --nb=256 --test=y

The OMPC settings depend on how the program is executed on the cluster. In the example above, 3 nodes are allocated and OMPC PLASMA runs on 2 worker nodes, each using 4 threads.
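The relation between allocated nodes and worker nodes can be sketched as a small helper. This assumes, as the example suggests (3 nodes, 2 workers), that one MPI process is not counted as a worker; `worker_count` is a hypothetical name, not an OMPC PLASMA command.

```shell
#!/bin/bash
# Print the number of worker processes for a given number of MPI processes,
# assuming one process is not a worker (3 -> 2, as in the example above).
worker_count() {
    local nprocs="$1"
    if ! [[ "$nprocs" =~ ^[1-9][0-9]*$ ]]; then
        echo "error: process count must be a positive integer" >&2
        return 1
    fi
    echo $(( nprocs - 1 ))
}
```

Inside a batch script, `worker_count "$SLURM_NNODES"` would report the worker count for the current allocation when one process is launched per node, since SLURM sets `SLURM_NNODES` to the number of allocated nodes.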