The nodes of computer clusters such as our Victoria-Falls cluster are often multi-core, and each core may support multiple threads. This suggests a two-layer approach when writing new codes or adapting existing ones: to exploit the distributed-memory structure of the cluster while keeping each shared-memory node busy, a combined MPI-OpenMP programming approach may be taken.
Here we present sample code implementing two-layered master-slave models that use MPI to communicate between a master node and multiple slave nodes, while additionally allocating work dynamically within each node using OpenMP. The current version of the code is written in C. In this example, the workload consists of the prime factorization of integers.
In all of these models, the master MPI process M bundles basic jobs into job groups and sends them to the slave processes S for execution. When a slave process runs out of work, it sends a message to M to request more. Each slave process additionally spawns a number of OpenMP threads to help work through the job groups. This is done dynamically: slave threads work on one job from the current job group at a time and acquire another when they are done. When the group is exhausted, one of the threads asks the master for another group via MPI. Six different model variations were implemented.
Note that the MPI communication in these models has to be protected by a critical region to prevent a race condition on shared communication resources and messages. Therefore, any communication event takes place between one thread on the master and one thread on the slave, albeit a potentially different one each time. MPI communication between different threads of the same process does not occur.

The sample code can be downloaded by clicking here. It includes the source code, an input file, and a script for running it on our systems; alterations to the script may be necessary. The code works through all six model variations and reports success or failure for each. We make it available to our users (and anyone else) under the condition that use of the code, or parts of it, is acknowledged. We hope this code provides a framework you can adapt for your own programs, particularly if you plan to deploy them on a cluster with multi-core, multi-threaded nodes.