**Please note: The FAQ pages at the HPCVL website are continuously being revised. Some pages might pertain to an older configuration of the system. Please let us know if you encounter problems or inaccuracies, and we will correct the entries.**

The working principle of MPI is perhaps best illustrated by a programming example. The following program, written in Fortran 90, computes the sum of the square roots of all integers from 0 up to a specified limit *m*:

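    ! Wrap the MPI header in a module so that every routine can "use" it.
    module mpi
      include 'mpif.h'
    end module mpi

    ! Process information shared by all routines.
    module cpuids
      integer :: myid, totps, ierr
    end module cpuids

    program example02
      use mpi
      use cpuids
      call mpiinit              ! start MPI, find rank and process count
      call demo02               ! the actual computation
      call mpi_finalize(ierr)   ! shut MPI down
      stop
    end

    subroutine mpiinit
      use mpi
      use cpuids
      call mpi_init(ierr)
      call mpi_comm_rank(mpi_comm_world, myid, ierr)    ! ID of this process
      call mpi_comm_size(mpi_comm_world, totps, ierr)   ! number of processes
      return
    end

    subroutine demo02
      use mpi
      use cpuids
      integer :: m, i
      real*8 :: s, mys

      ! Only the root process (rank 0) talks to the user.
      if(myid.eq.0) then
        write(*,*) 'how many terms?'
        read(*,*) m
      end if

      ! Send m from the root to all processes.
      call mpi_bcast(m, 1, mpi_integer, 0, mpi_comm_world, ierr)

      ! Each process adds every totps-th term, starting at its own rank.
      mys = 0.0d0
      do i = myid, m, totps
        mys = mys + dsqrt(dfloat(i))
      end do
      write(*,*) 'rank:', myid, 'mys=', mys, ' m:', m

      ! Add the partial sums; the total arrives on the root only.
      s = 0.0d0
      call mpi_reduce(mys, s, 1, mpi_real8, mpi_sum, 0, mpi_comm_world, ierr)
      if(myid.eq.0) then
        write(*,*) 'total sum: ', s
      end if
      return
    end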

Some tasks that every MPI program must perform are collected in the subroutine *mpiinit*. First, we call the routine *mpi_init* to prepare the usage of MPI. This has to be done before any other MPI routine is called. The calls to *mpi_comm_size* and *mpi_comm_rank* then determine how many processes are running and what the unique ID number of the calling process is. Both pieces of information are essential. The results are stored in the variables *totps* and *myid*, respectively. Note that these variables are declared in the module *cpuids*, so that they can be accessed from all routines that "*use*" that module.
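As a rough illustration of these start-up steps in isolation, the following sketch lets every process report its rank and the total number of processes. The program name *hello_ranks* is made up for this illustration, and the header is included directly instead of through a module.

```fortran
! Minimal sketch: every process reports its rank and the process count.
! The program name hello_ranks is hypothetical.
program hello_ranks
  implicit none
  include 'mpif.h'
  integer :: myid, totps, ierr

  call mpi_init(ierr)                               ! must precede all other MPI calls
  call mpi_comm_size(mpi_comm_world, totps, ierr)   ! number of processes
  call mpi_comm_rank(mpi_comm_world, myid, ierr)    ! ID of this process, 0..totps-1
  write(*,*) 'process', myid, 'of', totps
  call mpi_finalize(ierr)                           ! no MPI calls after this point
end program hello_ranks
```

Such a program is typically compiled with an MPI compiler wrapper (often called `mpif90`) and started with a launcher such as `mpirun -np 4`. The exact command names depend on the MPI installation. The lines of output may appear in any order, since the processes run independently.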

The main work in the example is done in the subroutine *demo02*, which also uses the module *cpuids*. The first operation is to determine the upper limit *m* of the sum by requesting input from the user. The *if*-clause "if(myid.eq.0) then" restricts this I/O operation to a single process, the so-called "root process", usually chosen to be the one with *rank* (i.e. unique ID number) zero.

After this initial operation, communication becomes necessary, since only the root process has the right value of *m*. The value is distributed by a call to the MPI collective communication routine *mpi_bcast*, which "broadcasts" the integer *m* from the root to all other processes. The call must be made by *all* processes; once they have done so, all of them know *m*.
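The effect of the broadcast can be seen in a small stand-alone sketch. To keep it non-interactive, the root sets *m* to a fixed, arbitrary value (1000) instead of reading it, and every other process starts with a placeholder of -1. The program name *bcast_demo* and both values are assumptions made only for this illustration.

```fortran
! Sketch of a broadcast: only the root knows m before the call,
! every process knows it afterwards.
program bcast_demo
  implicit none
  include 'mpif.h'
  integer :: myid, ierr, m

  call mpi_init(ierr)
  call mpi_comm_rank(mpi_comm_world, myid, ierr)

  m = -1                      ! placeholder on every process
  if (myid == 0) m = 1000     ! root holds the real value (fixed here, not read)

  ! All processes make the same call; the fourth argument names the root.
  call mpi_bcast(m, 1, mpi_integer, 0, mpi_comm_world, ierr)

  write(*,*) 'rank', myid, 'now has m =', m
  call mpi_finalize(ierr)
end program bcast_demo
```

If the call to *mpi_bcast* were placed inside the root-only *if*-clause, the other processes would never take part in the collective operation and would keep their placeholder value, which is why the call sits outside it.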

The sum over the square roots is then executed on each process in a slightly different manner. Each process adds its terms to a local variable *mys*. The do-loop starts at the rank of the process and uses a stride of *totps* (the number of processes), so each process adds a different set of terms to its local sum and skips all others. For instance, if there are 10 processes, process 0 will add the square roots of 0, 10, 20, 30, ..., while process 7 will add the square roots of 7, 17, 27, 37, ...
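That the strided loops together cover every term exactly once can be checked without MPI. The sketch below simulates the ranks one after another in a single serial program and compares the combined partial sums with a plain loop. The module *strided_sum*, the function *partial_sum* and the values *m* = 100 and *nprocs* = 4 are hypothetical choices for this illustration only.

```fortran
! Serial sketch (no MPI): simulate the strided decomposition rank by rank.
module strided_sum
  implicit none
contains
  ! Partial sum that process "rank" out of "nprocs" would compute for limit m.
  function partial_sum(rank, nprocs, m) result(mys)
    integer, intent(in) :: rank, nprocs, m
    double precision :: mys
    integer :: i
    mys = 0.0d0
    do i = rank, m, nprocs
      mys = mys + sqrt(dble(i))
    end do
  end function partial_sum
end module strided_sum

program check_decomposition
  use strided_sum
  implicit none
  integer, parameter :: m = 100, nprocs = 4   ! illustrative values
  integer :: rank, i
  double precision :: total, serial

  ! Add up the partial sums of all simulated ranks.
  total = 0.0d0
  do rank = 0, nprocs - 1
    total = total + partial_sum(rank, nprocs, m)
  end do

  ! Reference: the same sum computed by a single loop.
  serial = 0.0d0
  do i = 0, m
    serial = serial + sqrt(dble(i))
  end do

  write(*,*) 'sum over simulated ranks:', total
  write(*,*) 'single-loop sum:         ', serial
end program check_decomposition
```

The two printed values should agree up to rounding. They need not be identical in the last digits, because the terms are added in a different order.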

After the sums have been completed, further communication is necessary, since each process has computed only a partial, local sum. We need to collect these local sums into one total, and we do so by calling *mpi_reduce*. The effect of this call is to "reduce" a value local to each process to a variable that is local to only one process, usually the *root process*. The values can be combined in various ways; in our case we choose to sum them by specifying *mpi_sum* in the call. Afterwards, the total sum resides in the variable *s* on the root process, which prints it.
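To illustrate that the reduction operation is a free choice, the following sketch reduces the same local values twice, once with *mpi_sum* and once with *mpi_max*. The local value (rank plus one) and the program name *reduce_demo* are assumptions made purely for this illustration.

```fortran
! Sketch: the same local data reduced with two different operations.
program reduce_demo
  implicit none
  include 'mpif.h'
  integer :: myid, totps, ierr
  double precision :: myval, total, largest

  call mpi_init(ierr)
  call mpi_comm_rank(mpi_comm_world, myid, ierr)
  call mpi_comm_size(mpi_comm_world, totps, ierr)

  myval = dble(myid + 1)      ! illustrative local value: 1, 2, ..., totps

  ! Both results are delivered to rank 0 only.
  call mpi_reduce(myval, total, 1, mpi_double_precision, mpi_sum, 0, mpi_comm_world, ierr)
  call mpi_reduce(myval, largest, 1, mpi_double_precision, mpi_max, 0, mpi_comm_world, ierr)

  if (myid == 0) then
    write(*,*) 'sum of local values:', total      ! equals totps*(totps+1)/2
    write(*,*) 'largest local value:', largest    ! equals totps
  end if
  call mpi_finalize(ierr)
end program reduce_demo
```

On all processes other than the root, the receive variables are not filled in and should not be used. If every process needs the result, *mpi_allreduce* performs the same reduction but takes no root argument and delivers the result everywhere.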

The last operation in our example is to finalize MPI usage by a call to *mpi_finalize*, which is necessary for proper program completion.

In this simple example, we have distributed the tasks of computing many square roots among processes, each of which only did a part of the work. We used communication to exchange information about the tasks that needed to be performed, and to collect results. This mode of programming is called "task parallel". Often it is necessary to distribute large amounts of data among processes as well, leading to "data parallel" programs. Of course, the distinction is not always clear.

- © HPCVL 2014
- Last updated on Friday November 25, 2011 at 12:04 pm