Compute Canada

How is OpenMP used?

Please note: The FAQ pages at the HPCVL website are continuously being revised. Some pages might pertain to an older configuration of the system. Please let us know if you encounter problems or inaccuracies, and we will correct the entries.

OpenMP is usually used for the stepwise parallelization of pre-existing serial programs. Shared-memory parallelism is often called "loop parallelism" because loops are the typical situation in which OpenMP compiler directives are an option.

The OpenMP compiler directives are inserted into the serial code by the user. They instruct the compiler to distribute the tasks performed in a certain region of the code (usually a loop) over several sub-processes, which in turn may be executing on different CPUs.

For instance, in the following Fortran loop it looks as if the repeated calls to the function point() could be done in separate processes, or better, on separate CPUs:

do imesh=inz,nnn,nstep
   svec(1)=xmesh(imesh)
   svec(2)=ymesh(imesh)
   svec(3)=zmesh(imesh)
   integral=integral+wints(imesh)*point(svec)
end do

If we are using a compiler that is able to parallelize code automatically, and try to use that feature, we will find that things are not that simple. The function call to point may hide a "loop dependency", i.e. a situation where data computed in one loop iteration depend on data calculated in another. The compiler will therefore commonly refuse to parallelize such a loop, treating it as "unsafe".
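To illustrate what such a hidden dependency can look like, here is a minimal sketch. The module hidden_state, the variable previous, and this particular body of point are hypothetical and are not taken from the original program; they only show how a function can carry data from one call to the next without the calling loop revealing it.

```fortran
! Hypothetical example: a version of point() that remembers its last result.
module hidden_state
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   ! Module variable: keeps its value between calls to point().
   real(dp) :: previous = 0.0_dp
contains
   function point(s) result(f)
      real(dp), intent(in) :: s(3)
      real(dp) :: f
      ! The result depends on the previous call, not only on the argument.
      f = s(1) + s(2) + s(3) + previous
      previous = f
   end function point
end module hidden_state

program show_dependency
   use hidden_state
   implicit none
   real(dp) :: svec(3), f
   integer :: i

   svec = 1.0_dp
   ! Same argument every time, yet each call returns a different value.
   do i = 1, 3
      f = point(svec)
      print *, 'call', i, 'returns', f
   end do
end program show_dependency
```

Because the three calls receive the same argument but return different values, the order of the calls matters. A compiler that only sees the loop cannot rule out this kind of behaviour, which is why it tends to be cautious.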

The use of OpenMP directives can solve this problem:

!$omp parallel do private (imesh,svec) &
!$omp shared (inz,nnn,nstep,xmesh,ymesh,zmesh,wints) &
!$omp reduction(+:integral)
do imesh=inz,nnn,nstep
   svec(1)=xmesh(imesh)
   svec(2)=ymesh(imesh)
   svec(3)=zmesh(imesh)
   integral=integral+wints(imesh)*point(svec)
end do
!$omp end parallel do

The three lines of directives have the effect of forcing the compiler to distribute the tasks performed in each of the loop iterations over separate, dynamically created processes. Furthermore, they inform the compiler which variables can be used by all sub-processes (i.e., shared), and which have different values for each process (i.e., private). Finally, they direct the compiler to collect the values of integral separately in each process and then "reduce" them to a common value by summing them up.
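The directives described above can be tried out in a complete program. The following is only a sketch under stated assumptions: the mesh size, the array contents, the weights, and the body of point are hypothetical placeholders chosen so that the result is easy to check, and they do not come from the original application.

```fortran
! Minimal sketch: mesh data and the integrand in point() are hypothetical.
program mesh_integral
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   integer :: imesh, inz, nnn, nstep
   real(dp), allocatable :: xmesh(:), ymesh(:), zmesh(:), wints(:)
   real(dp) :: svec(3), integral

   inz = 1
   nnn = 1000        ! assumed number of mesh points
   nstep = 1
   allocate(xmesh(nnn), ymesh(nnn), zmesh(nnn), wints(nnn))

   ! Simple placeholder data: all weights are equal and add up to one.
   xmesh = 1.0_dp
   ymesh = 2.0_dp
   zmesh = 3.0_dp
   wints = 1.0_dp / real(nnn, dp)

   integral = 0.0_dp
   !$omp parallel do private(imesh, svec) &
   !$omp shared(inz, nnn, nstep, xmesh, ymesh, zmesh, wints) &
   !$omp reduction(+:integral)
   do imesh = inz, nnn, nstep
      svec(1) = xmesh(imesh)
      svec(2) = ymesh(imesh)
      svec(3) = zmesh(imesh)
      integral = integral + wints(imesh) * point(svec)
   end do
   !$omp end parallel do

   print *, 'integral = ', integral

contains

   ! Placeholder integrand: depends only on its argument, so calls are independent.
   pure function point(s) result(f)
      real(dp), intent(in) :: s(3)
      real(dp) :: f
      f = s(1) + s(2) + s(3)
   end function point

end program mesh_integral
```

With these placeholder values the printed result should be close to 6, up to rounding, whether the program runs serially or with several threads. With GNU Fortran, for example, the directives are activated by compiling with the -fopenmp option; without it they are treated as ordinary comments and the loop runs serially.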

OpenMP programs need to be compiled with special compiler options and will then yield parallel code. It must be pointed out that since the compiler is forced to multi-thread specific regions of the code, it is the responsibility of the programmer to ensure that such multi-threading is safe, i.e. that no dependency exists between iterations of the parallelized loop. In the above example, that means that the tasks performed inside the point call must indeed be independent.
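For contrast, here is a small sketch of a loop that must not be parallelized this way. The arrays a and b and the program name are hypothetical; the point is only that each iteration reads a value written by the iteration before it.

```fortran
! Hypothetical example of a loop-carried dependency (a running total).
program recurrence
   implicit none
   integer, parameter :: n = 10
   integer :: i
   real :: a(n), b(n)

   b = 1.0
   a(1) = b(1)
   ! Iteration i reads a(i-1), which iteration i-1 has just written.
   ! An "!$omp parallel do" in front of this loop would still compile,
   ! but a thread could then read a(i-1) before it has been updated.
   do i = 2, n
      a(i) = a(i-1) + b(i)
   end do

   print *, 'a(n) = ', a(n)
end program recurrence
```

Run serially, the loop builds a running total of b. The compiler would not stop a programmer from adding the directive here, which is why the check for independence has to be made by hand.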