
What are the Parallel Environments available under HPCVL Grid Engine?

Please note: The FAQ pages at the HPCVL website are continuously being revised. Some pages might pertain to an older configuration of the system. Please let us know if you encounter problems or inaccuracies, and we will correct the entries.

A Parallel Environment is a programming environment designed for parallel computing in a network of computers, which allows execution of shared-memory and distributed-memory parallel applications. The most commonly used parallel environments are the Message Passing Interface (MPI) for distributed-memory machines and OpenMP for shared-memory machines.

For MPI, there is an implementation called HPC ClusterTools. It is located in the /opt/SUNWhpc directory (see the HPCVL Parallel Programming FAQ for more details).

For OpenMP, no separate runtime environment is required. Details about shared-memory programming and multi-threading with OpenMP may be found in the HPCVL Parallel Programming FAQ.

Grid Engine provides an interface for handling parallel jobs that run on top of these parallel environments. For the users' convenience, HPCVL has predefined the following parallel environment interfaces:

  • dist.pe: This environment is intended for distributed-memory applications using the Sun HPC ClusterTools libraries, in particular MPI. Grid Engine will assign dist.pe jobs to the production.q queue and try to use the fastest connection available between the slots and nodes. Although the system will try to allocate processes on as few nodes as possible, it is allowed to spread them out over the cluster, since this parallel environment is meant to handle distributed-memory jobs.
  • shm.pe: This environment is intended for shared-memory applications. Grid Engine will assign the processors in a single node to take advantage of the fastest connection available between the slots. shm.pe jobs are submitted to the production.q queue, i.e. to nodes m9k000[1-8]. It is permissible to use shm.pe for distributed-memory (e.g. MPI) jobs, if the intention is to keep them within a single node. Note that this might speed up communication, but also lead to longer waiting periods.
  • vfdist.pe: This environment serves the same purpose as dist.pe, but is designed for the Victoria Falls cluster. It restricts the scheduling of processes to a 40-node sub-cluster that is internally connected through 10 Gigabit Ethernet.
  • abaqus.pe, fluent.pe, matlab.pe: These are specialized environments used for parallel runs of the application software packages Abaqus, Fluent, and Matlab, respectively. These applications need their own parallel environments to keep track of available licenses and to run auxiliary commands.
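
To illustrate how one of these parallel environments might be requested, here is a minimal sketch of a shell function that prints a Grid Engine job script. It rests on several assumptions that are not stated in this FAQ: the function name make_job_script and the program names are hypothetical placeholders, the MPI launcher is assumed to be an mpirun command that is on your path, and the slot counts are arbitrary. The "#$ -pe" line and the NSLOTS variable are standard Grid Engine features. The application-specific environments (abaqus.pe, fluent.pe, matlab.pe) are deliberately left out, because they need their own setup.

```shell
#!/bin/bash
# Sketch only: print a Grid Engine job script for a given parallel environment.
# Usage: make_job_script PE SLOTS PROGRAM
# PROGRAM is a placeholder for your own executable (hypothetical name).
make_job_script() {
    local pe="$1"
    local slots="$2"
    local program="$3"
    local launch

    case "$pe" in
        shm.pe)
            # Shared memory: one OpenMP thread per slot granted by Grid Engine.
            launch="export OMP_NUM_THREADS=\$NSLOTS
$program"
            ;;
        dist.pe|vfdist.pe)
            # Distributed memory: one MPI process per granted slot.
            # Assumes an mpirun launcher is available on the path.
            launch="mpirun -np \$NSLOTS $program"
            ;;
        *)
            echo "unsupported parallel environment: $pe" >&2
            return 1
            ;;
    esac

    # Lines starting with "#$" are read by qsub as embedded options:
    # -S selects the shell, -cwd runs the job in the submission directory,
    # -V exports the current environment, -pe requests PE and slot count.
    cat <<EOF
#!/bin/bash
#\$ -S /bin/bash
#\$ -cwd
#\$ -V
#\$ -pe $pe $slots
$launch
EOF
}

# Example: an 8-thread OpenMP job kept within a single node.
make_job_script shm.pe 8 ./my_openmp_program
```

The printed script could be saved to a file and submitted with qsub, for example "make_job_script dist.pe 16 ./my_mpi_program > job.sh" followed by "qsub job.sh". Inside the running job, Grid Engine sets NSLOTS to the number of slots it actually granted, which is why the script uses that variable rather than repeating the number.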