Compute Canada

How do I set up and execute an "Abaqus" parallel batch job?

Please note: The FAQ pages at the HPCVL website are continuously being revised. Some pages might pertain to an older configuration of the system. Please let us know if you encounter problems or inaccuracies, and we will correct the entries.

The Abaqus jobs that you will want to run on the HPCVL machines are likely to be quite large. To take advantage of the parallel structure of a cluster such as ours, Abaqus can execute the solver in a parallel environment, i.e. on several CPUs simultaneously. Recent versions of Abaqus run only on a Linux platform, which restricts the execution of Abaqus jobs to our "SW cluster".

The Abaqus software achieves a certain degree of parallel scaling on both shared-memory and distributed-memory machines. Here is a list of operations with the corresponding parallel modes that Abaqus supports:

Element operations - MPI only 
Iterative solver - MPI or threads
Direct solver - Threads only
Lanczos solver - Threads only

Note that only the shared-memory parallelism is in use on our clusters. It is necessary to decide before a parallel Abaqus run which parallel mode (if any) is to be used (on our clusters, use "threads"), and how many processes are to be started.

Production jobs on the HPCVL Clusters must be submitted via the Grid Engine scheduling software. Since most parallel Abaqus jobs fall into this category, here is an example script for a parallel submission:

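#!/bin/bash
# Run the job script under bash
#$ -S /bin/bash
# Send the job to the Abaqus queue
#$ -q abaqus.q
#$ -l qname=abaqus.q
# Export the current environment to the job and run it in the submission directory
#$ -V
#$ -cwd
# Send email to the given address when the job begins and ends
#$ -m be
#$ -M {email}
# Files that receive standard output and standard error
#$ -o {output}
#$ -e {error}
# Request the shm.pe parallel environment with this many slots
#$ -pe shm.pe {number of procs}
# Put the Abaqus commands on the PATH
export PATH="/opt/abaqus/6.11/Commands":$PATH
export PATH="/bin":$PATH
# Set up the Intel compiler environment
source /opt/ics/bin/compilervars.sh intel64
# Run Abaqus with one thread per allocated slot ($NSLOTS is set by Grid Engine)
abaqus job={name} input={input} scratch=/scratch/{username} cpus=$NSLOTS mp_mode=threads -interactive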
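
The items in braces in the script above are placeholders that have to be replaced before submission. As a minimal sketch of how that substitution could be automated, the following helper reads a template on standard input and writes the filled-in script to standard output. The function name fill_job_template, the choice to derive the output and error file names from the job name, and the file names in the usage comment are assumptions made for illustration only; they are not part of the HPCVL setup.

```shell
#!/bin/bash
# Hypothetical helper (not provided by HPCVL): replace the {placeholders}
# of a job-script template read from stdin and print the result to stdout.
# Arguments: email, job name, input file, number of processes.
# Limitation: the values must not contain the characters '|' or '&',
# which have a special meaning in the sed replacement text.
fill_job_template() {
    local email=$1 name=$2 input=$3 procs=$4
    # Fall back to "id -un" if USER is not set in the environment.
    local user=${USER:-$(id -un)}
    sed -e "s|{email}|${email}|g" \
        -e "s|{name}|${name}|g" \
        -e "s|{input}|${input}|g" \
        -e "s|{number of procs}|${procs}|g" \
        -e "s|{output}|${name}.out|g" \
        -e "s|{error}|${name}.err|g" \
        -e "s|{username}|${user}|g"
}

# Example usage (file names are hypothetical):
#   fill_job_template me@example.com beam beam.inp 8 < abaqus_template.sh > beam_job.sh
```

Everything outside the braces, including $NSLOTS, passes through unchanged, so the generated file has the same structure as the example script.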

Note that Grid Engine allocates all processors on a single node. Processes are not the only resources that need to be allocated when a parallel Abaqus job is submitted. Since the Abaqus license is limited, the following limitations apply for Abaqus production jobs:

  • Up to two Abaqus jobs per user can be executed at any time.
  • A parallel Abaqus job must use no more than 20 processes.
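
The two limits above can be checked before a job is submitted. The following is only a sketch: the function name check_abaqus_limits and the variables MAX_PROCS and MAX_JOBS are hypothetical, and the caller is assumed to supply the number of Abaqus jobs it currently has executing (for example, counted by hand from the qstat listing).

```shell
#!/bin/bash
# Limits taken from the list above; the variable names are illustrative.
MAX_PROCS=20   # a parallel Abaqus job must use no more than 20 processes
MAX_JOBS=2     # up to two Abaqus jobs per user may execute at any time

# Hypothetical pre-submission check.
# $1: number of processes the new job would request
# $2: number of Abaqus jobs the user already has executing
# Returns 0 if the new job respects both limits, 1 otherwise.
check_abaqus_limits() {
    local procs=$1 active_jobs=$2
    if [ "$procs" -gt "$MAX_PROCS" ]; then
        echo "error: $procs processes requested, limit is $MAX_PROCS" >&2
        return 1
    fi
    if [ "$active_jobs" -ge "$MAX_JOBS" ]; then
        echo "error: $active_jobs Abaqus jobs already executing, limit is $MAX_JOBS" >&2
        return 1
    fi
    return 0
}

# Example: 8 processes requested while one job is already executing.
if check_abaqus_limits 8 1; then
    echo "within limits"
fi
```

A check like this only catches mistakes early on the user's side; the limits themselves are still enforced by the system.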

After altering the script by substituting the items enclosed in braces { }, it can be submitted to Grid Engine by

qsub batch_file_name

from sfnode0 (which is the Grid Engine submit host). Note that the job will appear as a parallel job in Grid Engine's qstat or qmon. Note also that submitting a parallel job in this way pays off only for large systems that use many CPU cycles, since the overhead for assigning processes, preparing nodes, and communication between them is considerable.
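
Submission and a later status check can be combined in a small wrapper, sketched below. It assumes that the installed Grid Engine version supports the -terse option of qsub, which prints only the job ID. The function name submit_abaqus_job and the QSUB_CMD override variable are hypothetical and exist only so the wrapper can be tried out without a real scheduler.

```shell
#!/bin/bash
# Hypothetical wrapper: submit a batch file and print the job ID.
# QSUB_CMD is an illustrative override for dry runs; it defaults to qsub.
submit_abaqus_job() {
    local script=$1
    local job_id
    # -terse makes qsub print just the job ID instead of the full message.
    job_id=$("${QSUB_CMD:-qsub}" -terse "$script") || return 1
    echo "$job_id"
}

# Example usage on the submit host (batch_file_name as in the text above):
#   job_id=$(submit_abaqus_job batch_file_name)
#   qstat -j "$job_id"     # detailed information on this job
#   qstat -u "$USER"       # all jobs of the current user
```

Keeping the job ID makes it easy to query a single job with qstat rather than scanning the full listing.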