Please note: The FAQ pages at the HPCVL website are continuously being revised. Some pages might pertain to an older configuration of the system. Please let us know if you encounter problems or inaccuracies, and we will correct the entries.
This is a short FAQ on using the Finite-Element Analysis (FEA) code "Abaqus" on HPCVL machines. This software is only licensed for academic researchers who work at a university that is already covered by an Abaqus license. The software is only made available to persons who belong to a specific Unix group. See details below.
The ABAQUS suite of software for finite element analysis (FEA) can handle a wide variety of simulations. The suite consists of three core products: ABAQUS/Standard, ABAQUS/Explicit and ABAQUS/CAE.
ABAQUS/Standard is designed to solve traditional implicit finite element analyses, such as static, dynamic, and thermal analyses. It is equipped with a wide range of contact and nonlinear material options. ABAQUS/Standard also has optional add-on and interface products, as well as integration with third-party software.
ABAQUS/Explicit is focused on transient dynamics and quasi-static analyses. It uses an explicit approach that is appropriate in many applications, such as drop tests, crushing, and many manufacturing processes.
ABAQUS/CAE provides a modeling and visualization environment for ABAQUS analysis products. It offers access to CAD models, advanced meshing and visualization, and an exclusive view towards ABAQUS analysis products. ABAQUS/CAE is used mainly for pre- and post-processing. Note that this part of the ABAQUS software does not run on HPCVL machines, but on a user's client machine.
The present version of Abaqus is 6.5. The programs in the Abaqus package are located in the directory /opt/abaqus-6.5
Note that only the ABAQUS/Standard and ABAQUS/Explicit packages are installed on the HPCVL clusters. This is because these components are the "number crunching" elements of Abaqus, whereas the ABAQUS/CAE component is used interactively for pre- and post-processing. The latter runs only on PC systems under Windows or Linux.
It is a good idea to include the directory /opt/abaqus-6.5/Commands in your PATH or set an environment variable before using Abaqus:
setenv ABAQUS_HOME /opt/abaqus-6.5/Commands (for csh)
export ABAQUS_HOME=/opt/abaqus-6.5/Commands (for bash)
If the directory is included in the PATH, the software is called by simply invoking "abaqus"; if the environment variable is set, the call is "$ABAQUS_HOME/abaqus".
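As a minimal sketch of the PATH option for bash users, the lines below could go into .bash_profile. The variable name ABAQUS_CMDS and the guard against duplicate entries are illustrative additions, not part of the HPCVL setup.

```shell
# Directory holding the Abaqus launcher, as given above.
ABAQUS_CMDS=/opt/abaqus-6.5/Commands

# Append it to PATH only if it is not already present.
case ":$PATH:" in
    *":$ABAQUS_CMDS:"*) ;;
    *) PATH="$PATH:$ABAQUS_CMDS" ;;
esac
export PATH
```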
Alternatively, you can take advantage of the usepackage facility on our cluster, and simply type the corresponding "usepackage" setup line from the command prompt or include that line in your .login or .bash_profile setup file.
To use Abaqus on the HPCVL machines, you have to be covered by an academic Abaqus license outside of HPCVL, i.e. you have to be a "licensed University User of Abaqus". It is furthermore required that you read our licensing agreement and sign a statement. Note that our license does count as a license for Queen's University. We will confirm your statement, and you will then be made a member of the Unix group "abaqus", which enables you to run the software. Contact us if you are in doubt about whether you will be able to run Abaqus on our system. We will also submit your name and affiliation to Abaqus Inc. to check whether a prior university license exists.
If you need to use the pre- and post-processing software ABAQUS/CAE, you have two options. You can either use your local university license to install and use the software on your PC, or contact us so that we can make our license available to you on a single machine with a fixed IP address only. Note that the number of simultaneous CAE sessions supported by our license is presently limited to 3 (three).
Our Abaqus license is "seat limited" and "process limited". The licensing scheme utilizes so-called "tokens". At present, there are 150 tokens available. A single-process run of Abaqus (Standard or Explicit) uses 5 tokens; multi-process runs use more, according to the formula:
Tokens = Int (5 * Processes^0.422)
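To illustrate the formula, here is a small shell sketch that evaluates it with awk. The function name abaqus_tokens is hypothetical and only serves this example, and Int is taken to mean truncation to an integer.

```shell
# Number of license tokens needed for a run with $1 processes,
# following Tokens = Int(5 * Processes^0.422).
abaqus_tokens() {
    awk -v n="$1" 'BEGIN { printf "%d\n", int(5 * n ^ 0.422) }'
}

# Print the token cost for a few typical process counts.
for n in 1 2 4 8 16; do
    echo "$n processes: $(abaqus_tokens "$n") tokens"
done
```

By this formula a 4-process run needs 8 tokens and a 16-process run needs 16, so the cost per process drops as the process count grows.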
To check how many license tokens are available, you can use the following command:
/opt/abaqus-6.5/License/lmstat -a -c /opt/abaqus-6.5/License/license.dat
which will tell you how many of the 150 tokens are presently in use.
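If you only want the number of free tokens, the lmstat output can be filtered. The sketch below makes two assumptions that you should check against the real output on the cluster: that the license feature is called "abaqus", and that lmstat prints a summary line in the layout shown in the comment. The function name tokens_free is hypothetical.

```shell
# Print the number of free tokens for one license feature.
# Reads lmstat output on standard input; $1 is the feature name.
# Assumed summary line layout (verify against your actual lmstat output):
#   Users of abaqus:  (Total of 150 licenses issued;  Total of 35 licenses in use)
tokens_free() {
    awk -v feature="$1" '
        $1 == "Users" && $3 == (feature ":") {
            issued = $6     # number after the first "Total of"
            in_use = $11    # number after the second "Total of"
            print issued - in_use
        }'
}

# Example with a made-up summary line instead of a live license server:
echo "Users of abaqus:  (Total of 150 licenses issued;  Total of 35 licenses in use)" |
    tokens_free abaqus
```

On the cluster, the same function could be fed by piping the lmstat command shown above into it.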
The following instructions assume that you are a member of the Unix group "abaqus". They pertain only to the Standard and Explicit components of the software. The instructions in this section are only useful if you want to run a test job of Abaqus on the login node sfnode0. If you want to run a production job, please refer to the instructions on how to start an Abaqus batch job (see next section).
The Abaqus program uses a sophisticated syntax to set up a job run. Instructions to the program are written into an input file which is specified when the program is invoked. While an input file can be written "from scratch", it is also possible to use the ABAQUS/CAE component to generate such a file. Both techniques are outside the scope of this FAQ. A simple example input file is also available. Documentation for Abaqus is extensive, and available both electronically and in print. There is no substitute for consulting it.
Assuming that we have an input file called testsys.inp, we can initiate a run (using the environment variable ABAQUS_HOME):
$ABAQUS_HOME/abaqus job=test001 inp=testsys.inp scratch=/scratch/hpcXXXX
The job= option specifies what the output files are to be called. They have various "filename extensions" but share the name specified here (in our case test001). With the inp= option, we specify which input file to use. There are more options, such as cpus= and mp_mode= for running parallel jobs, but the two used above should get a simple serial job running.
Note that the above sequence starts the job in the background, i.e. after an initial setup phase, your terminal returns although the job is still running. If you want to avoid this, you can include the interactive option in the command line.
Note that by default the Abaqus software uses a directory in /tmp (which is local to the nodes on which the software is executing) as scratch space. This default setting causes some Abaqus jobs to fail. It must therefore be changed to the standard scratch space /scratch/hpcXXXX (XXXX being the digits in your userid). This can be done in one of two ways: by passing the scratch= option on the command line, as in the example above, or by setting the scratch directory in the abaqus_v6.env file in your home directory.
The second option is probably preferable. Note that the scratch directory has to be created manually, for instance for user hpc1005, type:
mkdir /scratch/hpc1005
chmod 700 /scratch/hpc1005
Also, do not forget to occasionally check the contents of this scratch directory by typing (sticking with the hpc1005 example):
ls -lt /scratch/hpc1005
and removing any files that might be left over from old Abaqus runs. This is necessary because Abaqus will not remove these files if a job was terminated before it ran to completion.
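The housekeeping described above can be wrapped into two small helpers. This is only a sketch: the function names make_scratch and stale_scratch_files are hypothetical, and the 7-day default is an arbitrary choice for illustration. The second helper only lists candidate files and leaves the decision to delete them to you.

```shell
# Create a private scratch directory if it does not exist yet.
make_scratch() {
    mkdir -p "$1" && chmod 700 "$1"
}

# List files under the scratch directory that have not been modified
# for more than $2 days (default 7), as candidates for manual removal.
stale_scratch_files() {
    find "$1" -type f -mtime +"${2:-7}" -print
}

# Example for user hpc1005 (run on the cluster):
#   make_scratch /scratch/hpc1005
#   stale_scratch_files /scratch/hpc1005 14
```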
The abaqus_v6.env file in the home directory can also be used to "fix" a memory problem that sometimes arises when large jobs are run. If the .msg file of an Abaqus run shows errors because of not enough "Standard Memory", you can raise the limit by adding the corresponding memory setting to that file, with the value given in quotes, for instance to reset it to 512 MB. The default is 256 MB.
More about changing the Abaqus environment may be learned from the "Installation and Licensing Guide" (chapter 4) of the Abaqus documentation. Please contact us if you need assistance.
In most cases, you will run Abaqus on the HPCVL machines in batch mode. Since you have to have access to Abaqus outside of the HPCVL license, most interactive work can be done elsewhere, whereas the computationally intensive runs can be executed on the cluster.
Production jobs are submitted on the systems via the GridEngine, which is a load-balancing software. To obtain details, read our Gridengine FAQ. For an Abaqus batch job, this means that rather than issuing the above commands directly, you wrap them into a GridEngine batch script. This script needs to be altered by replacing all the placeholder items by the right values. The interactive option is necessary; without it the program will not start properly. The script is then submitted to the GridEngine with the qsub command.
Note that Abaqus needs to be set up correctly before submitting this script, as it inherits the settings of the submitting shell.
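For orientation, a serial batch script might look like the one below. This is only a sketch and not the official HPCVL template: the file name abaqus_serial.sh, the job and input names, and the choice of Grid Engine options (-S, -cwd, -V, -o, -e) are assumptions made for illustration. The here-document simply writes the script to a file so that it can be inspected and edited before submission.

```shell
# Write a sample Grid Engine script for a serial Abaqus run.
cat > abaqus_serial.sh <<'EOF'
#!/bin/bash
# Run the job under bash, from the directory where it was submitted.
#$ -S /bin/bash
#$ -cwd
# Export the environment of the submitting shell (including the Abaqus setup).
#$ -V
# Files for standard output and standard error.
#$ -o test001.o
#$ -e test001.e
# The interactive option keeps Abaqus in the foreground of the batch job.
abaqus job=test001 inp=testsys.inp scratch=/scratch/hpcXXXX interactive
EOF

# Submit from the GridEngine submit host with:
#   qsub abaqus_serial.sh
```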
The advantage of submitting jobs via load-balancing software is that it will automatically find the resources required and put the job onto a set of processors that have a low load. This helps the job execute faster. Note that the usage of Gridengine for all production jobs on the HPCVL clusters is mandatory. Production jobs with a running time of more than 3 hours that are submitted outside of the load balancing software will be terminated by the system administrator.
The Abaqus jobs that you will want to run on the HPCVL machines are likely to be quite large. To utilize the parallel structure of a cluster such as ours, Abaqus offers several options to execute the solver in a parallel environment, i.e. on several CPUs simultaneously.
HPCVL clusters consist of several interconnected nodes, each of which is a shared-memory machine with up to 512 cores or processors. The cluster is able to execute both distributed-memory parallel programs (usually employing MPI), and shared-memory (multi-threaded) programs. The Abaqus software achieves a certain degree of parallel scaling using both of these methods. The parallel portions of Abaqus are restricted to the solver and operations on the elements. Here is a list of operations with the corresponding parallel mode that Abaqus supports:
Element operations - MPI only
Iterative solver - MPI or threads
Direct solver - Threads only
Lanczos solver - Threads only
Note that at present only the shared-memory parallelism is in use on our clusters. It is necessary to decide before a parallel Abaqus run which parallel mode (if any) is to be used (on our clusters, use "threads"), and how many processes are to be started.
Production jobs on the HPCVL Clusters must be submitted via the Grid Engine scheduling software. Since most parallel Abaqus jobs fall into this category, we have made a sample script for Gridengine submission. Note that Grid Engine allocates all processors on a single node.
Processes are not the only resources that need to be allocated when a parallel Abaqus job is submitted. Since the Abaqus license is limited, a scheme must be applied that determines if there are still enough license tokens available. Therefore a special parallel environment abaqus.pe is used. This is expressed in the "#$ -pe" line in the above sample scripts. Note that the following limitations apply for Abaqus production jobs:
This is to ensure fair access to the limited number of tokens and to avoid shared-memory problems that occur on some nodes if too many processes are used for a single Abaqus job.
Grid Engine is able to interact with the Abaqus license manager to check if sufficient licenses are available for running. This keeps the scheduler from starting a job because enough processors are available, only for it to be stopped again because there are not enough licenses. Grid Engine keeps an internal counter of available "token slots" which gets updated frequently. Every time Grid Engine attempts to schedule an Abaqus job and is kept from doing so because not enough licenses are available, it will "requeue" the job. Since each requeue triggers an email if the email notification line (#$ -m) is present, this line should be omitted, otherwise your mailbox will be flooded with notifications. Instead, Grid Engine was configured to send notification at the beginning and end of job execution whenever the email definition line (#$ -M) is present. Therefore, if you want to be notified, include the #$ -M line; otherwise omit it.
After altering the script by substituting the placeholder items, it can be submitted to the GridEngine with the qsub command from sfnode0 (which is the GridEngine submit host). Note that the job will appear as a parallel job on the GridEngine's qstat or qmon. Note also that submission of a parallel job in this way is only profitable for large systems that use many CPU cycles, since the overhead for assigning processes, preparing nodes, and communication between them is considerable.
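As a rough illustration of the points above, a parallel batch script could be put together as follows. This is a sketch under assumptions, not the official HPCVL sample script: the file name abaqus_parallel.sh, the slot count of 4, the job and input names, and the e-mail address are placeholders. It requests the abaqus.pe parallel environment, passes the number of granted slots (the Grid Engine variable NSLOTS) to cpus=, uses mp_mode=threads, and contains a "#$ -M" line but no "#$ -m" line.

```shell
# Write a sample Grid Engine script for a shared-memory parallel Abaqus run.
cat > abaqus_parallel.sh <<'EOF'
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
# Export the environment of the submitting shell (including the Abaqus setup).
#$ -V
# Request 4 slots in the token-aware Abaqus parallel environment.
#$ -pe abaqus.pe 4
# Address for begin/end notifications (placeholder).
#$ -M your.name@example.com
# NSLOTS is set by Grid Engine to the number of slots granted.
abaqus job=test001 inp=testsys.inp scratch=/scratch/hpcXXXX \
       cpus=$NSLOTS mp_mode=threads interactive
EOF

# Submit from sfnode0 with:
#   qsub abaqus_parallel.sh
```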
HPCVL supplies a small cluster of AMD-Opteron based Linux machines to support Abaqus versions higher than 6.5. This "mini-cluster" presently consists of 12 nodes with a total of 140 cores, running Abaqus 6.7, 6.9, and 6.10. This cluster is only to be used if a newer Abaqus version is necessary. Because of the limited number of cores per node, only jobs with up to 8-12 processors can be run on it. Also, the total memory of each node is more limited than on the main M9000 servers, meaning that jobs with very large memory requirements cannot be submitted to this cluster.
To submit a job to the mini-cluster, a modified submission script must be used. The only difference in the script is that a special Abaqus queue is used for the mini-cluster. For this the lines
#$ -q abaqus.q
have been inserted at the top. It is also important to use a different "usepackage" setup line to request the correct version of Abaqus (for instance, 6.7) before submitting the script. If the standard Abaqus 6.5 setup is used, jobs submitted to the mini-cluster will fail. Likewise, standard jobs submitted to the default clusters will fail if Abaqus was set up to run version 6.7.
Abaqus is a very complex software package, and requires some practice to be used efficiently. In this FAQ we cannot explain its use in any detail. Online documentation for the programs is available on machines where Abaqus is installed. On the login node, it can be accessed with a web browser.
Note that you have to start the browser on the login node (use firefox) because we do not have a webserver running on the cluster (for security reasons). A PDF version of the Abaqus documentation is also available on the system.
If you have problems with the GridEngine, read our FAQ on that subject, and maybe consult the manual for that software, which is accessible as a PDF file. HPCVL also provides user support in the case of technical problems. Contact us; we might be able to help, or pass you on to someone who can.