Compute Canada

How do I submit parallel Gaussian jobs?

Please note: The FAQ pages at the HPCVL website are continuously being revised. Some pages might pertain to an older configuration of the system. Please let us know if you encounter problems or inaccuracies, and we will correct the entries.

To run Gaussian on several processors on HPCVL (which is encouraged, since this is a multi-processor machine), include the line

%Nproc=number_of_processors

in the input file for your job, where number_of_processors is the number of processors you want Gaussian to use (see below for details).
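
As an illustration only, the following sketch writes a small input file that requests 4 processors. The file name water.g09, the route line, and the geometry are placeholders chosen for this example and do not come from the HPCVL setup; only the %Nproc line is the point here.

```shell
# Write a minimal Gaussian input file that requests 4 shared-memory processors.
# File name, method, and geometry are illustrative placeholders.
cat > water.g09 <<'EOF'
%Nproc=4
# HF/6-31G(d) Opt

Water geometry optimization

0 1
O   0.000000   0.000000   0.117790
H   0.000000   0.755453  -0.471161
H   0.000000  -0.755453  -0.471161

EOF

# Show which processor count the input file requests.
grep -i '^%nproc' water.g09
```

The blank line before the end of the file is intentional, since Gaussian expects the molecule specification to be terminated by a blank line.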

For production jobs, especially those involving multiple processes, you must submit a Gaussian job script to our load-balancing software, Grid Engine (see our SGE FAQ for details). This is mandatory. The script assumes that your environment has already been set up properly and that you are a member of the g98 group.

By editing the script, you supply the directory in which you work, the email address at which you want to be notified, and the names of the input and output files. Note that you have to replace the entries that are enclosed in angle brackets (<>) in the script.

The script (let's call it g09.sh) is submitted by the qsub command:

qsub g09.sh

This must be done from the working directory, i.e. the directory that contains the input file and is supposed to contain the output.

Gaussian can execute major portions of its code on multiple processors. On Sun computers, this feature is implemented through shared-memory programming. The %Nproc line in the input file causes Gaussian to use up to number_of_processors CPUs in the calculation. However, it is not acceptable to start such a job on the standard serial Grid Engine queue for production jobs. If you submit a parallel Gaussian job, your SGE script must include the lines

#$ -pe gaussian.pe <number of processes>
. /opt/gaussian/setup.sh

where the entry enclosed in angle brackets (<>) stands for the number of processors requested and must be identical to the number in the input file. This ensures that Grid Engine knows how many processors are used and allocates resources accordingly. Your parallel job will then scale reasonably well, because it runs on dedicated processors without oversubscription.
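
Because the two numbers must agree, it can help to check them before submitting. The following is a minimal sketch of such a check, not a tool provided by HPCVL; the function names are hypothetical, and it assumes the input file uses the %Nproc form shown above and that the job script contains a "#$ -pe gaussian.pe" line with a plain number.

```shell
#!/bin/bash
# Hypothetical helper: compare %Nproc in a Gaussian input file with the
# slot count on the "#$ -pe gaussian.pe" line of an SGE job script.

nproc_in_input() {
    # Print the number after "%Nproc=" (case-insensitive), first match only.
    sed -n 's/^%[Nn][Pp][Rr][Oo][Cc]=\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

slots_in_script() {
    # Print the number after "#$ -pe gaussian.pe", first match only.
    sed -n 's/^#\$ *-pe  *gaussian\.pe  *\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

check_nproc_match() {
    local input_file=$1 job_script=$2
    local n_input n_slots
    n_input=$(nproc_in_input "$input_file")
    n_slots=$(slots_in_script "$job_script")
    if [ -z "$n_input" ] || [ -z "$n_slots" ]; then
        echo "could not find a processor count in one of the files" >&2
        return 2
    fi
    if [ "$n_input" -ne "$n_slots" ]; then
        echo "mismatch: %Nproc=$n_input but -pe gaussian.pe $n_slots" >&2
        return 1
    fi
    echo "ok: both files request $n_input processors"
}
```

A call such as check_nproc_match water.g09 g09.sh (both file names are examples) returns a non-zero status on a mismatch, so it can be used to guard the qsub command.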

Note that we use a special parallel environment, gaussian.pe, for Gaussian submissions. It schedules all Gaussian jobs to a dedicated node. The second line, which sources "setup.sh", redirects scratch-file I/O to a fast local disk. This greatly increases Gaussian performance in some cases and automatically removes scratch files when they are no longer needed.

Important: Please do not use any PE other than gaussian.pe for Gaussian job submissions and make sure you include the "setup.sh" line.
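
To show how these pieces might fit together, here is a sketch of a small shell function that prints a job script. It is not the official HPCVL template, which may differ. Only the "-pe gaussian.pe" and ". /opt/gaussian/setup.sh" lines are taken from the text above; the function name, the job name, the email address, the other Grid Engine options (-S, -N, -cwd, -V, -M, -m, -o, -e), the output file names, and the "g09 < input > output" invocation are assumptions made for this example.

```shell
#!/bin/bash
# Hypothetical generator for a Gaussian SGE job script.
# Arguments: job name (input file is <job>.g09), processor count, email address.
write_gaussian_job() {
    local job=$1 nproc=$2 email=$3
    # Unquoted here-document: \$ yields a literal "$", variables are expanded.
    cat <<EOF
#!/bin/bash
#\$ -S /bin/bash
#\$ -N $job
#\$ -cwd
#\$ -V
#\$ -M $email
#\$ -m be
#\$ -o $job.sge.out
#\$ -e $job.sge.err
#\$ -pe gaussian.pe $nproc
. /opt/gaussian/setup.sh
g09 < $job.g09 > $job.log
EOF
}

# Example: generate g09.sh for a 4-processor job (placeholder name and address).
write_gaussian_job water 4 user@example.com > g09.sh
```

The resulting g09.sh would then be submitted with qsub g09.sh from the working directory, with %Nproc=4 in water.g09 to match the 4 slots requested.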

There is an easier way to do this: we supply a small Perl script called GaussSubmit that can be called directly. It asks a few basic questions, such as the name of the job to be submitted and the number of processes to be used in the job. Simply type

GaussSubmit

and answer the questions. The script expects a Gaussian input file with "file extension" .g09 to be present and will do everything else automatically. This is meant for simple Gaussian job submissions. More complex job submissions are better done manually.