ADF

Please note: The FAQ pages at the HPCVL website are continuously being revised. Some pages might pertain to an older configuration of the system. Please let us know if you encounter problems or inaccuracies, and we will correct the entries.

This is a short FAQ on how to use the computational Quantum Chemistry code "ADF" on HPCVL clusters. HPCVL requires all users of this software to sign a statement. The software is available on all nodes of the cluster, but can only be accessed by persons who belong to a specific Unix group. See details below. Note that this FAQ talks about the 2009.1 version of ADF, which is different from previous versions in that it uses MPI (instead of PVM) for parallel runs. If you haven't used ADF for a while, you might find re-reading this FAQ worthwhile.

What is ADF?

ADF stands for "Amsterdam Density Functional" and denotes a package of programs that uses Density Functional Theory (DFT) for electronic and molecular structure calculations. The package is geared towards Chemists and Physicists with an interest in the structure of molecules and solids.

The ADF package consists of two main components:

  • ADF for molecular calculations
  • BAND for calculations on solids

Unlike most other molecular, solid-state, and electronic structure codes, ADF employs "Slater-type" basis sets, i.e., functions that have an exponential behavior, which are more suitable for the description of chemical systems than the more commonly employed "Gaussian-type" ones. The downside is a set of computational difficulties that can be circumvented by numerical integration. Since DFT depends largely on numerical integration anyhow, the "Slater approach" is particularly well-suited for DFT code.

ADF is arguably the best DFT code available at this time for transition metal compounds and solids.

ADF handles geometry optimizations, transition states, reaction paths, and infrared frequencies. It allows the calculation of a variety of properties, ranging from UV spectra (requiring the treatment of excited states) to NMR chemical shifts and spin-spin couplings (where the use of Slater-type bases is of great use). The BAND code can be used for calculations on polymers, surfaces and bulk solids.

Where is ADF located and how do I access it?

The present version of ADF is 2009.1. The programs in the ADF package reside in /opt/adf/bin. To use ADF on HPCVL machines, it is required that you read our licensing agreement and sign a statement. You will then be made a member of a Unix group adf, which enables you to run the software.
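Before trying to run the software, it can be useful to confirm that your account has already been added to the group. The following is a minimal sketch, assuming a POSIX-style id command; the function name in_unix_group is hypothetical and not part of ADF or of the HPCVL setup.

```shell
# Return success if the current user belongs to the given Unix group.
# "adf" is the group name given in this FAQ; in_unix_group is a hypothetical helper.
in_unix_group() {
    # id -Gn prints all group names of the current user on one line
    case " $(id -Gn) " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

if in_unix_group adf; then
    echo "You are a member of group adf."
else
    echo "You are not (yet) a member of group adf."
fi
```

If the check reports that you are not a member, the signed statement has probably not been processed yet.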

How do I run ADF interactively?

The following instructions assume that you are a member of the Unix group adf. The instructions in this section are only useful if you need to run test jobs of a short duration. If you want to run a production job, please refer to the instructions on how to start an ADF batch job.

Both package components ADF and BAND require the setting of environment variables to properly function.

(bash syntax):

export ADFHOME=/opt/adf 
export ADFBIN=$ADFHOME/bin
export ADFRESOURCES=$ADFHOME/atomicdata
export SCMLICENSE=$ADFHOME/license
export LD_LIBRARY_PATH="/opt/studio12/SUNWspro/lib":$LD_LIBRARY_PATH
export SCM_TMPDIR=$SCRATCHDIR
export SCM_USETMPDIR="yes"
export NSCM=8
export MPIDIR="/opt/SUNWhpc/HPC7.0"

The environment variables ADFHOME, ADFBIN and ADFRESOURCES are necessary for proper program execution and are used by the system to find executables and data files such as basis sets. SCMLICENSE is used by the license manager of the program to find a machine-specific file which ensures that the code cannot be run outside of the terms of the license agreement that exists between HPCVL and the software provider SCM. The variable NSCM gives the default number of processors for a program run. The actual number of processors the software uses will be set explicitly in the case of a parallel run.

Fortunately, the above settings can be applied through a call to usepackage on our system. Issuing the command

use adf

will take care of this, as well as including the $ADFBIN directory in the $PATH. This command may also be placed into a login shell setup file.
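To verify that the settings are actually in place in the current shell, a small check along the following lines may help. This is only a sketch: the function name check_adf_env is hypothetical, and the list of variables is simply taken from the settings shown in this FAQ.

```shell
# Report which of the ADF-related variables from this FAQ are unset or empty.
# check_adf_env is a hypothetical helper, not part of ADF or usepackage.
check_adf_env() {
    missing=0
    for name in ADFHOME ADFBIN ADFRESOURCES SCMLICENSE SCM_TMPDIR NSCM; do
        # Look up the variable whose name is stored in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "not set: $name"
            missing=1
        fi
    done
    return "$missing"
}

if check_adf_env; then
    echo "ADF environment looks complete."
fi
```

The function returns a non-zero status if any variable is missing, so it can also be used as a guard at the top of a script.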

Once the environment variables are set, the program(s) can be run:

For ADF: adf <in >out 
For BAND: band <in >out

Instructions about the job are read from standard input, which in the above command lines has been redirected from a file named in. Commonly, an input file is constructed to specify what calculation is to be run. The output of the program(s) goes to standard output, which has been redirected to a file named out above. Note that the output of these programs is commonly thousands of lines long and should therefore always be redirected to a file.
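The redirection pattern can be wrapped in a small helper that refuses to start when the input file is missing. This is a sketch under assumptions: run_redirected and the variable ADF_PROGRAM are hypothetical names introduced here for illustration; ADF_PROGRAM defaults to the adf command named in this FAQ and can be set to band instead.

```shell
# Run an ADF-style program with input and output redirected to files.
# run_redirected and ADF_PROGRAM are hypothetical; ADF_PROGRAM defaults to "adf".
run_redirected() {
    input=$1
    output=$2
    if [ ! -r "$input" ]; then
        echo "cannot read input file: $input" >&2
        return 1
    fi
    # The program reads the job description from stdin and writes results to stdout
    "${ADF_PROGRAM:-adf}" <"$input" >"$output"
}

# Usage (file names are placeholders): run_redirected in out
```

Keeping the program name in a variable makes it possible to try the helper with a harmless command such as cat before using it with the real program.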

The construction of a proper input file for ADF is an involved process, and is outside the scope of this FAQ. Detailed instructions can be found in the ADF User's Guide or the BAND User's Guide, respectively, which should be studied in any case before the programs can be used properly. As an initial hint, we include a sample input file here. The input consists of several units, separated by blank lines, each starting with a keyword and ending with the statement END. For instance, the atoms in a molecule may be specified by issuing the keyword atoms, followed by one line per atom with the atom name and its Cartesian coordinates, and closing with end (keywords are case insensitive).
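To make the structure of such a unit concrete, the following sketch writes just the atoms unit described above to a file. It is deliberately not a complete ADF input; a real job needs further units as documented in the User's Guide. The file name and the coordinates (a water molecule, assumed to be in Angstrom) are illustrative only.

```shell
# Write only the atoms unit described above; this is NOT a complete ADF input.
# File name and coordinates (water, assumed Angstrom) are illustrative.
cat > water_atoms.txt <<'EOF'
atoms
  O   0.000   0.000   0.000
  H   0.757   0.586   0.000
  H  -0.757   0.586   0.000
end
EOF

# Count the atom lines between the keyword and the closing "end" (prints 3)
sed -n '/^atoms$/,/^end$/p' water_atoms.txt | grep -c '^ '
```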

How do I set up and run a parallel ADF batch job?

In most cases, you will run ADF in batch mode.

Production jobs are submitted to our systems via the Grid Engine, a load-balancing software. To obtain details, read our Grid Engine FAQ. For an ADF batch job, this means that rather than issuing the above commands directly, you wrap them into a Grid Engine batch script. For an example of such a batch script, please click here.

This script needs to be altered by replacing all the relevant items enclosed in {} with the proper values. It will set all the necessary environment variables (make sure you issued a use adf statement before using this), and then start the program. The lines in the script that start with #$ are interpreted by the operating system as mere comments, but by the Grid Engine load balancing software as directives for the execution of the program.

For instance the line "#$ -m be" tells the Grid Engine to notify the user via email when the job has started and when it is finished, while the line beginning with "#$ -M" tells the Grid Engine about the email address of the user.

The lines starting with #$ -o and #$ -e determine where the standard output and the standard error, respectively, are to be redirected. Since the job is going to be executed in batch, no terminal is available as a default for these. Note that no further redirection using > is therefore necessary. All file names and directory names appearing in the script should be given in full to avoid ambiguities. One of the most common mistakes when writing Grid Engine scripts is redirecting output to inaccessible files.
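Since inaccessible output files are such a common mistake, it can be worth checking the paths before submitting. The following is a rough sketch; check_sge_paths is a hypothetical helper, it only looks at the -o and -e directive lines, and it assumes the paths contain no spaces.

```shell
# List the -o and -e targets of a Grid Engine script and flag any whose
# directory is missing or not writable. check_sge_paths is a hypothetical helper.
check_sge_paths() {
    script=$1
    status=0
    # Directive lines look like: "#$ -o /full/path/file"
    for path in $(awk '$1 == "#$" && ($2 == "-o" || $2 == "-e") { print $3 }' "$script"); do
        dir=$(dirname "$path")
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            echo "ok: $path"
        else
            echo "not writable: $path"
            status=1
        fi
    done
    return "$status"
}

# Usage: check_sge_paths batch_file_name
```

The check runs on the machine where it is called, so it is only meaningful if the same file systems are visible on the compute nodes.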

The ADF package is able to execute on several processors simultaneously in a distributed-memory fashion. This means that some tasks, such as the calculation of a large number of matrix elements or numerical integrations, may be done in a fraction of the time it takes to execute on a single CPU. For this, the processors on the cluster need to be able to communicate. To this end, the SUN version of ADF uses MPI (Message Passing Interface), a well-established communication system.

Because ADF uses a specific version of the parallel system MPI (ClusterTools 7), executing the use adf command will also cause the system to "switch" to that version, which might have an impact on jobs that you are running from the same shell later. To undo this effect, you need to type use ct8 when you are finished using ADF and want to return to the production version of MPI (ClusterTools 8).

ADF parallel jobs that are to be submitted to Grid Engine will use the MPI parallel environment and queues already defined for the HPCVL users.

Our sample script contains a line that determines the number of parallel processes to be used by ADF. The Grid Engine will start the MPI parallel environment (PE) with a given number of slots that you specify by modifying that line:

#$ -pe dist.pe {number of processes}

where the number of processes requested replaces the expression in {}.
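Putting the directives discussed so far together, a batch script might look roughly like the sketch below. The layout is an assumption and may differ from the sample script provided by HPCVL; in particular, it does not show how that script passes the process count on to ADF. Everything in {} is a placeholder to be replaced, and the file name adf_job.sh is illustrative. The -S and -V options (shell selection and export of the current environment) are standard Grid Engine options that are not discussed in this FAQ.

```shell
# Write a sketch of a Grid Engine batch script for a parallel ADF run.
# Everything in {} is a placeholder; the layout is an assumption and may
# differ from the sample script provided by HPCVL.
cat > adf_job.sh <<'EOF'
#!/bin/bash
#$ -S /bin/bash
#$ -V
#$ -M {email address}
#$ -m be
#$ -o {full path of output file}
#$ -e {full path of error file}
#$ -pe dist.pe {number of processes}
# Standard output is already redirected by the -o directive above
adf < {full path of input file}
EOF

# List the directive lines so the placeholders still to be filled in are easy to spot
grep '^#\$ ' adf_job.sh
```

Writing the template through a quoted here-document keeps the {} placeholders and the #$ prefixes exactly as typed.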

Once properly modified, the script can be submitted to the Grid Engine by typing

qsub batch_file_name

The advantage of submitting jobs via a load balancing software is that the software will automatically find the resources required and put the job onto a node that has a low load. This will help the job execute faster. Note that the usage of Grid Engine for all production jobs on HPCVL clusters is mandatory. Production jobs that are submitted outside of the load balancing software will be terminated by the system administrator.

Luckily, there is an easier way to do all this: We are supplying a small Perl script called ADFSubmit that can be called directly and will ask a few basic questions, such as the name of the job to be submitted and the number of processes to be used in the job. Simply type

ADFSubmit

and answer the questions. The script expects an ADF input file with the file extension .adf to be present and will do everything else automatically. This is meant for simple ADF job submissions. More complex job submissions are better done manually.

Where can I get further help?

ADF is a complex software package, and requires some practice to be used efficiently. In this FAQ we cannot explain its use in detail. User's Guides for ADF and BAND can be downloaded here. The software provider SCM operates a very informative website with lots of information, including examples, manuals, FAQs, etc. There is also a User Email Group, and we encourage people who use the software regularly to join. If you have problems with the Grid Engine, read our Grid Engine FAQ on that subject, and consult the manual for that software, which is accessible as a PDF file. HPCVL also provides user support in the case of technical problems. Contact us here; we might be able to help, or pass you on to someone who can.