Compute Canada

Compilers at HPCVL

Please note: The FAQ pages at the HPCVL website are continuously being revised. Some pages might pertain to an older configuration of the system. Please let us know if you encounter problems or inaccuracies, and we will correct the entries.

This is an introduction to the native Fortran, C, and C++ compilers used on the Solaris/Sparc platform on HPCVL clusters and servers. It is meant to give the user a basic idea about the usage of the compilers and about code optimization options. The document is organized in an "FAQ" manner, i.e. a list of "obvious" questions is presented as a guideline. Please feel free to contact us if you want to see more questions included.

Where are the Fortran and C/C++ compilers located?

The Fortran and C/C++ compilers, together with the headers, libraries, and tools they need, are located in the /opt/SUNWspro directory tree. The current version is Studio 12 update 1. The compilers for F77, F90, F95, C, and C++, together with a development tool called "sunstudio", are in /opt/SUNWspro/bin. Various libraries are in /opt/SUNWspro/lib. These include dynamic libraries, so if your program complains about not finding "mickey_mouse.so", setting

LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/SUNWspro/lib 

might be a good idea. There is a lot of other material under this directory, including online documentation, so you can get help by pointing your web browser on the SunFire login node to

file:///opt/SUNWspro/docs/index.html

Which environment variables do I have to set, what does my path have to look like if I want to do program development?

For the most part, you do not have to add anything to your default setup to use program development tools such as compilers and debuggers. We are using a program called usepackage, which replaces lengthy settings of environment variables with a simple command, use. By default, you start with standard user settings that include the latest compilers and development tools. If you want to change this, you can do so by issuing the command

use package

where package stands for one of the following:

ct6 - Sun ClusterTools 6 
ct7 - Sun ClusterTools 7.1 
ct8 - Sun ClusterTools 8.1
studio12 - Sun Studio 12 Compilers and Tools
studio11 - Sun Studio 11 Compilers and Tools
studio10 - Sun Studio 10 Compilers and Tools
studio8 - Sun Studio 8 Compilers and Tools
studio7 - Sun Studio 7 Compilers and Tools
workshop6 - Sun Workshop 6 Compilers and Tools

Note that usepackage prepends the compiler directories to your PATH variable. You can do this manually, of course. In bash, use the following commands:

 export PATH=/opt/SUNWspro/bin:$PATH
 export MANPATH=/opt/SUNWspro/man:$MANPATH

These settings can go into your setup file (.bash_profile or .bashrc). The first sets your search path, the second your "manual path" (if you want to use the Unix man command). With these settings you should be able to run the development tool "sunstudio" and get started editing, compiling, and debugging programs.
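If a setup file is read more than once, the plain export lines add the same directory to PATH each time. The following is a minimal sketch of one way to avoid that. The helper name prepend_dir is our own choice for this example and is not part of usepackage; MANPATH can be handled the same way.

```shell
# prepend_dir VALUE DIR: print VALUE with DIR put in front, unless DIR
# is already one of its colon-separated entries.
# (prepend_dir is a hypothetical helper, not a system command.)
prepend_dir() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;      # already present: unchanged
        "::")     printf '%s\n' "$2" ;;      # empty value: just the directory
        *)        printf '%s\n' "$2:$1" ;;   # otherwise: put it in front
    esac
}

# Put the compiler directory in front of the search path exactly once.
PATH=$(prepend_dir "$PATH" /opt/SUNWspro/bin)
export PATH
```

Running these lines a second time leaves PATH unchanged, so they are safe to keep in both .bash_profile and .bashrc.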

How do I compile and link programs? Which compiler flags should I use?

Normally, you use the "Sun Studio" compilers in /opt/SUNWspro/bin to compile and link. To compile a Fortran 77, Fortran 90, Fortran 95, C, or C++ program, you issue the f77, f90, f95, cc, or CC commands, respectively. Compiling and linking is best done with a makefile. But you can also issue the commands by hand.

To compile:

 compiler -c [options] name.ext

(compiler = f77, f90, f95, cc, or CC; name = name of your program source file; ext = extension, i.e. f (fixed format) or f90 (free format) for Fortran (90), c for C, cpp or C for C++, etc.; [options] denotes compiler flags, which usually start with a '-')
Note for Fortran programmers: You are actually using the Fortran90 (f90) compiler even if you are compiling F77 programs. The f77 command issues additional compiler flags that concern compatibility.

To link:

 compiler -o name [options] [libraries] list 

(compiler = see above; name = name of the executable; [options] = see above; [libraries] = libraries that need to be linked in, usually given as a list of file names with full path, or as '-L' and '-l' combinations [see below]; list = list of object files, usually with .o extension)

Using the compilers and the linker in the above manner requires the proper setting of the PATH environment variable.
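As an illustration of the two steps, here is a minimal sketch that compiles and links a one-file C program by hand. The file name hello.c, the scratch directory, and the shell variable COMPILER are assumptions made for this example; the sketch also assumes that mktemp is available on your system.

```shell
# Scratch directory, so that nothing in the current directory is overwritten.
workdir=$(mktemp -d)

# A trivial C source file to compile (hypothetical example program).
cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>
int main(void) { puts("hello"); return 0; }
EOF

# COMPILER is a variable of this sketch; cc is used unless it is already set.
COMPILER=${COMPILER:-cc}

if command -v "$COMPILER" >/dev/null 2>&1; then
    # First command: compile only (-c), producing the object file hello.o.
    # Second command: link the object file into the executable hello.
    # Third command: run the result.
    "$COMPILER" -c -g -o "$workdir/hello.o" "$workdir/hello.c" &&
    "$COMPILER" -g -o "$workdir/hello" "$workdir/hello.o" &&
    "$workdir/hello"
else
    echo "$COMPILER not found: check your PATH setting" >&2
fi
```

For a Fortran source file the steps are the same with f90 or f95 in place of cc.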

There are hundreds of compiler flags, and many of them are not required most of the time. A few that are in more frequent use are:

-xOn optimizes your code. n is a number from 1 to 5; higher levels alter the code more aggressively, but also yield a greater gain. Levels up to -xO3 are generally rather safe to use. But you should, of course, always check results against an un-optimized version: they might differ.

-fast is a combination of optimization flags that is quite safe to use and often improves performance a lot. However, the resulting code is optimized specifically for the current machine architecture and cannot be executed on older Sun machines (including the UltraSparc-III). Note that this overrides the -xOn option if it comes after it, since compiler options are processed from left to right! If you use this flag for compiling, you also need to include it at the linking stage.

-g produces code that can be debugged. Unlike for other compilers, -g and -xOn are not mutually exclusive, so it is a good flag to have in the development stage of a program.

-v produces more output than you can handle, which makes it easier to track down problems.

-lname is used to bind in a library called libname.a (static) or libname.so (dynamic). This flag is used to link only.

-Ldirname is used in conjunction with -lname and lets the linker know where to look for libraries. dirname is a directory name such as /opt/studio12/SUNWspro/prod/lib.

-Rdirname is used to tell the program where to get dynamic libraries at runtime.
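The three library flags are usually given together. The sketch below only assembles and prints them; the helper lib_flags, the library name mylib, and the directory /home/user/lib are all hypothetical and stand in for your own library.

```shell
# lib_flags DIR NAME: print the link flags for a library libNAME.a or
# libNAME.so located in DIR. (lib_flags is a hypothetical helper.)
#   -LDIR   where the linker looks for libraries at link time
#   -RDIR   where the program looks for dynamic libraries at run time
#   -lNAME  which library to bind in
lib_flags() {
    printf '%s\n' "-L$1 -R$1 -l$2"
}

# Hypothetical library libmylib.so in the hypothetical directory /home/user/lib.
lib_flags /home/user/lib mylib

# The flags belong on the link line only, for example:
#   cc -o prog prog.o $(lib_flags /home/user/lib mylib)
```

When the directory is given with -R at link time, the program should be able to find that library at run time without an LD_LIBRARY_PATH entry for it.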

There are many more flags. They are documented in the man pages (man f90 or man cc), as well as in various documents that may be downloaded in PDF format from the Sun documentation website. The latter is a good place to look to resolve problems in any case. Use the search engine to obtain User's Guides and Reference Manuals.

Some compiler flags are only useful for parallel programs and will be discussed later. Sometimes there is a considerable performance gain from using specific options (such as -xchip and -xtarget), but the code becomes less general.

How do I optimize for specific machines at HPCVL?

The Studio compilers provide a variety of optimization options. Some of them are applicable to all computers with a Solaris/Sparc platform. Others are specific to a chip and/or architecture.

The most commonly used general optimization flag is the -xOn option. n is a number from 1 to 5; higher levels alter the code more aggressively, but also yield a greater gain. Levels up to -xO3 are generally rather safe to use. Parallelization options require an optimization level of 3, and enforce that level if it is not explicitly specified.

For a specific optimization, the most popular option is -fast. This is a "macro" containing an array of optimization flags that is quite safe to use and often improves performance substantially. The optimization flags used are:

-xO5
-xarch=native
-xcache=native
-xchip=native
-xpad=local
-xvector=lib (Fortran)
-dalign (Fortran)
-xmemalign=8s (C/C++)
-fsimple=2
-fns=yes
-ftrap=common (Fortran)
-ftrap=%none (C/C++)
-xlibmil
-xlibmopt
-fround=nearest (Fortran)
-xbuiltin=%all (C/C++)
-D__MATHERR_ERRNO_DONTCARE (C/C++)
-fsingle (C)
-xalias_level=basic (C)

Some of these options apply at the compile stage, others are passed to the linker. Many of them are specific to Fortran or C/C++. The macro option -fast should be specified at both the compile and link stage if these are done separately. Details about the effect of the sub-options can be found in the man pages.

Because of the -xarch, -xcache, and -xchip options implied in -fast, the latter is specific (via the native setting) to the platform on which the compilation takes place. Usually, you will compile your code on the login node, and the resulting executables will therefore be optimized for the UltraSparc-IV+ architecture of that node.

If you wish to optimize for another type of node, you can override the -xarch, -xcache, and -xchip settings explicitly. Keep in mind that overriding happens from left to right, so if you specify -fast and add an -xarch statement to the right, this will replace the implied -xarch=native setting. For the three major Solaris/Sparc platforms we are using at HPCVL, the settings are:

  • Sunfire (US-IV+) platform, "SF 25K cluster", production.q:
    -xarch=sparcvis2 -xcache=64/32/4:2048/64/4:32768/64/4 -xchip=ultra4plus
    Note that this platform is shared by the login node and the Sunfire cluster, which means that if you compile on the login node, the explicit specification of these flags is not necessary. If needed, the environment variable SFFLAGS is set to the above options.
  • M9000 (Sparc64-VII+) platform, "M9K cluster", m9k.q:
    -xarch=sparcima -xcache=64/64/2:6144/256/12 -xchip=sparc64vii
    Because our current default cluster is of this architecture, we recommend using these settings to override the ones from -fast if your code is used mostly on the M9000 servers. If needed, the environment variable M9KFLAGS is set to the above options.
  • Niagara-2 (UltraT2+) platform, "Victoria Falls cluster", vf.q:
    -xarch=sparcvis2 -xcache=8/16/4:4096/64/16 -xchip=ultraT2plus
    Because the "Victoria Falls" cluster is used for specific purposes, we recommend using these settings only if the code should be specifically optimized for this cluster. The -xarch option does not have to be specified as it is the same as on the login node. If needed, the environment variable VFFLAGS is set to the above options.

For each of these architectures, we have provided environment variables on our systems, so that specific optimization becomes easier. For instance, to optimize for the Victoria Falls cluster specifically, the VFFLAGS variable can be used on the command line (inside a makefile, write $(VFFLAGS) instead):

cc -fast $VFFLAGS test.c

In our experience, code that is compiled on the US-IV+ login node with -fast and no additional options performs well on all three of our platforms.
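If you build for more than one platform, the choice of flags can be tied to the queue name. The following is a minimal sketch only: the helper flags_for_queue is hypothetical, and the flag values are copied from the list above as fallbacks for the case that SFFLAGS, M9KFLAGS, or VFFLAGS is not set in your shell. The last line prints the compile command instead of running it.

```shell
# flags_for_queue QUEUE: print the platform flags for a queue name.
# (flags_for_queue is a hypothetical helper; the site variables are used
# when set, otherwise the values from the list above serve as fallbacks.)
flags_for_queue() {
    case "$1" in
        production.q)
            printf '%s\n' "${SFFLAGS:--xarch=sparcvis2 -xcache=64/32/4:2048/64/4:32768/64/4 -xchip=ultra4plus}" ;;
        m9k.q)
            printf '%s\n' "${M9KFLAGS:--xarch=sparcima -xcache=64/64/2:6144/256/12 -xchip=sparc64vii}" ;;
        vf.q)
            printf '%s\n' "${VFFLAGS:--xarch=sparcvis2 -xcache=8/16/4:4096/64/16 -xchip=ultraT2plus}" ;;
        *)
            echo "unknown queue: $1" >&2
            return 1 ;;
    esac
}

# The platform flags come after -fast so that they replace its native settings.
# echo only shows the resulting command line; remove it to actually compile.
echo cc -fast $(flags_for_queue m9k.q) -c test.c
```

Because options are processed from left to right, placing the platform flags before -fast would let the native settings win again.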

How can I check the performance of my serial, multi-threaded, or MPI code?

The Sun systems are equipped with a powerful interface for program development called Sun Studio. If you have the proper shell setup, you can call it by simply typing sunstudio. The program is quite complex, so we can only outline here how to use it for profiling serial and multi-threaded code. An online guide is available at

 file:///opt/SUNWspro/prod/lib/locale/C/html/index.html

on our systems. Other documentation can be found at the Sun Docs Site.

In order to analyze your program with the Sun Studio Tool, you need to compile it with the -g option. After calling

 sunstudio

a GUI will appear. Click on Analyze on the tool bar, choose File and Collect Experiment, then specify the program in the popup menu. After you press Run, data from a program run will be collected. After completion, these data will be stored in a file called test.1.er and a (hidden) directory called .test.1.er. Now you are ready to have a look at them. Close the sampling collector window and go back to the main sunstudio tool bar. Click on Analyze -> File -> Open Experiment and load test.1.er. You will get an Analyzer window that lets you see the total exclusive and inclusive time spent in various subroutines, the percentage of time used by these, and much more. Try the Metrics and the Callers-Callees windows to get more information.

If you do not like GUIs, there is a

 collect

command that lets you produce test.1.er from the command line. Check out the man pages with man collect. And if you prefer a printed report for analyzing the experiment, there is a utility that does that, called

 er_print

also documented in the man pages: man er_print. These come in handy if you do not have a desktop environment available.

This tool lets you analyze where most of the execution time in your program is spent. It can also handle multiple processes which it collects into separate experiments.
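The command-line route described above can be sketched as follows. The function name profile_run is our own, and the experiment name test.1.er is fixed here only to match the text; see man collect and man er_print for the full set of options. The program should have been compiled with -g.

```shell
# profile_run PROGRAM [ARGS...]: collect an experiment for PROGRAM and
# print the time spent per function.
# (profile_run is a hypothetical helper written for this sketch.)
profile_run() {
    if ! command -v collect >/dev/null 2>&1 ||
       ! command -v er_print >/dev/null 2>&1; then
        echo "collect or er_print not found: check your PATH setting" >&2
        return 127
    fi
    experiment=test.1.er
    collect -o "$experiment" "$@" &&      # run the program and record data
    er_print -functions "$experiment"     # print the function list
}

# Usage, for a program called ./test in the current directory:
#   profile_run ./test
```

If an experiment with the same name already exists, remove or rename it before collecting again.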

It doesn't work. Where can I get help?

All of these things are documented at http://docs.sun.com , but the mass of information on that site makes it a bit difficult to know where to look. Try using the search engine.

If you have questions that you can't resolve by checking documentation, you can contact us. We have several user support people who can help you with code migration to the parallel environment of the HPCVL facilities. If you want to start a larger project that involves making code executable on parallel machines, they might be able to help you. Keep in mind that we support many people at any given time, so we cannot do the coding for you. But we can do our best to help you get your code ready for multi-processor machines.

Of course, some programs are inherently non-parallel, and trying to make them scalable might be too much effort to be worth it. In that case, the best one can do is try to improve the serial performance by adapting the code to modern computer architectures. The performance enhancement that can be achieved is sometimes quite amazing. It seems, however, that most programs have a good potential to be executed in parallel, and a little effort in that direction often goes a long way.