Updraft User Guide

Updraft Cluster Hardware Overview

  • Capability Cluster for Large Parallel Jobs
  • 256 Dual-Quad Core Nodes (2048 total cores)
  • 2.8 GHz Intel Xeon (Harpertown) processors
  • 16 Gbytes memory per node (2 Gbytes per processor core)
  • Qlogic Infiniband DDR (InfiniPath QLE 7240) interconnect
  • Gigabit Ethernet interconnect

NFS Home Directory

Your home directory is on an NFS-mounted file system and is one choice for I/O. In terms of speed, this space typically has the worst performance of the storage options described here. It is visible to all nodes of the clusters through an auto-mounting system.

NFS Scratch (/scratch/general, /scratch/uintah)

There are two NFS-mounted scratch file systems, /scratch/general and /scratch/uintah; which one you use depends on your group. If you aren't sure which space to use, it is probably the /scratch/general space. The uintah space is reserved for the users of the uintah allocation and nodes.

This space is visible to all arches nodes and can be accessed with the path /scratch/general or /scratch/uintah. Each user is responsible for creating directories and cleaning up after their jobs. This file system is not backed up.

Local Disk (/tmp)

Local Scratch space is the storage space unique to each individual node. Local Scratch is cleaned aggressively and is not supported by CHPC. It can be accessed on each node through "/tmp". This space will be the fastest, but not necessarily the largest. Users should use this space at their own risk.

When running jobs, it is best to stage data from one storage system to another. For example, a job that isn't too large and doesn't need much time on the node should run in "/scratch/general", and the batch job should then copy its output to the user's home directory.

It is also important to keep in mind that ALL users must remove excess files on their own. Preferably this is done within the user's batch job once the computation has finished. Leaving files in any "/scratch/" space creates an impediment to other users who are trying to run their own jobs. Simply delete all extra files from any space other than your home directory when they are not in immediate use.
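
The staging and cleanup pattern described above can be sketched in shell as follows. This is only an illustration, not a CHPC-provided script: the function name run_staged and the variable SCRATCH_ROOT are hypothetical, and the command passed to the function stands in for the real computation.

```shell
# Hypothetical helper: stage inputs to scratch, run a command there,
# copy the results home, and always remove the scratch directory afterwards.
# SCRATCH_ROOT is an assumed variable; it defaults to /scratch/general.
run_staged() (
    src_dir=$1      # directory holding the input files (e.g. under $HOME)
    dest_dir=$2     # directory that receives the results
    shift 2         # remaining arguments: the command to run in scratch

    work_dir="${SCRATCH_ROOT:-/scratch/general}/${USER:-user}/job.$$"
    mkdir -p "$work_dir" "$dest_dir" || exit 1

    status=0
    # Stage in, run, and stage out; remember the first failure.
    cp -r "$src_dir"/. "$work_dir"/ &&
        cd "$work_dir" &&
        "$@" &&
        cp -r "$work_dir"/. "$dest_dir"/ || status=$?

    # Clean up the scratch directory whether or not the job succeeded.
    cd / && rm -rf "$work_dir"
    exit "$status"
)
```

A call such as run_staged $HOME/working_directory $HOME/results ./a.out would then leave nothing behind in scratch, even when the computation fails.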

CHPC resources are available to qualified faculty, students (under faculty supervision), and researchers from any Utah institution of higher education. Users can request accounts for CHPC computer systems by filling out an account request form. This can be found by following the link below or by coming into Room 405, INSCC Building. (Phone 581-5253)

Users requiring priority for their jobs may apply for a quarterly allocation of Service Units (SUs). To apply, send a brief proposal using the allocation form, which is available as either:

  • Web version: Allocation form.
  • Hardcopy: from our main office, 405 INSCC, 581-5253.

The Updraft cluster can be accessed via ssh (secure shell) at the following address:

  • updraft.chpc.utah.edu

All CHPC machines mount the same user home directories. That means that user files visible on one cluster are the same as on the other clusters. While this has the obvious benefit of not having to copy files between machines, users must be aware of this fact and make sure that they run the correct executables for each cluster platform (for example, a Myrinet MPI executable will not work on Marchingmen, which does not have Myrinet).

Another complication associated with a single home directory across all systems is the shell initialization scripts (which run at each login and set up the environment, paths, and so on). The environment, and especially the paths to applications, vary between clusters.
CHPC created a login script that determines which machine is being logged into and performs machine-specific initializations. The second goal of this script is to enable users to turn initialization on or off for specific packages installed on the cluster, e.g. to switch between different MPI distributions or to initialize variables for the use of Totalview, Gaussian, and so on.


Default .tcshrc login script for CHPC systems
Default .bashrc login script for CHPC systems

Note that those using the tcsh shell need the .tcshrc file, while those using bash need .bashrc. In the case of bash, users should also create the file .bash_profile. The easiest way is to copy the one in /etc/skel/.bash_profile into the user's home directory.

The first part of each script determines which machine is being logged into, based on the machine's operating system, IP address, or the variable UUFSCELL defined at the system level. The address list of the CHPC Linux machines is retrieved from the CHPC webserver upon each login and is stored in the file linuxips.csh or linuxips.sh. In case the webserver is down, there is a timeout of about a minute, after which the script either uses the IP address file saved from previous sessions or, if that is not available, issues a warning.
The script then finds the IP address of the machine and does host-specific initialization.

Below is an example of tcsh initialization on Updraft. It works similarly for all other CHPC clusters. The bash syntax is similar but slightly different. Lines starting with # are comments. One can turn specific package initializations on or off by placing or removing the comment character at the start of a line with a source command. Do not comment out lines that don't start with source.

else if ($UUFSCELL == "updraft.arches") then
# Commenting/uncommenting source lines below will disable/enable specified packages
# stacksize by default is very small, which causes programs with large static data to segfault
limit stacksize unlimited

# default path addon
setenv PATH "/uufs/arches/sys/bin:/uufs/$UUFSCELL/sys/bin:$PATH"
setenv MANPATH "/uufs/$UUFSCELL/sys/man:$MANPATH"
...
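
For bash users, the branch above might look roughly like the following. This is a sketch of the equivalent syntax only, not the actual CHPC .bashrc; the function name init_updraft_env is hypothetical and is used here just so that the branch can be tried outside of a login.

```shell
# Sketch of a bash-style counterpart to the tcsh branch above.
# Assumption: this mirrors the tcsh logic; it is not the actual CHPC .bashrc.
init_updraft_env() {
    if [ "$UUFSCELL" = "updraft.arches" ]; then
        # The default stack size is very small; raise it if the hard limit allows.
        ulimit -s unlimited 2>/dev/null || true

        # default path addon
        export PATH="/uufs/arches/sys/bin:/uufs/$UUFSCELL/sys/bin:$PATH"
        export MANPATH="/uufs/$UUFSCELL/sys/man:$MANPATH"
    fi
}
```

The main differences from tcsh are export in place of setenv, ulimit in place of limit, and the if ... then ... fi form of the test.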

After the numerous host-specific initialization sections, the last section of the script does a global initialization that is the same for each machine. Here one can, for example, set various command aliases, the prompt format, and so on.

In case the user also mounts the CHPC home directory on his/her own desktop (most people do), we recommend setting the variable MYMACHINE to that machine's IP address. This address can be found by issuing the command hostname -i. For example:


hostname -i
123.456.78.90

we change the MYMACHINE line in .tcshrc to:

set MYMACHINE="123.456.78.90"

Then look for the $MYIP == $MYMACHINE line in the script, and add selected customizations there.

The batch implementation on all CHPC systems includes PBS, a resource manager, and a scheduler. The scheduler on Updraft is the Moab scheduler.

Any process that runs for more than 15 minutes must be run through the batch system.

There are three steps to running in batch:

  1. Create a batch script file.
  2. Submit the script to the batch system.
  3. Check on your job.

Example tcsh Script for Updraft

Note that shell programming is exactly like running commands in the shell. You simply write into the file the commands you would like to run the same way you would write them interactively.

The following is an example script for running in PBS on Updraft. The lines at top of the file all begin with #PBS which are seen as comments to the shell, but give options to PBS and Moab. Please see the options below for the available flags.

Note that in the example below we don't specify a queue, because there is only one queue on Updraft.

In this example the job requests 96 cores, that is, 12 nodes with 8 cores each (ppn means processors per node). We recommend that you make requests this way, specifying both node and core counts. See the CHPC Batch Policies web page for more details on the Updraft batch policies.

The PBS "-l" option (2nd line) tells PBS and Moab what resources you need to run your job. Request resources that are consistent with CHPC policies and with what is available.
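
As a small illustration of how the node count, cores per node and wall time combine into a resource request, consider the following sketch. The helper names pbs_resources and total_cores are hypothetical and exist only for this example.

```shell
# Hypothetical helper: build the -l resource string from the node count,
# the cores per node (8 on Updraft) and the wall time.
pbs_resources() {
    echo "nodes=$1:ppn=$2,walltime=$3"
}

# Hypothetical helper: total number of cores for a nodes/ppn request.
total_cores() {
    echo $(( $1 * $2 ))
}

pbs_resources 12 8 24:00:00   # prints nodes=12:ppn=8,walltime=24:00:00
total_cores 12 8              # prints 96
```

The total core count is the value that is later passed to mpirun with -np.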

Regarding wall time, we suggest using the maximum, 24 hours for "general" Updraft jobs, for the first run to get an idea of how long the run takes, and then specifying a wall time 10-15% larger than the actual run time. Specifying a shorter time risks having the job killed before it finishes because it ran over the wall time. Specifying too large a time, on the other hand, may result in a longer wait in the queue, because the job does not fit into smaller free time windows.
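
The padding rule can be worked out by hand, or with a small helper such as the one below. This is a sketch only: the function name padded_walltime is hypothetical, the measured run time is given in seconds, and the padding defaults to 15%.

```shell
# Hypothetical helper: pad a measured run time (in seconds) by a percentage
# and print it in the HH:MM:SS form used by the walltime resource.
padded_walltime() {
    measured=$1
    percent=${2:-15}
    # Integer arithmetic, rounding the padded value up to a whole second.
    padded=$(( (measured * (100 + percent) + 99) / 100 ))
    printf '%02d:%02d:%02d\n' \
        $(( padded / 3600 )) $(( padded % 3600 / 60 )) $(( padded % 60 ))
}

padded_walltime 36000    # a 10 hour run padded by 15%: prints 11:30:00
```

The result can be used directly in the request, for example walltime=11:30:00.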

You will need to change the email address on the #PBS -M line to your own. In the cp commands, also change working_directory to your own path.

In the case of any unscheduled downtime (such as power outages, instances where the cooling fails and the systems are taken down quickly) any jobs actively running in the batch queue will be requeued and restarted from the beginning of the job. Note that this will not happen for scheduled downtimes as the queues are drained before the system is taken down.

If this requeueing/restarting from the beginning behavior is NOT acceptable (e.g. if you want to check it first before restarting), you can add the following option to your PBS script:
#PBS -r n

Finally, in this example we suggest using the global scratch space /scratch/general, which is visible to all the nodes. However, it is a shared resource prone to load bottlenecks, and it is not backed up. We suggest users copy important data to their home directories after the executable finishes.

Example PBS Script for Updraft:

#PBS -S /bin/tcsh
#PBS -l nodes=12:ppn=8,walltime=24:00:00
#PBS -m abe
#PBS -M username@your.address.here
#PBS -N jobname

# Create scratch directory on the global scratch space
mkdir -p /scratch/general/$USER/$PBS_JOBID

# Change to working directory
cd /scratch/general/$USER/$PBS_JOBID

# Copy data files to scratch directory
cp $HOME/working_directory/data_files /scratch/general/$USER/$PBS_JOBID

# Execute parallel job
# include /uufs/updraft.arches/sys/pkg/mpich/std/bin in your path environment variable
/usr/bin/mpirun -np 96 -machinefile $PBS_NODEFILE $HOME/working_directory/a.out > outputfile
 
# Copy files back home and cleanup
cp * $HOME/working_directory && cd .. && rm -rf /scratch/general/$USER/$PBS_JOBID

PBS uses your default shell so there usually isn't the need to specify which shell to use.

Note that we are running the executable from the working_directory, but reading the input files and writing the output files in /scratch/general. /scratch/general is recommended because it is shared among the nodes, but it is not particularly fast because it is a single system.

Job Submission on Updraft

Submit your job using the "qsub" command in PBS or the "runjob" command in Moab. See the PBS commands below for additional PBS commands.

For example, to submit a script file named "pbsjob", type

qsub pbsjob

PBS sets and expects a number of variables in a PBS script. For information on these variables and necessities, enter:

man qsub

Checking On Your Job

To check if your job is queued or running, use the "showq" command in Moab.

showq

See the Moab commands below for additional Moab commands.

Moab on Updraft

The Moab scheduler uses information from your script to schedule your job. On Updraft, user groups and priorities are controlled through QOS (Quality of Service). Different QOSes have different maximum wall times. The general QOS, which is the default for most users, has a 24 hour wall clock limit. The uintah QOS, which is designed for ICSE users, also has a 24 hour wall clock limit. The special QOS for large jobs, bigrun, has a 48 hour wall clock limit. Users whose jobs need more than 300 processors can request to be added to the bigrun QOS.
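
The wall clock limits listed above can be captured in a small lookup, which may be handy for checking a request before submitting it. This is a sketch: the function names qos_limit_hours and fits_qos are hypothetical, and the limits are simply the ones quoted in this section.

```shell
# Wall clock limits in hours, taken from the paragraph above.
qos_limit_hours() {
    case "$1" in
        general|uintah) echo 24 ;;
        bigrun)         echo 48 ;;
        *)              return 1 ;;   # unknown QOS
    esac
}

# Hypothetical check: succeed only if the requested number of hours
# fits within the limit of the given QOS.
fits_qos() {
    limit=$(qos_limit_hours "$1") || return 1
    [ "$2" -le "$limit" ]
}
```

For example, fits_qos general 30 fails, while fits_qos bigrun 30 succeeds.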

For more details on the Moab batch policies, please visit the CHPC Batch Policies web page.

Users log into an interactive "front end node" and submit their program in a job script (detailed above) from this machine using PBS. On Updraft there are three locations for storage. Home directory space is common to all nodes of Updraft (and other CHPC systems). Scratch space also has a common name on all nodes; however, the physical scratch disk is local to each machine. Each storage area has different environment variables, which make it suitable for different situations.

Using PBS we are able to manage and control the use of the system to allow for fair usage of all resources. Moreover, with PBS and MPI users do not need specific knowledge of the computer nodes.

PBS Batch Script Options

  • -a date_time.  Declares the time after which the job is eligible for execution. The date_time element is in the form: [[[[CC]YY]MM]DD]hhmm[.S].
  • -e path.  Defines the path to be used for the standard error stream of the batch job. The path is of the form: [hostname:]path_name.
  • -h.  Specifies that a user hold will be applied to the job at submission time.
  • -I.  Declares that the job is to be run "interactively". The job will be queued and scheduled as PBS batch job, but when executed the standard input, output, and error streams of the job will be connected through qsub to the terminal session in which qsub is running.
  • -j join.  Declares if the standard error stream of the job will be merged with the standard output stream. The join argument is one of the following:
    • oe:  Merges the two streams into standard output.
    • eo:  Merges the two streams into standard error.
    • n:  Keeps the two streams separate (default).
  • -l resource_list.  Defines the resources that are required by the job and establishes a limit on the amount of resources that can be consumed. Users will want to specify the walltime resource, and if they wish to run a parallel job, the ncpus resource.
  • -m mail_options.  Conditions under which the server will send a mail message about the job. The options are:
    • n: No mail ever sent
    • a (default): When the job aborts
    • b: When the job begins
    • e: When the job ends
  • -M user_list.  Declares the list of e-mail addresses to whom mail is sent. If unspecified it defaults to userid@host from where the job was submitted. You will most likely want to set this option.
  • -N name.  Declares a name for the job.
  • -o path.  Defines the path to be used for the standard output stream of the batch job. The path is of the form: [hostname:]path_name.
  • -q destination.  The destination is the queue.
  • -S path_list.  Declares the shell that interprets the job script. If not specified it will use the user's login shell.
  • -v variable_list.  Expands the list of environment variables which are exported to the job. The variable list is a comma-separated list of strings of the form variable or variable=value.
  • -V.  Declares that all environment variables in the qsub command's environment are to be exported to the batch job.
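
The date_time form taken by -a can be tedious to type by hand. The sketch below converts a readable time into that form. It assumes GNU date (for the -d option), and the function name pbs_date_time is hypothetical.

```shell
# Hypothetical helper: convert a human-readable time into the
# [[[[CC]YY]MM]DD]hhmm form expected by -a.
# Assumes GNU date, which provides the -d option.
pbs_date_time() {
    date -d "$1" +%Y%m%d%H%M
}

pbs_date_time '2009-03-01 14:30'   # prints 200903011430
```

The output could then be passed on the command line, for example qsub -a 200903011430 pbsjob.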

PBS User Commands

For any of the commands listed below you may do a "man command" for syntax and detailed information.

Frequently used PBS user commands:

  • qsub. Submits a job to the PBS queuing system. See PBS Batch Script Options above.
  • qdel. Deletes a PBS job from the queue.
  • qstat. Shows status of PBS batch jobs.

Less Frequently-Used PBS User Commands:

  • qalter. Modifies the attributes of a job.
  • qhold. Requests that the PBS server place a hold on a job.
  • qmove. Removes a job from the queue in which it resides and places the job in another queue.
  • qmsg. Sends a message to a PBS batch job. To send a message to a job is to write a message string into one or more of the job's output files.
  • qorder. Exchanges the order of two PBS batch jobs within a queue.
  • qrerun. Reruns a PBS batch job.
  • qrls. Releases a hold on a PBS batch job.
  • qselect. Lists the job identifier of those jobs which meet certain selection criteria.
  • qsig. Requests that a signal be sent to the session leader of a batch job.

Moab Scheduler User Commands

  • showq - displays jobs which are running, active, idling and non-queued.
  • showbf - shows resources available for backfill.
  • showstart - shows the estimated start time of a job.
  • checkjob - displays status of a job.
  • showres - shows active reservations.

Each command accepts the -h flag, which displays help.

Moab commands are located in "/uufs/updraft.arches/sys/bin". Please see the Moab Scheduler documentation for more information.

C/C++

Updated: March 1, 2009

The Updraft cluster offers several compilers. The GNU Compiler Suite includes ANSI C, C++ and Fortran 77 compilers. The current version is 3.4.4, which ships with the RedHat EL 4 release that runs on the system.

In addition to GNU compilers, we offer three commercial compiler suites. The Intel compilers generally provide superior performance. They include C, C++ and Fortran 77/90/95.

Additionally, we have a license for the Pathscale compilers and PGI compilers. An advantage of the Pathscale compilers is full interoperability with the GNU compilers, including g77.

The Portland Group Compiler Suite is another good compiler distribution, which we have seen to perform better in some cases than Pathscale. It should also interoperate with GNU, though we have seen problems with execution of some Fortran codes that were linking g77 compiled libraries.

GNU compilers

The GNU distribution is located in the default area, that is, compilers in /usr/bin, libraries in /usr/lib or /usr/lib64, header files in /usr/include, and so on. The user should not need to do anything other than invoke the compiler by its name, e.g.:

gcc source.c -o executable

Intel compilers

The latest version of Intel C compilers is located at /uufs/chpc.utah.edu/sys/pkg/intel/icc/std

To find the compiler version, use flag -v, i.e. icc -v.

In order to use the compiler, users have to source a shell script that defines paths and some other environment variables.

  • source /uufs/chpc.utah.edu/sys/pkg/intel/icc/std/bin/iccvars.csh  (for csh/tcsh)
  • source /uufs/chpc.utah.edu/sys/pkg/intel/icc/std/bin/iccvars.sh  (for ksh/bash)

The compilers are invoked as icc, icpc and ifort for C, C++ and F90, respectively. For a list of available flags, use the man pages (e.g. man icc).

We generally recommend the -fast flag for superior performance; however, some of the optimizations enabled by this flag may lose precision for floating-point divides. Consult the icc man page for more details.

For more information on the compiler, visit Intel C++ compiler website.

Documentation including user's guide, language reference, etc. can be found here.

Pathscale compilers

The latest version of the Pathscale compilers is located at /uufs/chpc.utah.edu/sys/pkg/pscale/std

To find the compiler version, use flag --version, i.e. pathcc --version.

In order to use the compiler, users have to source a shell script that defines paths and some other environment variables.

  • source /uufs/chpc.utah.edu/sys/pkg/pscale/std/etc/pscale.csh  (for csh/tcsh)
  • source /uufs/chpc.utah.edu/sys/pkg/pscale/std/etc/pscale.sh  (for ksh/bash)

The compilers are invoked as pathcc, pathCC and pathf90 for C, C++ and F90, respectively. For a list of available flags, use the man pages (e.g. man pathcc).

Aggressive optimization is achieved with -O3 -OPT:Ofast. A further performance gain can be achieved with interprocedural analysis, invoked with the -ipa flag; however, its use has some limitations. Contact CHPC staff if you run into problems.

For more information on the compiler visit Pathscale EKOPath site.

Documentation, including whitepapers is at Pathscale documentation site.

PGI compilers

The latest version of the Portland Group compilers is located at /uufs/chpc.utah.edu/sys/pkg/pgi/std

To find the compiler version, use flag -V, i.e. pgcc -V.

In order to use the compiler, users have to source a shell script that defines paths and some other environment variables.

  • source /uufs/chpc.utah.edu/sys/pkg/pgi/std/etc/pgi.csh  (for csh/tcsh)
  • source /uufs/chpc.utah.edu/sys/pkg/pgi/std/etc/pgi.sh  (for ksh/bash)

The compilers are invoked as pgcc, pgCC, pgf77 and pgf90 for C, C++, F77 and F90, respectively. For a list of available flags, use the man pages (e.g. man pgcc).

We generally recommend flag -fastsse for good performance.

For more information on the compiler, visit Portland Group website.

Documentation including user's guide, language reference, etc. can be found here.

Fortran

Updated: March 1, 2009

The Updraft cluster offers several compilers. The GNU Compiler Suite includes ANSI C, C++ and Fortran 77 compilers. The current version is 3.4.4, which ships with the RedHat EL 4 release that runs on the system.

In addition to GNU compilers, we offer three commercial compiler suites. The Intel compilers generally provide superior performance. They include C, C++ and Fortran 77/90/95.

Additionally, we have a license for the Pathscale compilers and PGI compilers. An advantage of the Pathscale compilers is full interoperability with the GNU compilers, including g77.

The Portland Group Compiler Suite is another good compiler distribution, which we have seen to perform better in some cases than Pathscale. It should also interoperate with GNU, though we have seen problems with execution of some Fortran codes that were linking g77 compiled libraries.

GNU compilers

The GNU distribution is located in the default area, that is, compilers in /usr/bin, libraries in /usr/lib or /usr/lib64, header files in /usr/include, and so on. The user should not need to do anything other than invoke the compiler by its name, e.g.:

g77 source.f -o executable

Intel compilers

The latest version of Intel Fortran compilers is located at /uufs/chpc.utah.edu/sys/pkg/intel/ifort/std

To find the compiler version, use flag -v, i.e. ifort -v.

In order to use the compiler, users have to source a shell script that defines paths and some other environment variables.

  • source /uufs/chpc.utah.edu/sys/pkg/intel/ifort/std/bin/ifortvars.csh  (for csh/tcsh)
  • source /uufs/chpc.utah.edu/sys/pkg/intel/ifort/std/bin/ifortvars.sh  (for ksh/bash)

The compilers are invoked as icc, icpc and ifort for C, C++ and F90, respectively. For a list of available flags, use the man pages (e.g. man ifort).

We generally recommend the -fast flag for superior performance; however, some of the optimizations enabled by this flag may lose precision for floating-point divides. Consult the ifort man page for more details.

For more information on the compiler, visit Intel Fortran compiler website.

Documentation including user's guide, language reference, etc. can be found here.

Pathscale compilers

The latest version of the Pathscale compilers is located at /uufs/chpc.utah.edu/sys/pkg/pscale/std

In order to use the compiler, users have to source a shell script that defines paths and some other environment variables.

  • source /uufs/chpc.utah.edu/sys/pkg/pscale/std/etc/pscale.csh  (for csh/tcsh)
  • source /uufs/chpc.utah.edu/sys/pkg/pscale/std/etc/pscale.sh  (for ksh/bash)

The compilers are invoked as pathcc, pathCC and pathf90 for C, C++ and F90, respectively. For a list of available flags, use the man pages (e.g. man pathf90).

Aggressive optimization is achieved with -O3 -OPT:Ofast. A further performance gain can be achieved with interprocedural analysis, invoked with the -ipa flag; however, its use has some limitations. Contact CHPC staff if you run into problems.

For more information on the compiler visit Pathscale EKOPath site.

Documentation, including whitepapers is at Pathscale documentation site.

Portland Group compilers

The latest version of the Portland Group compilers is located at /uufs/chpc.utah.edu/sys/pkg/pgi/std

In order to use the compiler, users have to source a shell script that defines paths and some other environment variables.

  • source /uufs/chpc.utah.edu/sys/pkg/pgi/std/etc/pgi.csh  (for csh/tcsh)
  • source /uufs/chpc.utah.edu/sys/pkg/pgi/std/etc/pgi.sh  (for ksh/bash)

The compilers are invoked as pgcc, pgCC, pgf77 and pgf90 for C, C++, F77 and F90, respectively. For a list of available flags, use the man pages (e.g. man pgf90).

We generally recommend flag -fastsse for good performance.

For more information on the compiler, visit Portland Group website.

Documentation including user's guide, language reference, etc. can be found here.

MPI - Message Passing Interface

As Updraft is a distributed memory parallel system, message passing is the way to communicate between the processes of a parallel program. The Message Passing Interface (MPI) is the prevalent communication system and the preferred mode of parallel programming on Updraft.

InfiniPath MPI on Updraft

For best performance, Updraft uses the vendor-supplied InfiniPath MPI, which is derived from MPICH. This MPI is installed in the default system location, at /usr/bin and /usr/lib64. InfiniPath MPI uses the Pathscale compilers by default but, through a simple switch, it can be told to use any other compiler. We recommend using the Intel compilers for best performance.

Compile MPI code with Intel compilers as follows:

  • mpif77 -fc=ifort code.f -o executable
  • mpif90 -f90=ifort code.f90 -o executable
  • mpicc -cc=icc code.c -o executable
  • mpicxx -CC=icpc code.cpp -o executable

To compile with other compilers, just replace the compiler name, e.g. for the GNU compilers:

  • mpif77 -fc=g77 code.f -o executable
  • mpicc -cc=gcc code.c -o executable
  • mpicxx -CC=g++ code.cpp -o executable

To run, use the default mpirun and specify the number of processors and the host file:

  • mpirun -np $PROCS -machinefile $PBS_NODEFILE ./executable

$PROCS stands for number of processors used in the run.
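
Rather than typing the processor count by hand, it can be derived from the node file inside the batch job. The sketch below assumes that $PBS_NODEFILE contains one host name per allocated core, which is the usual layout of PBS node files; the function name count_procs is hypothetical.

```shell
# Hypothetical helper: count the cores assigned to the job.
# Assumes the node file lists one host name per allocated core.
count_procs() {
    # tr strips the padding that some wc implementations add
    wc -l < "$1" | tr -d ' '
}

# Inside a batch job one could then write (not run here):
#   PROCS=$(count_procs "$PBS_NODEFILE")
#   mpirun -np $PROCS -machinefile $PBS_NODEFILE ./executable
```

This keeps the -np value consistent with the nodes and ppn values requested from PBS.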

Note that InfiniPath MPI has several different timeouts built in, which can sometimes cause trouble. One such timeout is the MPI quiescence timeout, which kills the program if there is no communication for 15 minutes. This is very useful for detecting communication deadlocks, but for embarrassingly parallel programs, which communicate infrequently, this timeout needs to be disabled. The mpirun flag for this is -q 0.
Also, for very large jobs, the MPI startup sometimes takes longer than the timeouts for the GigE and InfiniPath connections, which are 1 minute by default. If you encounter problems with job startup, we recommend increasing these to 6 minutes (360 seconds) via the flags -t 360 -I 360.
For more details on these and other options to the mpirun command, see its man page or run mpirun --help.

MVAPICH on Updraft

Some programs that we have tried have problems running with InfiniPath MPI. Also, at present, InfiniPath MPI does not support the Totalview debugger. There is one more MPI distribution that supports the native PSM device which the InfiniPath cards on Updraft use and which achieves the best performance - MVAPICH distribution from Ohio State University.
We have built MVAPICH on Updraft with the Intel compilers and tested it on several codes that gave problems with InfiniPath MPI with a positive result. Therefore, we recommend MVAPICH as an alternative to the default InfiniPath MPI.

MVAPICH distribution compiled with Intel compilers is located in:

/uufs/updraft.arches/sys/pkg/mvapich/std_intel

To use MVAPICH on updraft, source the following shell scripts, which set the correct executable and dynamic library path:

  • source /uufs/updraft.arches/sys/pkg/mvapich/std_intel/etc/mvapich.csh (for csh/tcsh)
  • source /uufs/updraft.arches/sys/pkg/mvapich/std_intel/etc/mvapich.sh (for sh/bash)

One can then use plain compiler names, such as mpicc, mpicxx, mpif77 or mpif90.

Alternatively, the full paths to the MVAPICH scripts with Intel compilers are:

  • /uufs/updraft.arches/sys/pkg/mvapich/std_intel/bin/mpif77
  • /uufs/updraft.arches/sys/pkg/mvapich/std_intel/bin/mpif90
  • /uufs/updraft.arches/sys/pkg/mvapich/std_intel/bin/mpicc
  • /uufs/updraft.arches/sys/pkg/mvapich/std_intel/bin/mpicxx

If you don't have the path to the Intel compilers set up in your shell initialization, make sure to source the Intel compiler shell scripts before compiling and before running executables compiled this way:

  • source /uufs/arches/sys/pkg/intel/icc/std/bin/iccvars.csh intel64 (for Intel C/C++ csh/tcsh)
  • source /uufs/arches/sys/pkg/intel/ifort/std/bin/ifortvars.csh intel64 (for Intel Fortran csh/tcsh)
  • source /uufs/arches/sys/pkg/intel/icc/std/bin/iccvars.sh intel64 (for Intel C/C++ sh/bash)
  • source /uufs/arches/sys/pkg/intel/ifort/std/bin/ifortvars.sh intel64 (for Intel Fortran sh/bash)

To run, provided you sourced the MVAPICH shell script, just run:

  • mpirun_rsh -np $PROCS -hostfile $PBS_NODEFILE ./executable

$PROCS stands for number of processors used in the run.

Alternatively, one can specify the full path:

  • /uufs/updraft.arches/sys/pkg/mvapich/std_intel/bin/mpirun_rsh -np $PROCS -hostfile $PBS_NODEFILE ./executable

OpenMP - shared memory programming

All Updraft nodes have two quad-core processors, which means that shared memory programming can be used within a node to save some of the MPI message overhead. OpenMP is emerging as the major industry standard for shared memory programming, and is supported by various compilers through the command line flag -mp, -openmp or -fopenmp (depending on the compiler). More information on OpenMP can be found in the Introduction to programming with OpenMP tutorial presentation.
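
One way to combine MPI and OpenMP is to start a single MPI process on each node and let OpenMP threads use that node's 8 cores. The sketch below prepares the two pieces this needs. It assumes a node file with one line per core, the function name unique_hosts is hypothetical, and how the reduced host list is passed to mpirun depends on the MPI distribution in use.

```shell
# Hypothetical helper for a hybrid MPI/OpenMP run: reduce a node file with
# one line per core to one line per node, so that a single MPI process
# can be started on each node.
unique_hosts() {
    sort -u "$1"
}

# OMP_NUM_THREADS is the standard OpenMP variable for the thread count;
# 8 matches the number of cores on an Updraft node.
export OMP_NUM_THREADS=8
```

With 12 nodes, for instance, this gives 12 MPI processes with 8 threads each, which again uses all 96 cores.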

The Data Display Debugger (ddd) is a graphical interface which supports multiple debuggers, including the standard GNU debugger, gdb. With ddd one can attach to running processes, set conditional break points, manipulate the data of executing processes, view source code, assembly code, registers, threads, and signal states. Man pages for both ddd and gdb are available. For more information visit the following URL:

http://www.gnu.org/software/ddd/

In addition the Portland Group includes a debugger, pgdbg, along with their compiler suite.

Totalview, a de facto industry standard debugger, supports both serial and parallel debugging. For details on how to use Totalview, refer to CHPC's Totalview page.

For serial profiling, there are GNU gprof and Portland Group pgprof. For parallel profiling, there are the MPICH-bundled upshot and jumpshot and the recommended commercial product, Intel Trace Collector and Analyzer (ITAC). For details on how to use ITAC, refer to CHPC's Vampir/Vampirtrace Profiler webpage.

Linear algebra subroutines

Updated: July 1, 2004

There are several different BLAS library versions on Updraft, however, we recommend using the Intel Math Kernel Library (MKL) since it is optimized for the Intel processors.

Advanced users may want to experiment with two other BLAS libraries, GOTO Blas and Atlas. Both are being developed and optimized for the latest CPUs and their performance is on par with MKL. For example, we found that the GOTO Blas had about 2% faster performance on the High Performance Linpack (HPL) benchmark than the MKL.

Intel Math Kernel Library (MKL)

MKL is best suited for Intel based processors. Thus, on Updraft, we recommend using MKL for BLAS and LAPACK. We also recommend using Intel Fortran and C/C++ for best performance.

MKL contains highly optimized math routines. It includes fully optimized BLAS, LAPACK, sparse solvers, a vector math library, random number generators and fast Fourier transform routines (including FFTW wrappers). For more information, consult the Intel MKL webpage. The latest version available is located at /uufs/chpc.utah.edu/sys/pkg/mkl/std.

Compilation instructions:

Intel Fortran

ifort source_name.f -o executable_name -L/uufs/chpc.utah.edu/sys/pkg/mkl/std/em64t/lib -lmkl -lguide -Wl,-rpath=/uufs/chpc.utah.edu/sys/pkg/mkl/std/em64t/lib

Intel C/C++

icc (or icpc) source_name.c -o executable_name -L/uufs/chpc.utah.edu/sys/pkg/mkl/std/em64t/lib -lmkl -lguide -Wl,-rpath=/uufs/chpc.utah.edu/sys/pkg/mkl/std/em64t/lib

GOTO library

The GOTO library by Kazushige Goto is another high performance implementation of BLAS and part of LAPACK. For more information, consult the GOTO webpage. We have two builds, one built with the GNU compilers and one built with the Intel compilers. The Intel build gives slightly better performance and should be preferred. It is located in /uufs/chpc.utah.edu/sys/pkg/goto/std_intel/lib.

The GNU build should be used if the user wants to use compilers other than Intel and has problems linking the Intel GOTO build. It is in /uufs/chpc.utah.edu/sys/pkg/goto/std_gnu/lib.

Compilation instructions:

Fortran

ifort (g77, pgf77, pgf90, pathf90) source_name.f -o executable_name /uufs/chpc.utah.edu/sys/pkg/goto/std_intel/lib/libgoto.a -lpthread

C/C++

icc (icpc, gcc, g++, pgcc, pgCC, pathcc, pathCC) source_name.c -o executable_name /uufs/chpc.utah.edu/sys/pkg/goto/std_intel/lib/libgoto.a -lpthread

ATLAS

Automatically Tuned Linear Algebra Software (ATLAS) is an open source library aimed at providing a portable performance solution. It provides full BLAS and certain LAPACK routines, which are tuned to the computer platform at compilation time. This is the BLAS-compatible library that we recommend using. ATLAS has been optimized for the Opteron platform and from our tests achieves the best performance in a set of BLAS operations. The library is located at /uufs/chpc.utah.edu/sys/pkg/atlas/std/lib.

Compilation instructions:

Fortran

GNU Fortran:

g77 source_name.f -o executable_name -L/uufs/chpc.utah.edu/sys/pkg/atlas/std/lib -lf77blas -latlas

To include also the LAPACK subset in ATLAS:

g77 source_name.f -o executable_name -L/uufs/chpc.utah.edu/sys/pkg/atlas/std/lib -llapack -lcblas -lf77blas -latlas

PGI Fortran:

pgf90 (or pgf77) source_name.f -o executable_name -L/uufs/chpc.utah.edu/sys/pkg/atlas/std/lib -lpgf90blas (or -lpgf77blas) -latlas

To include also the LAPACK subset in ATLAS:

pgf90 (or pgf77) source_name.f -o executable_name -L/uufs/chpc.utah.edu/sys/pkg/atlas/std/lib -lpgf90lapack (or -lpgf77lapack) -lcblas -lpgf90blas (or -lpgf77blas) -latlas

Pathscale Fortran:

pathf90 source_name.f -o executable_name -L/uufs/chpc.utah.edu/sys/pkg/atlas/std/lib -lpathf90blas -latlas

To include also the LAPACK subset in ATLAS:

pathf90 source_name.f -o executable_name -L/uufs/chpc.utah.edu/sys/pkg/atlas/std/lib -lpathf90lapack -lcblas -lpathf90blas -latlas

C/C++

GNU C/C++:

gcc (or g++) source_name.c -o executable_name -L/uufs/chpc.utah.edu/sys/pkg/atlas/std/lib -lcblas -latlas

PGI C/C++:

pgcc (or pgCC) source_name.c -o executable_name -L/uufs/chpc.utah.edu/sys/pkg/atlas/std/lib -lcblas -latlas

Pathscale C/C++:

pathcc (or pathCC) source_name.c -o executable_name -L/uufs/chpc.utah.edu/sys/pkg/atlas/std/lib -lcblas -latlas

Portland Group BLAS

Portland Group ships its own version of BLAS library with its compilers. This is the BLAS that will get linked to your source when you use PG compilers with -lblas option. We discourage using it since it is not highly optimized. The libraries are located at $PGI/linux86/lib.

Compilation instructions (for reference only - avoid using them):

PGI Fortran

pgf90 (or pgf77) source_name.f -o executable_name -lblas

PGI C/C++

pgcc (pgCC) source_name.c -o executable_name -lblas