HPC Basics - Hello World MPI
This is a basic, step-by-step introduction to compiling and running a trivial MPI code. It's meant for new users starting out with distributed-memory parallel processing on the CHPC machines, and it uses the mvapich distribution of MPI, which targets the InfiniBand interconnect. Since this version of MPI was only available on sanddunearch at the time of this writing, that is the cluster shown in the examples. If you don't have an account or have other policy questions, please see our Getting Started at CHPC guide.
ssh sanddunearch.chpc.utah.edu
Depending upon your default shell, you will want either chpc.tcshrc or chpc.bashrc. You can tell your default shell by typing "ypcat passwd | grep" followed by your user name, which returns a line like this:
jdu:*:69999:9999:John D User:/uufs/chpc.utah.edu/common/home/jdu:/bin/bash
Look at the end of the line to see which shell you run. If you run csh or tcsh, then you'll want to get the chpc.tcshrc. If you run bash or ksh you'll want to get the chpc.bashrc. A link to these files is found in the Arches users guide. Here are the exact URLs:
http://www.chpc.utah.edu/docs/manuals/getting_started/code/chpc.tcshrc
http://www.chpc.utah.edu/docs/manuals/getting_started/code/chpc.bashrc
A quick way to get them is to run the wget command on one of the URLs above.
For example:
[u0108240@sanddunearch2:~]$ wget http://www.chpc.utah.edu/docs/manuals/getting_started/code/chpc.bashrc
--14:53:33-- http://www.chpc.utah.edu/docs/manuals/getting_started/code/chpc.bashrc
=> `chpc.bashrc.2'
Resolving www.chpc.utah.edu... 155.101.3.215, 2001:1948:414:3::d7
Connecting to www.chpc.utah.edu|155.101.3.215|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9,489 (9.3K) [text/plain]
100%[================================================================================================================>] 9,489 --.--K/s
14:53:33 (153.38 MB/s) - `chpc.bashrc.2' saved [9489/9489]
Make sure you have sourced this file. For example, if you run bash:
[u0108240@sanddunearch2:~]$ source chpc.bashrc
If you have a new CHPC account, these files are automatically placed in your home directory: .tcshrc, .bashrc, .bash_profile, .bash_logout. They should set up your account environment for running on CHPC systems.
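The choice of startup file described above can be sketched as a small shell function. This is only an illustration: rc_for_shell is a hypothetical helper name, and it assumes that $SHELL matches the shell recorded in your passwd entry.

```shell
# rc_for_shell is a hypothetical helper: it maps a login shell path
# to the CHPC startup file this guide recommends for it.
rc_for_shell() {
    case "${1##*/}" in          # keep only the last path component
        csh|tcsh) echo "chpc.tcshrc" ;;
        bash|ksh) echo "chpc.bashrc" ;;
        *)        echo "unknown"; return 1 ;;
    esac
}

# Assumption: $SHELL holds your login shell; fall back to bash if unset.
rc_for_shell "${SHELL:-/bin/bash}" || true
```

Calling rc_for_shell /bin/tcsh prints chpc.tcshrc, and rc_for_shell /bin/bash prints chpc.bashrc.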
If you don't have any source code and just want a simple example, cut and paste this into a file. Call it "hello_1.c" for this example:
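The listing for hello_1.c is not reproduced in this text. A minimal sketch, assuming the original did nothing more than initialize MPI and print one line per process (which matches the "Hello world" output shown later in this guide), can be created from the shell like this:

```shell
# Write an assumed, minimal MPI hello-world program to hello_1.c.
# The C source below is a stand-in, not the original CHPC listing.
cat > hello_1.c <<'EOF'
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);     /* start the MPI runtime */
    printf("Hello world\n");    /* every process prints one line */
    MPI_Finalize();             /* shut the MPI runtime down */
    return 0;
}
EOF
```

Each process started by the MPI launcher runs main once, so a run with 8 processes prints the line 8 times.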
hello_1.c
Then to compile hello_1.c, first set up the mvapich paths:
source /uufs/sanddunearch.arches/sys/pkg/mvapich/std/etc/mvapich.sh
Check that the path to the compiler you want is correct:
which mpicc
It should return: /uufs/sanddunearch.arches/sys/pkg/mvapich/1.0/bin/mpicc. Then compile it:
mpicc hello_1.c -o hello_1
Run it through Batch, "interactively"
Start batch for a job with 8 processors
qsub -I -l nodes=2:ppn=4
Wait for it; it will eventually return a prompt on a batch node. Once it returns, enter the following 2 commands at the command prompt:
source /uufs/sanddunearch.arches/sys/pkg/mvapich/std/etc/mvapich.sh
mpirun_rsh -rsh -np $PROCS -hostfile $PBS_NODEFILE $HOME/hello_1
You need to source the mvapich paths again since this is a new terminal and it doesn't remember your environment. Note that you don't have to source the mvapich script if full paths are used, so instead of the above 2 commands, you could enter:
/uufs/sanddunearch.arches/sys/pkg/mvapich/1.0/bin/mpirun_rsh -rsh -np 8 -hostfile $PBS_NODEFILE $HOME/hello_1
In either case it should look something like:
u0108240@sda123:~$ mpirun_rsh -rsh -np $PROCS -hostfile $PBS_NODEFILE $HOME/hello_1
Hello world
Hello world
Hello world
Hello world
Hello world
Hello world
Hello world
Hello world
u0108240@sda123:~$
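The first form of the command relies on $PROCS, which this guide never sets. Assuming your startup files do not define it either, it can be derived from $PBS_NODEFILE, which the batch system fills with one line per assigned processor (8 lines for nodes=2:ppn=4). A sketch; count_procs is a hypothetical helper name:

```shell
# count_procs is a hypothetical helper: it prints the number of lines
# in a node file, i.e. the number of processors assigned to the job.
count_procs() {
    # tr strips the padding that some wc implementations add
    wc -l < "$1" | tr -d ' '
}

# Only meaningful inside a batch job, where PBS_NODEFILE is set.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    PROCS=$(count_procs "$PBS_NODEFILE")
    echo "Running on $PROCS processors"
fi
```

With PROCS set this way, the process count follows whatever was requested in the qsub line instead of being hard-coded.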
Run the program through batch using a script
Create a file. Call it run_hello.script for this example. It should look something like:
#PBS -S /bin/bash
#PBS -l nodes=2:ppn=4,walltime=0:5:00
#PBS -M Julia.Harrison@utah.edu
#PBS -N test
source /uufs/sanddunearch.arches/sys/pkg/mvapich/std/etc/mvapich.sh
mpirun_rsh -rsh -np 8 -hostfile $PBS_NODEFILE $HOME/hello_1
Submit the job:
u0108240@sanddunearch2:arches$ qsub run_hello.script
54254.sdarm.privatearch.arches
u0108240@sanddunearch2:arches$
To check on the job, look through the queue listing:
"showq | less"
OR
"showq | grep" followed by your job number, in this case:
"showq | grep 54254"
You may or may not find your job in the list. This depends on the load on the system. It could submit and run very quickly, since this job is very trivial. At times when the system is extremely loaded, you may need to wait for several minutes or hours.
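qsub prints the job identifier when it accepts a job, so the number to grep for can be captured rather than copied by hand. A small sketch; job_number is a hypothetical helper name, not a CHPC tool, and the script name is the one used in this example:

```shell
# job_number is a hypothetical helper: it keeps the part of a job
# identifier before the first dot (54254.sdarm.privatearch.arches -> 54254).
job_number() {
    echo "${1%%.*}"
}

# Only submit when qsub is available (on the cluster) and the script exists.
if command -v qsub >/dev/null 2>&1 && [ -f run_hello.script ]; then
    jobid=$(qsub run_hello.script)
    showq | grep "$(job_number "$jobid")" \
        || echo "job not listed (it may have finished already)"
fi
```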
If all goes well you should receive 2 output files. Their names look like the last two entries in this "ls -l" listing:
u0108240@sanddunearch1$ ls -l
total 360
-rwxr-xr-x 1 u0108240 chpc 354031 2006-02-10 11:21 hello_1
-rw-r--r-- 1 u0108240 chpc 193 2006-02-10 11:22 run_hello.script
-rw------- 1 u0108240 chpc 0 2006-02-10 11:27 TESTING.e54254
-rw------- 1 u0108240 chpc 77 2006-02-10 11:27 TESTING.o54254
u0108240@sanddunearch1$
Notice that the ".e" file is 0 bytes. This is a good sign as there are no errors.
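The same check can be made without reading the listing by eye. A sketch, assuming the error file name from the example; job_clean is a hypothetical helper name:

```shell
# job_clean is a hypothetical helper: it succeeds when the given
# error file exists and has size zero.
job_clean() {
    [ -f "$1" ] && [ ! -s "$1" ]
}

# File name taken from the example in this guide; adjust to your own job.
if job_clean TESTING.e54254; then
    echo "no errors reported"
fi
```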
Look at the output file:
[u0108240@sanddunearch1:arches]$ cat TESTING.o54312
Mon May 19 15:45:55 MDT 2008
Hello world
Hello world
Hello world
Hello world
Hello world
Hello world
Hello world
Hello world

