CHPC administers a joint license with the College of Mines and Earth Sciences (CMES), which includes 56 Matlab seats and the toolboxes listed in FAQ 5 here, purchased under the campus Total Academic Headcount (TAH) license. As of R2019a, as part of the campus TAH license, we also have an unlimited license for Matlab Parallel Server, which allows running in parallel on multiple nodes. The Linux version of Matlab for CHPC and CMES Linux desktops and clusters is installed in `/uufs/chpc.utah.edu/sys/installdir/matlab/std`. We also install Matlab on CHPC and CMES Windows and Mac machines on demand.

Researchers with desktop admin rights affiliated with CHPC or CMES can contact helpdesk@chpc.utah.edu for information on how to install Matlab. Other CHPC users who need Matlab on their desktops or laptops should purchase a Matlab license from OSL.

## Matlab on Linux machines

Matlab, including many toolboxes and DCS, is installed on all clusters and Linux desktops in `/uufs/chpc.utah.edu/sys/installdir/matlab`. Several versions of Matlab are available and can be accessed by loading the appropriate version module. If the module version is not specified, the latest version is loaded.

To run Matlab, first open a terminal window and load the Matlab module to set it up in your environment: `module load matlab`

## Single instance Matlab including Parallel Computing Toolbox on one node

Although Matlab will run on the interactive nodes, please note that we don't allow running executables for longer than about 15 minutes on the interactive nodes, because of the load they put on them and the inconvenience to other users. For that reason, we recommend running Matlab through an interactive Slurm session or on the Frisco interactive nodes. To start Matlab on an interactive node, call the `matlab` command from the terminal after loading the Matlab module. To use the GUI, make sure you have accessed the machine through FastX.

Note that running a single Matlab instance in a job, even on just one node, may not use the multi-core node efficiently. Matlab uses internal multi-threading by default, running as many threads as there are available cores on the node, but not all internal Matlab functions are multi-threaded. In our experience some Matlab routines thread quite well, some only a little, and some are not threaded at all.

This Mathworks document has some information on multi-threading. To evaluate the speedup from internal multi-threading, use the `maxNumCompThreads` function as described here. One can also run the `top` command to monitor how much CPU Matlab uses; ideally the MATLAB process CPU utilization approaches 100% times the number of CPU cores on the node (e.g. 800% on an 8 core node).
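As a rough check of how well a run is threading, one could compare the MATLAB process's CPU usage against the ceiling for the node. The sketch below is illustrative only: it assumes a Linux node with `nproc` and the procps version of `top`, and it takes a single snapshot rather than monitoring continuously.

```shell
#!/bin/bash
# Ceiling that top can report for one fully threaded process: 100% per core.
cores=$(nproc)
target=$((cores * 100))
echo "Full utilization on this node is about ${target}% CPU"

# Take one batch-mode snapshot of your own processes and keep the MATLAB lines.
# If MATLAB is not running (or top is unavailable), say so instead of failing.
top -b -n 1 -u "$(id -un)" 2>/dev/null | grep -i matlab || echo "No MATLAB process found"
```

Inside a Slurm job, `nproc` typically reports the cores available to the job, so the ceiling reflects the allocation rather than the whole node.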

To run one Matlab instance in a cluster job, follow these steps:

- Start an interactive Slurm session with X forwarding, e.g.:

      srun -t 2:00:00 -n 1 -N 1 -p cluster -A account --pty /bin/tcsh -l

- Load the Matlab environment and run Matlab:

      module load matlab
      matlab

The above method has one serious limitation: it requires running an interactive Slurm job on the compute nodes, which can sometimes be difficult to obtain quickly if cluster utilization is high. It is therefore recommended, once you have made sure your Matlab program runs as intended, to run it non-interactively through Slurm scripts.

Our preferred way is to create a wrapper Matlab script that runs the program of choice, and to run this wrapper right after Matlab launches via the `-r` Matlab start option. The best way to implement this is to create a launch script that has the following three lines:

    addpath path_to_my_matlab_script
    my_matlab_script
    exit

This script adds the path to the program we want to run to the Matlab path, runs the program, and then exits Matlab. The exit is important: without it, Matlab will hang until the job runs out of walltime. See the file `run_matlab.m` for an example of the wrapper Matlab script.

Once the script is in place, in your Slurm script file just cd to the directory with your data files and run Matlab as: `matlab -nodisplay -r my_launch_script -logfile my_log.out`

Here we are telling Matlab to start without the GUI (as we don't need it in the batch session), run the launch script `my_launch_script.m`, and log the Matlab output to `my_log.out`. See the file `run_matlab.slr` for an example of a Slurm script that launches Matlab with the `run_matlab.m` script.
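Putting the launch script and the batch invocation together, a minimal sketch might look like the following. It is not the content of `run_matlab.m` or `run_matlab.slr`. The file name `my_matlab_job.slr` is hypothetical, and `account` and `cluster` are placeholders taken from the `srun` example. The sketch also assumes a bash batch shell in which `module` is available, and that the data files live in the directory the job is submitted from.

```shell
#!/bin/bash
# Write the three-line launch script; the path and program name are placeholders.
cat > my_launch_script.m <<'EOF'
addpath path_to_my_matlab_script
my_matlab_script
exit
EOF

# Write a minimal Slurm batch script; account and partition are placeholders.
cat > my_matlab_job.slr <<'EOF'
#!/bin/bash
#SBATCH --time=2:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --account=account
#SBATCH --partition=cluster

module load matlab
# Assumption: the data files are in the directory the job was submitted from.
cd "$SLURM_SUBMIT_DIR"
matlab -nodisplay -r my_launch_script -logfile my_log.out
EOF

echo "Submit with: sbatch my_matlab_job.slr"
```

Running this block only creates the two files; the job itself is submitted separately with `sbatch`.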

Alternatively, consider compiling the Matlab programs using the Matlab Compiler and running them as a standalone executable. In this case, you don't call Matlab in the Slurm script; call the compiled executable itself (that is, just replace the `matlab -r ...` line with the name of the compiled executable). The advantage of this approach is calling a single executable instead of the whole Matlab environment. The disadvantage is less flexibility in editing the Matlab source between runs, since that requires recompiling the executable. The compilation itself is an extra step, which can be complicated if the Matlab program is large.

Compiling a Matlab program is usually fairly simple. First make sure that all your Matlab programs are functions, not scripts. A function is code that starts with a `function` statement. Suppose we have functions `main.m`, `f1.m` and `f2.m`, where `main.m` is the main function. To compile these three into an executable, do: `mcc -m main f1 f2`. This will produce an executable named `main`. There are some limitations in the compilation. For this and other details, consult the Matlab Compiler help page.

Note that if you are running more than one compiled Matlab executable simultaneously, set the `MCR_CACHE_ROOT` environment variable to a unique location for each run. This variable specifies the Matlab Runtime cache location. By default it is `~/.mcrCache`, which is shared by all the runs, and this may lead to cache corruption. When running multiple SLURM jobs, set `MCR_CACHE_ROOT=/scratch/local/$SLURM_JOB_ID`.
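In a bash Slurm script, the per-job cache location could be set up along these lines before launching the compiled executable. This is a sketch under assumptions: the `SCRATCH_BASE` override and the fallback to the shell PID outside Slurm are illustrative additions, and `./main` stands for the executable built in the example above.

```shell
#!/bin/bash
# Tag the cache directory with the Slurm job ID; outside Slurm, fall back to
# this shell's PID (an assumption made only so the sketch runs anywhere).
job_tag="${SLURM_JOB_ID:-$$}"

# SCRATCH_BASE is a hypothetical override; the default follows the text above.
cache_base="${SCRATCH_BASE:-/scratch/local}"

export MCR_CACHE_ROOT="${cache_base}/${job_tag}"
mkdir -p "$MCR_CACHE_ROOT" 2>/dev/null || echo "Warning: could not create $MCR_CACHE_ROOT" >&2
echo "MCR_CACHE_ROOT=$MCR_CACHE_ROOT"

# Then launch the compiled executable, e.g. the one built with: mcc -m main f1 f2
# ./main
```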

When running a single instance of Matlab, the parallelization is limited to the threads internal to Matlab. As noted above, threading efficiency varies by routine, so it is a good idea to run the `top` command to check how much CPU Matlab actually uses.

To run multiple parallel Matlab workers, use the Parallel Computing Toolbox as described in the "Local parallel Matlab on a desktop or a compute node" section below or, if you need more workers than a single node can accommodate, use the Matlab Parallel Server (formerly Matlab Distributed Computing Server).

## Local parallel Matlab on a desktop or a compute node

Aside from automatic thread-based parallelization, Matlab offers explicit (user implemented) parallelization with the Parallel Computing Toolbox (PCT). The most common parallelization strategy is replacing `for` loops with `parfor`. While this requires some changes to the code, they are often not large. Refer to the topics under the parfor documentation for implementation strategies.

The easiest way to run PCT is directly on a single node. To start PCT, use the `parpool` command with the arguments being the 'local' parallel profile and the number of processors (called labs by Matlab), e.g. `poolobj=parpool('local',8)`. Using the 'local' profile ensures that the parallel pool runs on the local machine. When you are done, please exit the parallel pool with the `delete(poolobj)` command; this frees the PCT license for other users. We recommend embedding these two commands in your Matlab code: open the parallel pool at the start of your program and close it at the end.

Please note that if you are running more than one parallel Matlab session on a shared file system (e.g. running multiple jobs on our clusters), there is a chance of a race condition in file system I/O that results in errors when starting the parallel pool. To work around this, define a unique *Job Storage Location*, as described on this Harvard FASRC help page.
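One way to give each job its own Job Storage Location is to set it on the cluster object before opening the pool. The sketch below writes a small Matlab wrapper from the shell. The file name `pool_wrapper.m`, the directory naming scheme, and the use of `tempdir` are assumptions for illustration, not the exact recipe from the FASRC page.

```shell
#!/bin/bash
# Write a hypothetical Matlab wrapper that opens the pool with a per-job storage directory.
cat > pool_wrapper.m <<'EOF'
c = parcluster('local');                      % local profile, as in the text above
jobdir = fullfile(tempdir, ['matlab_jobs_' getenv('SLURM_JOB_ID')]);
if ~exist(jobdir, 'dir'), mkdir(jobdir); end  % one directory per Slurm job
c.JobStorageLocation = jobdir;                % keep pool metadata out of the shared default
poolobj = parpool(c, 8);                      % 8 workers, matching the earlier example
% ... parallel work (e.g. parfor loops) goes here ...
delete(poolobj);                              % release the pool when done
EOF
echo "Wrote pool_wrapper.m"
```

Because the directory name includes the Slurm job ID, two jobs started at the same time would not share pool metadata.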

As of Matlab R2014a, the Parallel Computing Toolbox maximum worker limit has been removed, so we recommend using as many workers as there are physical CPU cores on the system.

## Matlab Parallel Server (MPS)

Matlab Parallel Server, formerly known as Matlab Distributed Computing Server (MDCS), allows parallel Matlab workers to run on more than one node. Launching a job requires Matlab running on the interactive node; the parallel job is launched from within Matlab. Matlab then submits a job to the cluster scheduler and keeps track of its progress.

#### Configuring MPS and jobs

First-time users of MPS, and users setting up a new Matlab version, must configure Matlab on each CHPC cluster to run parallel jobs on that cluster with the `configCluster` command. Note that the `configCluster` command needs to be run only once per cluster, not before every job.

Then, prior to submitting a job, other parameters need to be defined. Some may be unique to the job (such as walltime), and some stay the same and need to be defined only once (such as the user's e-mail address, which the SLURM scheduler uses to send e-mails about the job status). All this information is set through the cluster object's `AdditionalProperties` after the cluster object `c` has been created by calling `c=parcluster`. Some basic understanding of SLURM scheduling is needed to enter the job parameters. Please see our SLURM help page for more information. Below are several important `AdditionalProperties`, which also support tab completion:

| Command | Description |
| --- | --- |
| `c.AdditionalProperties` | display current configuration |
| `c.AdditionalProperties.EmailAddress = 'test@foo.com';` | specify e-mail address for job notifications |
| `c.AdditionalProperties.ClusterName = 'cluster';` | set name of the cluster (notchpeak, kingspeak, ember, lonepeak, ash) |
| `c.AdditionalProperties.QueueName = 'partition';` | set partition used for the jobs |
| `c.AdditionalProperties.Account = 'account_name'` | set account used for the job |
| `c.AdditionalProperties.WallTime = '00:10:00'` | set job walltime |
| `c.AdditionalProperties.UseGpu = true;` | request use of GPUs |
| `c.AdditionalProperties.GpusPerNode = 2;` | specify how many GPUs per node to use |
| `c.AdditionalProperties.GpuType = 'k80';` | request a particular GPU type |
| `c.AdditionalProperties.RequireExclusiveNode = true;` | require an exclusive node (for nodes that allow job sharing, e.g. the GPU nodes) |
| `c.AdditionalProperties.AdditionalSubmitArgs = '-C c20';` | set additional sbatch options, in this case a constraint to use only 20 core nodes (`-C c20`) |
| `c.AdditionalProperties.MemUsage = '4GB'` | set memory requirements for the job (per worker) |

At the very least, define the `ClusterName`, `QueueName`, and `Account`.

To save changes after modifying `AdditionalProperties`, run `c.saveProfile`.

To clear a value, assign the property an empty value, e.g. `c.AdditionalProperties.EmailAddress = ''`.

#### Running independent jobs

Independent serial Matlab jobs can be submitted through the MPS interface. However, please keep in mind that if node sharing is not enabled (currently it is not, though we plan to enable it in the future), only one SLURM task, and thus one Matlab instance, will run on each node, likely not using all CPU cores on that node efficiently. Still, running independent Matlab jobs is a good way to test the functionality of MPS. Additionally, since the MPS license comes with all Matlab toolboxes, functions from toolboxes that we don't license can be accessed this way.

To submit an independent job to the cluster, use the `batch` command. This command returns a handle to the job, which can then be used to query the job and fetch the results.

    c = parcluster;              % get a handle to the cluster
    j = c.batch(@pwd, 1, {});    % submit a job; pwd reports where Matlab is running on the cluster, j is a handle to the job
    j.State                      % query the state of the job (e.g. queued, running, finished)
    j.fetchOutputs{:}            % display the results if the job is finished
    j.delete                     % delete the job
    jobs = c.Jobs                % display all jobs that have not been deleted (queued, running or finished)
    c.getDebugLog(j.Tasks(1))    % if the job gives an error, view the error log file

Note that `fetchOutputs` is used to retrieve function output arguments. If using `batch` with a script, use `load` instead. Data that has been written to files needs to be retrieved directly from the file system.

#### Running parallel jobs

Parallel Matlab jobs use the Parallel Computing Toolbox to provide concurrent execution. The most common way to achieve this is through the parfor loop statement.

For example, if we have a program `parallel_example.m`:

    function t = parallel_example
    t0 = tic;
    parfor idx = 1:16
        A(idx) = idx;
        pause(2);
    end
    t = toc(t0);   % elapsed time since t0

We can submit a parallel job on the cluster as:

    c = parcluster;                                      % get a handle to a cluster
    j = c.batch(@parallel_example, 1, {}, 'Pool', 4);    % submit a batch pool job using 4 workers
    j.State                                              % view the job status
    j.fetchOutputs{:}                                    % fetch the job results once the state is finished
    id = j.ID                                            % retrieve the Matlab job ID (MPS has its own job tracking)
    clear j;                                             % clear the handle to the job (also cleared when quitting Matlab)

Notice that MPS requests one more SLURM task than the number of workers, because one additional worker is required to manage the batch job and the pool of workers. Note also that communication overhead may slow down the parallel program if too many communicating workers are used. We therefore recommend timing your application with varying worker counts to find the optimal number.

As Matlab logs information on the jobs run through MPS, past job information can be retrieved:

    c = parcluster;            % get a handle to a cluster
    j = c.findJob('ID',4);     % find old job #4
    j.State                    % retrieve the state of this job
    c.getDebugLog(j)           % retrieve the output/error log file

A general approach to developing and running a parallel Matlab program is to develop it in the Matlab GUI with the Parallel Computing Toolbox, and then scale it up to multiple cluster nodes using MPS by calling the `batch` command with the parallel program as the function that `batch` calls.

Note that if you run a program with `parfor` or another parallelization command without explicit submission through the `batch` command, Matlab will create a cluster job automatically with the default job parameters (1 worker and 3 days wall time). This cluster job will continue running after the program finishes, until the 30 minute Matlab idle timeout is reached. To get a handle to the parallel pool created by this program and to delete the pool, which deletes the cluster job, do:

    poolobj = gcp('nocreate');
    delete(poolobj)

**Difference between parpool() and batch()**

A parallel worker pool can be initiated with either the parpool() or the batch() command.

In a program with parpool(), serial sections of the code are executed in the Matlab instance that runs the code (e.g. if Matlab runs on the interactive node, the serial sections of the code will run there). Parallel sections are offloaded to the cluster batch job (if the parallel profile defaults to the cluster profile, or is specified explicitly).

The batch() command starts a cluster batch job from the start of the function that is specified in the batch command, thus executing both the serial and parallel sections of the code inside the cluster batch job, i.e. on the cluster compute nodes.

Therefore, to minimize the performance impact on the interactive nodes, users need to submit their parallel Matlab jobs using the batch() command.

The only exception to this rule is running a Matlab job inside a single compute node, as described in the section "Local parallel Matlab on a desktop or a compute node".

#### CHPC MPS installation notes

MPS uses MPI for worker communication. Our setup uses Intel MPI rather than the stock supplied MPICH, in order to use the InfiniBand network on the clusters that have it. Intel MPI is picked up automatically.

The MPS integration scripts provided by Mathworks are located in `/uufs/chpc.utah.edu/sys/installdir/matlab/VERSION/toolbox/local/mdcs_slurm` and are added to the user path by default.

##### MPS licensing

CHPC now has an unlimited-worker MPS license through the campus TAH license, so the information below is less important, but we leave it as a reference on how to query license usage.

Before moving to the campus TAH license, CHPC had a 160 worker MDCS license, which means that up to 160 workers can run concurrently. However, keep in mind that this license is shared among all the users and clusters. The SLURM scheduler can keep track of license usage per cluster, but not across clusters. We are running MDCS with SLURM license support, so SLURM should manage the jobs in such a way that the license count of the running jobs does not exceed 160, but this holds only within a single cluster. If some MDCS jobs run on one cluster and others on another, there is a chance that the MDCS license count will be exceeded, resulting in an out-of-licenses message. Therefore, we recommend checking the current MDCS license usage on the other clusters before submitting.

The SLURM command to check the license usage is `scontrol show lic`, e.g.

    [user@ember2 ~/]$ scontrol show lic
    LicenseName=matlab_distrib_comp_engine@slurmdb
        Total=160 Used=5 Free=155 Remote=yes
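To use this check in a script, the free license count can be extracted from the `scontrol show lic` output. The sketch below parses the sample output shown above instead of calling `scontrol`, so it runs anywhere. The variable names are illustrative, and it assumes only one license is listed.

```shell
#!/bin/bash
# Sample text copied from the example above; on a cluster, use instead:
#   lic_output=$(scontrol show lic)
lic_output='LicenseName=matlab_distrib_comp_engine@slurmdb
    Total=160 Used=5 Free=155 Remote=yes'

# Pull the number that follows "Free=" on the line that carries the counts.
free=$(printf '%s\n' "$lic_output" | sed -n 's/.*Free=\([0-9][0-9]*\).*/\1/p')
echo "Free MPS worker licenses: $free"
```

With the sample text this prints `Free MPS worker licenses: 155`.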

One can also query the license server for the current license use, which lists the total license usage on all CHPC clusters:

    [user@ember2 ~/]$ $MATLAB_ROOT/etc/glnxa64/lmutil lmstat -S MLM -a | grep MATLAB_Distrib_Comp_Engine
    Users of MATLAB_Distrib_Comp_Engine: (Total of 160 licenses issued; Total of 5 licenses in use)

For more information on MPS, see the Mathworks MPS page.