Matlab

CHPC administers a joint license with the College of Mines and Earth Sciences (CMES), which includes 56 MATLAB seats and the toolboxes listed in FAQ 5 here, purchased under the campus Total Academic Headcount (TAH) license. Starting with R2019a, as part of the campus TAH license, we also have an unlimited license for MATLAB Parallel Server, which allows running in parallel on multiple nodes. The Linux version of MATLAB for CHPC and CMES Linux desktops and clusters is installed in /uufs/chpc.utah.edu/sys/installdir/matlab/std. We also install MATLAB on CHPC and CMES Windows and Mac machines on demand.

Researchers with desktop admin rights affiliated with CHPC or CMES can contact helpdesk@chpc.utah.edu for information on how to install MATLAB. Other CHPC users who need MATLAB on their desktops or laptops should purchase a MATLAB license from OSL.

Optimizing MATLAB code before moving to CHPC

Before moving your MATLAB code to CHPC machines, we highly recommend examining the code and optimizing it for performance, following these steps:

Once you do that, chances are that you may not even need to use CHPC!

MATLAB on Linux machines

MATLAB, including many toolboxes and DCS, is installed on all clusters and Linux desktops in /uufs/chpc.utah.edu/sys/installdir/matlab. Different versions of MATLAB are available and can be accessed by loading the appropriate version module. If the module version is not specified, the default version will be loaded. MATLAB follows a biannual release schedule, and we tend to install the latest version shortly after it has been released.

To run MATLAB, first open a terminal window and load the MATLAB module to set it up in your environment:
module load matlab

Strategies for starting MATLAB differ depending on which way one wants to run it, and are described below.
 

Single instance MATLAB on a desktop, interactive node or a single compute node

Although MATLAB will run on the interactive nodes, please note that we don't allow running executables for longer than ca. 15 minutes on the interactive nodes, due to the load they put on the nodes and the inconvenience to other users. For that reason, we recommend running MATLAB through an interactive Slurm session, or on the Frisco interactive nodes. To start MATLAB on an interactive node, call the matlab command from the terminal after loading the MATLAB module. In order to use the GUI, make sure you have accessed the machine through FastX.

Note that running a single MATLAB instance in a job, even on just one node, may not utilize the multi-core node efficiently. MATLAB uses internal multi-threading by default, running as many threads as there are available cores on the node, but not all internal MATLAB functions are multi-threaded. In our experience, some MATLAB routines thread quite well, some only a little, and some are not threaded at all.

This Mathworks document has some information on multi-threading. To evaluate the speedup from internal multi-threading, use the maxNumCompThreads function as described here. One can also run the top command to monitor how much CPU MATLAB uses; ideally, the CPU utilization of the MATLAB process is close to 100% times the number of CPU cores on the node (e.g. 800% on an 8 core node).
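
A minimal sketch of such a timing test is shown here. It assumes a dense matrix multiplication as a stand-in for your own computation; the matrix size n is an arbitrary choice for illustration, so substitute a representative piece of your own code.

```matlab
% Compare the run time of one operation with a single computational thread
% and with the default thread count. The matrix size n is an arbitrary,
% illustrative choice.
n = 2000;
A = rand(n);
B = rand(n);

% maxNumCompThreads(N) sets the thread count and returns the previous value
defaultThreads = maxNumCompThreads(1);   % now running with 1 thread
tStart = tic;
C1 = A * B;
tSingle = toc(tStart);

maxNumCompThreads(defaultThreads);       % restore the previous thread count
tStart = tic;
C2 = A * B;
tMulti = toc(tStart);

fprintf('1 thread: %.3f s, %d threads: %.3f s, speedup: %.2fx\n', ...
        tSingle, defaultThreads, tMulti, tSingle / tMulti);
```

A speedup close to 1 suggests that the operation is not threaded, in which case explicit parallelization is more likely to help than adding cores.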

Running MATLAB in an interactive cluster job

The best way to run MATLAB interactively is through the Open OnDemand web portal. Follow the instructions of the MATLAB interactive application to start the MATLAB GUI in a SLURM interactive job.

Alternatively, to run one MATLAB instance in a cluster job that starts from a terminal command line, follow these steps:

  • Start an interactive Slurm session with X forwarding, e.g.:

 

salloc -t 2:00:00 -n 1 -N 1 -p cluster -A account 

In order to start the interactive job quickly, we recommend using the notchpeak-shared-short partition and account, which is optimized for faster job startup.

  • Load MATLAB environment and run MATLAB
    module load matlab
    matlab

Running MATLAB in a SLURM script

Once the MATLAB program has been tested in the interactive job session, we recommend setting it up to run non-interactively through Slurm scripts. The advantage of this approach is that it can run unattended on any CHPC cluster, so one can submit many different calculations, which will be processed as system resources allow.

Our preferred way is to create a wrapper MATLAB script to run the program of choice and run this wrapper right after MATLAB launch via the -r MATLAB start option. The best way to implement this is to create a launch script that has the following three lines:

addpath path_to_my_matlab_script
my_matlab_script
exit

This script adds the path of the program we want to run to the MATLAB path, runs the program, and then exits MATLAB. The exit is important: without it, MATLAB will hang until the job runs out of walltime. See file run_matlab.m for an example of the wrapper MATLAB script.
Once the script is in place, in your Slurm script file, cd to the directory with your data files, and run MATLAB as:
matlab -nodisplay -r my_launch_script -logfile my_log.out

Here we are telling MATLAB to start without the GUI (as we don't need it in the batch session), start the launch script my_launch_script.m and log the MATLAB output to my_log.out. See file run_MATLAB.slr for an example of a Slurm script that launches MATLAB with the run_matlab.m script.
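
One caveat of the three-line launch script is that the final exit is never reached if my_matlab_script throws an error. A more defensive variant could wrap the call in try/catch, as in the following sketch. This is an illustration, not the content of the run_matlab.m example file; the path and script name are the same placeholders used in the text.

```matlab
% my_launch_script.m (hypothetical variant of the three-line launch script)
try
    addpath('path_to_my_matlab_script');  % placeholder path from the text
    my_matlab_script;                     % run the program
catch err
    % Write the error report to standard error (file identifier 2) so that
    % it shows up in the job output, then quit with a non-zero exit status
    fprintf(2, '%s\n', getReport(err));
    exit(1);
end
exit(0);                                  % normal completion
```

With this variant MATLAB exits in both the success and the failure case, and the non-zero exit status makes failed runs easier to spot.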

Creating a standalone executable from MATLAB programs

Alternatively, consider compiling the MATLAB programs using the MATLAB Compiler and running them as a standalone executable. In this case, you don't call MATLAB in the Slurm script; call the compiled executable itself (that is, just replace the matlab -r .... line with the name of the compiled executable). The advantage of this approach is calling a single executable instead of the whole MATLAB environment. The disadvantage is less flexibility in editing the MATLAB source between runs, since every change requires recompiling the executable. The compilation itself is an extra step, which can be complicated if the MATLAB program is large.

Compiling a MATLAB program is usually fairly simple. First make sure that all your MATLAB programs are functions, not scripts. A function is code that starts with the function statement. Suppose we have functions main.m, f1.m and f2.m, where main.m is the main function. To compile these three into an executable, do:
mcc -m main f1 f2
This will produce an executable named main. There are some limitations in the compilation. For this and other details, consult the MATLAB Compiler help page.

Note that if you are running more than one MATLAB compiled executable simultaneously, set the MCR_CACHE_ROOT environment variable to a unique location for each run. This variable specifies the MATLAB Runtime cache location. By default it is ~/.mcrCache, which is shared by all the runs and may lead to cache corruption. When running multiple SLURM jobs, set MCR_CACHE_ROOT=/scratch/local/$USER/$SLURM_JOB_ID.

Performance considerations

When running a single instance of MATLAB, the parallelization is limited to the threads internal to MATLAB. In our experience, some MATLAB routines thread quite well, some only a little, and some are not threaded at all. It is a good idea to run the top command to monitor how much CPU MATLAB uses; ideally, the MATLAB process uses close to 100% times the number of CPU cores on the node.

To run multiple parallel MATLAB workers, use the Parallel Computing Toolbox as described in the Parallel MATLAB on a desktop or a single compute node section below, or, if you need more workers than can be accommodated by a single node, use the MATLAB Parallel Server (formerly MATLAB Distributed Computing Server).

Parallel MATLAB on a desktop or a single compute node

Aside from automatic thread-based parallelization, MATLAB offers explicit (user-implemented) parallelization with the Parallel Computing Toolbox (PCT). The most common parallelization strategy is replacing for loops with parfor. While this requires some changes to the code, they are often small. Refer to the topics under the parfor documentation for implementation strategies.

The easiest way to run PCT is directly on a single node. To start PCT, use the parpool command with the 'local' parallel profile and the number of processors (called labs by MATLAB) as arguments, e.g. poolobj=parpool('local',8). Using the 'local' profile ensures that the parallel pool runs on the local machine. When you are done, please close the parallel pool with the delete(poolobj) command, which frees the PCT license for other users. We recommend embedding these two commands in your MATLAB code: open the parallel pool at the start of your program and close it at the end.
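
The pattern of opening the pool, running a parfor loop and closing the pool could look like the sketch below. The worker count is taken from the example above, and the loop body (a sum of squares) is only a placeholder for real, independent iterations.

```matlab
nWorkers = 8;                           % match the cores available to you
poolobj = parpool('local', nWorkers);   % start workers on the local machine

n = 1000;
sumsOfSquares = zeros(1, n);            % preallocate the sliced output
parfor i = 1:n
    % iterations are independent of each other, so they can run in any order
    sumsOfSquares(i) = sum((1:i).^2);
end

delete(poolobj);                        % shut down the pool, free the license
fprintf('Last element: %d\n', sumsOfSquares(end));
```

The loop qualifies for parfor because each iteration writes only to its own element of sumsOfSquares and does not read results of other iterations.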

This process can be done either interactively on a desktop, interactive node (friscos) or an interactive job, or, it can be wrapped in a launch script and submitted in an unattended SLURM job script, as described above.

Please note that if you are running more than one parallel MATLAB session on a shared file system (e.g. running multiple jobs on our clusters), there is a chance of a race condition in the file system I/O that results in errors when starting the parallel pool. To work around this, define a unique Job Storage Location, as described on this Harvard FASRC help page.
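
One possible way to set a unique Job Storage Location is sketched below. It assumes the SLURM_JOB_ID environment variable, which Slurm sets inside a job, and uses MATLAB's tempdir as the parent directory purely for illustration; any writable directory that is unique to the job should work.

```matlab
c = parcluster('local');

% SLURM_JOB_ID is set by Slurm inside a job; outside of a job getenv
% returns an empty value and we fall back to a random temporary name.
jobId = getenv('SLURM_JOB_ID');
if isempty(jobId)
    storageDir = tempname;
else
    % tempdir is an assumption for illustration; any directory that is
    % unique to this job and writable works
    storageDir = fullfile(tempdir, ['matlab_jobstorage_' jobId]);
end
if ~exist(storageDir, 'dir')
    mkdir(storageDir);               % the directory must exist before use
end

c.JobStorageLocation = storageDir;   % per-job storage instead of the shared default
poolobj = parpool(c, 4);             % open the pool through this cluster object
% ... parallel work goes here ...
delete(poolobj);
```

The directory is left in place after the run, so remove it at the end of the job if it is no longer needed.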

As of MATLAB R2014a, the Parallel Computing Toolbox maximum worker limit has been removed, so, we recommend using as many workers as there are physical CPU cores on the system.

MATLAB Parallel Server (MPS)

MATLAB Parallel Server, formerly known as MATLAB Distributed Computing Server (MDCS), allows running parallel MATLAB workers on more than one node. Launching a job requires MATLAB running on the interactive node; the parallel job is launched from within MATLAB. MATLAB then submits a job to the cluster scheduler and keeps track of the progress of the job.

Configuring MPS and jobs

Version R2023a and newer

We recommend creating a new cluster profile for each combination of cluster and account:partition, and naming it after that account:partition. To do so, open the Cluster Profile Manager and choose Add Cluster Profile - Clusters Using Third Party Schedulers - Slurm. Once the profile is created, rename it to reflect the account:partition chosen (e.g. notchpeak-shared-short), and modify its SubmitArguments (also called Additional Command Line Arguments for Job Submission) to add SLURM's -A and -p options, e.g. -A notchpeak-shared-short -p notchpeak-shared-short. For a wall time other than the default, also add the -t walltime option, e.g. -t 8:00:00. Other SLURM arguments, such as GPU requests or constraints, can be added in the same way.

This cluster profile can then be used in the parcluster command to start the parallel cluster, e.g. c=parcluster("notchpeak-shared-short").
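
The submit arguments can also be adjusted on the cluster object for a single MATLAB session, for example to request a longer wall time for one set of jobs. The sketch below assumes that a profile named notchpeak-shared-short has already been created as described above.

```matlab
% Load the saved profile (assumed to exist under this name)
c = parcluster("notchpeak-shared-short");

% Override the submit arguments on this cluster object only, here to add a
% wall time; the saved profile stays unchanged unless saveProfile is called
c.SubmitArguments = '-A notchpeak-shared-short -p notchpeak-shared-short -t 8:00:00';
disp(c.SubmitArguments)
```

Jobs submitted through c after this point use the modified arguments, while other sessions keep using the values stored in the profile.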

Version R2022b and older

First-time users of MPS, and users setting up a new MATLAB version or moving to a new CHPC cluster, have to configure MATLAB to run parallel jobs on that cluster with the configCluster command. Note that the configCluster command needs to be run only once per cluster, not before every job.

Then, prior to submitting a job, other parameters need to be defined. Some of them may be unique to the job (such as walltime), and some stay the same and need to be defined only once (such as the user's e-mail address that the SLURM scheduler uses to send e-mails about the job status). All of this is set through the AdditionalProperties of the cluster object c, after the object has been created by calling c=parcluster. Some basic understanding of SLURM scheduling is needed to enter the job parameters. Please see our SLURM help page for more information. Below are several important AdditionalProperties, which also support tab completion:

c.AdditionalProperties % display current configuration
c.AdditionalProperties.EmailAddress = 'test@foo.com'; % specify e-mail address for job notifications
c.AdditionalProperties.ClusterName = 'cluster'; % set name of the cluster (notchpeak, kingspeak, ember, lonepeak, ash)
c.AdditionalProperties.QueueName = 'partition'; % set partition used for the jobs
c.AdditionalProperties.Account = 'account_name'; % set account used for the job
c.AdditionalProperties.WallTime = '00:10:00'; % set job walltime
c.AdditionalProperties.UseGpu = true; % request use of GPUs
c.AdditionalProperties.GpusPerNode = 2; % specify how many GPUs per node to use
c.AdditionalProperties.GpuType = 'k80'; % request particular GPU
c.AdditionalProperties.RequireExclusiveNode = true; % require exclusive node (for nodes that allow job sharing, e.g. the GPU nodes)
c.AdditionalProperties.AdditionalSubmitArgs = '-C c20'; % set additional sbatch options, in this case a constraint to use only 20 core nodes
c.AdditionalProperties.MemUsage = '4GB'; % set memory requirements for the job (per worker)

 

At the very least, define ClusterName, QueueName, and Account.

To save changes after modifying AdditionalProperties, run c.saveProfile.

To clear a value, assign the property an empty value, e.g. c.AdditionalProperties.EmailAddress = ''.
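
Put together, a minimal first-time setup for R2022b and older might look like the following sketch. The cluster, partition and account values are examples taken from elsewhere on this page and the wall time is an arbitrary choice; replace them with your own.

```matlab
configCluster                        % run once per cluster and MATLAB version

c = parcluster;                      % handle to the configured cluster

% Minimum required settings (example values, adjust to your allocation)
c.AdditionalProperties.ClusterName = 'notchpeak';
c.AdditionalProperties.QueueName   = 'notchpeak-shared-short';
c.AdditionalProperties.Account     = 'notchpeak-shared-short';

% Optional, job-specific setting (arbitrary example value)
c.AdditionalProperties.WallTime    = '01:00:00';

c.saveProfile;                       % keep the settings for later sessions
c.AdditionalProperties               % display the resulting configuration
```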

Running independent jobs

Independent serial MATLAB jobs can be submitted through the MPS interface. However, please keep in mind that if node-sharing is not enabled (currently it is not, but we plan to enable it in the future), only one SLURM task, and thus one MATLAB instance, will run on each node, likely not utilizing all CPU cores on that node efficiently. Still, running independent MATLAB jobs is a good way to test the functionality of MPS. Additionally, since the MPS license comes with all MATLAB toolboxes, functions from toolboxes that we don't license can be accessed this way.

 To submit an independent job to the cluster, use the batch command. This command returns a handle to the job which can be then used to query the job and fetch the results.

c = parcluster; % get a handle to the cluster
j = c.batch(@pwd, 1, {}); % submit a job, pwd queries where MATLAB is running on a cluster, j is a handle to the job
j.State % query the state of the job (e.g. idle, running, finished)
j.fetchOutputs{:} % will display the results if the job is finished.
j.delete % deletes the job
jobs = c.Jobs % displays all the jobs that have not been deleted (queued, running or finished)
c.getDebugLog(j.Tasks(1)) % if the job gives an error, view the error log file

Note that fetchOutputs is used to retrieve function output arguments. If using batch with a script, use load instead. Data that has been written to files needs to be retrieved directly from the file system.
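
For the script case, the workflow could look like this sketch, where my_script is a hypothetical script name standing in for your own script file:

```matlab
c = parcluster;             % get a handle to the cluster
j = c.batch('my_script');   % my_script.m is a hypothetical script name
wait(j);                    % block until the job has finished
load(j);                    % copy the script's workspace variables to this session
delete(j);                  % remove the job and its stored data
```

After load(j), the variables that the script created on the worker are available in the local workspace under the same names.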

Running parallel jobs

Parallel MATLAB jobs use the Parallel Computing Toolbox to provide concurrent execution. The most common way to achieve this is through the parfor loop statement.

For example, if we have a program parallel_example.m, as:

function t = parallel_example
t0 = tic; % start the timer
parfor idx = 1:16
    A(idx) = idx;
    pause(2); % simulate 2 seconds of work per iteration
end
t = toc(t0); % elapsed time measured from t0

We can submit a parallel job on the cluster as:

c = parcluster;  % Get a handle to a cluster
j = c.batch(@parallel_example, 1, {}, 'Pool', 4); % Submit a batch pool job using 4 workers
j.State % View the job status
j.fetchOutputs{:} % Fetch the job results, after finished state is retrieved
id = j.ID % retrieve the MATLAB job ID (MPS has its own job tracking)
clear j; % clear the handle to the job (handle also gets cleared when quitting MATLAB)

Notice that MPS requests one more SLURM task than the number of workers. This is because one additional worker is required to manage the batch job and the pool of workers. Note also that communication overhead may slow down the parallel program if too many communicating workers are used. We therefore recommend timing your application with varying worker counts to find the optimal number.
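
Such a timing scan could be sketched as follows, reusing the parallel_example function from above, which returns the elapsed time of its parfor loop. The list of worker counts is a hypothetical choice, and the cluster profile is assumed to be configured already.

```matlab
c = parcluster;                         % handle to the configured cluster
workerCounts = [2 4 8 16];              % hypothetical counts to compare
runtimes = zeros(size(workerCounts));

for k = 1:numel(workerCounts)
    % submit one pool job per worker count and wait for it to finish
    j = c.batch(@parallel_example, 1, {}, 'Pool', workerCounts(k));
    wait(j);
    out = fetchOutputs(j);              % cell array holding the one output, t
    runtimes(k) = out{1};
    delete(j);                          % clean up the finished job
end

for k = 1:numel(workerCounts)
    fprintf('%2d workers: %6.1f s\n', workerCounts(k), runtimes(k));
end
```

The reported times cover only the parfor loop inside parallel_example, not the time the jobs spend waiting in the queue, so they reflect how the program itself scales.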

As MATLAB logs information on the jobs run through MPS, past job information can be retrieved:

c = parcluster; % Get a handle to a cluster
j = c.findJob('ID',4); % Find old job #4
j.State % Retrieve the state of this job
c.getDebugLog(j) % Retrieve output/error log file

A general approach to developing and running a parallel MATLAB program is to develop the program in the MATLAB GUI with the Parallel Computing Toolbox, and then scale it up to multiple cluster nodes using MPS by calling the batch command with the parallel program as the function that batch calls.

Note that if you run a program with parfor or another parallelization command without explicit submission with the batch command, MATLAB will create a cluster job automatically with the default job parameters (1 worker and 3 days wall time). After the program finishes, this cluster job will continue running until the 30-minute MATLAB idle timeout is reached. To get a handle to the parallel pool created by this program and to delete the pool, which deletes the cluster job, do:

poolobj = gcp('nocreate');
delete(poolobj)

Difference between parpool() and batch()

A parallel worker pool can be started with either the parpool() or the batch() command.
In a program with parpool(), serial sections of the code are executed in the MATLAB instance that runs the code (e.g. if MATLAB runs on the interactive node, the serial sections of the code will run there). Parallel sections are offloaded to the cluster batch job (if the parallel profile defaults to the cluster profile, or is specified explicitly).
The batch() command starts a cluster batch job from the start of the function that is specified in the batch command, thus executing both the serial and parallel sections of the code inside the cluster batch job, i.e. on the cluster compute nodes.

Therefore, in order to minimize performance impact on the interactive nodes, users need to submit their parallel MATLAB jobs using the batch() command.

The only exception to this rule is running a MATLAB job inside a single compute node, as described in the section "Parallel MATLAB on a desktop or a single compute node".

CHPC MPS installation notes

MPS uses MPI for worker communication. Our setup uses Intel MPI instead of the stock MPICH supplied with MATLAB, in order to use the InfiniBand network on the clusters that have it. Intel MPI is picked up automatically.

The MPS integration scripts provided by Mathworks are located in /uufs/chpc.utah.edu/sys/installdir/matlab/VERSION/toolbox/local/mdcs_slurm and added to user path by default.

MPS licensing

CHPC now has an unlimited-worker MPS license through the campus TAH license, so the information below is no longer critical, but we keep it for reference on how to query the license usage.

Before moving to the campus TAH license, CHPC had a 160 worker license of the MDCS, which means that up to 160 workers could run concurrently. However, this license was shared among all the users and clusters. The SLURM scheduler can keep track of license usage per cluster, but not across clusters. We are running MDCS with SLURM license support, so SLURM should manage the jobs in such a way that the license count of the running jobs does not exceed 160, but this holds only within a single cluster. If some MDCS jobs run on one cluster and others on another, there is a chance that the MDCS license count will be exceeded, resulting in an out of licenses message. Therefore, we recommend checking the MDCS license usage on the other clusters to get an idea of the current license usage.

The slurm command to check the license usage is scontrol show lic, e.g.

[user@ember2 ~/]$ scontrol show lic
LicenseName=MATLAB_distrib_comp_engine@slurmdb
    Total=160 Used=5 Free=155 Remote=yes

One can also query the license server for the current license use, which will list the total license usage on all CHPC clusters.

[user@ember2 ~/]$ $MATLAB_ROOT/etc/glnxa64/lmutil lmstat -S MLM -a |grep MATLAB_Distrib_Comp_Engine
Users of MATLAB_Distrib_Comp_Engine: (Total of 160 licenses issued; Total of 5 licenses in use)

For more information on MPS, see the Mathworks MPS page.

 
Last Updated: 3/28/24