Apptainer/Singularity
Apptainer (formerly Singularity) is container management software for building, maintaining, and deploying containers, and it is the go-to solution for container management on HPC systems. Containers are advantageous for many reasons, most notably:
- Using software built for an unsupported OS
- Software deployment
- Workflow management
- Reproducible research
The main advantage of Apptainer/Singularity over other container management software is its ability to maintain user-level security on an HPC system, particularly through the 'fakeroot' feature. For this reason, CHPC recommends Apptainer over other container software and does not recommend using Docker directly. However, Apptainer does allow Docker containers to be downloaded and used.
CHPC provides containers for some applications. Users can also bring in their own containers, provided they include a few accommodations for our systems, such as mount points for the home and scratch file systems. Finally, Apptainer/Singularity also allows users to import Docker containers, most commonly from container repositories such as DockerHub.
Below you will find a list of common use cases and functionalities of Apptainer/Singularity and how they interact with CHPC systems. If you feel that a common use case is missing, please click the 'Provide Feedback' button at the bottom-left corner of this page and let CHPC know.
Note: As of 2022, Singularity was replaced by Apptainer. Apptainer has the same code base as Singularity but is being developed independently and, as such, the code bases are expected to diverge over time.
Importing Containers
Apptainer and Singularity support most container formats, including Docker. Below, we list some common use cases of Apptainer/Singularity, using Docker containers as the example.
With every example below, it is assumed that the Singularity or Apptainer module is loaded:
module load singularity
or
module load apptainer
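To confirm which version you have loaded (a quick sanity check; the exact version string will vary), you can run:
singularity --version
or
apptainer --version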
Downloading a Container
Directly downloading a container can speed up container startup: when pointed at DockerHub, Singularity builds a cached Singularity container file on every shell or exec, which can take a while if the container is large. By downloading the container once, you save time in the long run.
Below are two ways to download a container, using BioBakery's most recent workflows container as an example.
singularity pull docker://biobakery/workflows:latest
The above will generate a container image called 'workflows_latest.sif'. To download the container and automatically have it renamed as something else, such as 'bioBakery.sif', the singularity build command can be used instead:
singularity build bioBakery.sif docker://biobakery/workflows
The newly created singularity containers can then be used via the `singularity shell` or `singularity exec` commands.
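For example, with the bioBakery.sif file built above (humann2 is one of the commands this container provides, as used later on this page):
singularity shell bioBakery.sif
or
singularity exec bioBakery.sif humann2 --help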
Running a Container Directly in Singularity/Apptainer
To start a shell session inside a Docker container using Singularity/Apptainer, simply point to the container URL.
singularity shell docker://ubuntu:latest
The above command prompts Apptainer/Singularity to scan the host file systems and mount them into the container automatically. This makes CHPC's non-standard /uufs and /scratch file systems visible in the container as well, which obviates the need to create mount points for these file systems manually in the container and makes DockerHub containers very easy to deploy with Singularity.
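As a quick check that these bind mounts are in place (assuming the singularity or apptainer module is loaded, which sets the bind paths), you can list the CHPC file systems from inside a container:
singularity exec docker://ubuntu:latest ls /uufs /scratch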
Similarly, we can run a program from a DockerHub container as follows:
singularity exec docker://biocontainers/blast:2.2.31 blastp -help
Note that the Biocontainers repositories require the version number tag (following the colon) for Singularity to pull them correctly. The version can be found by looking up the tag on the container's DockerHub page.
A good strategy to find a container for a needed program is to go to hub.docker.com (or other container hubs, such as BioContainers) and search for the program name.
Modifying a Container
Sometimes it is necessary to modify a container that was downloaded from a public container repository. To modify a pre-built container, we can build a sandbox container, which is a flat file system representation of the container. With this sandbox container, we can shell into the container in writable mode, make the necessary modifications, and then rebuild the container from the sandbox. This is possible with Apptainer 1.2.5 and newer.
Below is an example on how to do this with the latest Ubuntu container:
module load apptainer
apptainer build --sandbox mycontainer docker://ubuntu:latest
mkdir mycontainer/uufs
apptainer shell -w mycontainer
#... make the necessary installations/modifications and exit the container once completed
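# for example, in an Ubuntu based container you might install extra packages
# inside the writable shell (package names below are placeholders only):
#   apt-get update && apt-get install -y build-essential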
apptainer build my-new-container.sif mycontainer
Checking if a Container Already Exists
If desired, the container download and build can be automated with a shell script that we provide, called update-container-from-dockerhub.sh. This script can be run before the container is run to ensure that the latest container version is used, without unnecessary downloading if no newer version exists.
Note that updating your container version is not required and may lead to some deviation in the behavior of the software.
Below is an example for automatically checking if there is an updated workflows container from BioBakery.
# check if the container exists or is newer and pull if needed
/uufs/chpc.utah.edu/sys/installdir/singularity3/update-container-from-dockerhub.sh biobakery/workflows bioBakery.sif
# run a program from the container
singularity exec bioBakery.sif humann2 --help
Creating a Module File for a Downloaded Container
By building a custom module from the downloaded container, we can wrap the commands that would be run inside the container into the module, making the container easier to use. This lets us use the program within the container as if it were not in a container.
To do this, you will first create a spot in your home directory to house the module definition file (the .lua file) and copy a .lua template file to that new location:
mkdir -p $HOME/MyModules/my_new_container
cd $HOME/MyModules/my_new_container
cp /uufs/chpc.utah.edu/sys/modulefiles/templates/container-template.lua 1.0.0.lua
Then edit the new module file, 1.0.0.lua, to modify the container name, the command(s) to call from the container, and the module file metadata:
-- required path to the container sif file
local CONTAINER="/uufs/chpc.utah.edu/common/home/u0123456/containers/my_new_container.sif"
-- required text array of commands to alias from the container
local COMMANDS = {"command"}
-- if you need to export multiple commands, list them as follows (remove the leading '--' to uncomment the line):
-- local COMMANDS = {"command1","command2","command3"}
-- these optional lines provide more information about the program in this module file
whatis("Name : Program name")
whatis("Version : 1.0.0")
whatis("Category : Program's category")
whatis("URL : Program's URL")
whatis("Installed on : 10/05/2021")
whatis("Installed by : Your Name")
It can sometimes be difficult to know which commands need to be exported from the container. To find this information, there are two helpful places to look: 1) the program's user guide or help manual, which typically lists the commands that can be run; 2) the program's installation location inside the container - all needed commands will be inside its bin directory.
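For example, to peek at a likely bin directory inside the container (the path below is only a common location and may differ for your program):
singularity exec my_new_container.sif ls /usr/local/bin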
When we have the module file created, we can activate the user modules and then load the module:
module use $HOME/MyModules
module load my_new_container/1.0.0
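Once the module is loaded, the aliased command(s) run inside the container transparently, e.g. with the placeholder command name from the template above:
command --help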
Note: Some packages run programs through wrapper scripts, which ultimately call a specific program binary. When this happens, commands exported by the module may fail or behave unexpectedly because environment variables from the host leak into the container. To prevent this issue, edit the module file, locate the line that invokes the container, and add the -e (--cleanenv) flag to it. This ensures that environment variables from outside the container don't interfere with execution inside the container.
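For reference, the clean-environment flag looks like this when invoking the container directly from the shell; the module wraps a similar call:
singularity exec -e my_new_container.sif command --help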
Building Your Own Singularity Container
As of Apptainer version 1.2.5, one can build a container completely in user space, which means that container builds can be done on CHPC systems. Since large container builds require more CPU and memory resources, it is recommended to do the build in an interactive job. The basic steps are as follows:
salloc -N 1 -n 16 -A notchpeak-shared-short -p notchpeak-shared-short -t 2:00:00 --gres=gpu:1
module load apptainer
unset APPTAINER_BINDPATH
apptainer build --nv mycontainer.sif MyDefinitionFile
Breaking down the commands above: we first ask Slurm for an interactive job session, optionally requesting a GPU as well. Then we load the Apptainer module and unset the module-preset APPTAINER_BINDPATH environment variable, which prevents the build process from erroring out due to a non-existent bind path. Next, we build the container based on the definition file called MyDefinitionFile. The --nv flag, which initializes GPU support during the build process, is optional and is only needed for GPU programs to be set up correctly.
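For completeness, below is a minimal definition file sketch; its contents are illustrative only and not a CHPC-provided file:
Bootstrap: docker
From: ubuntu:22.04
%post
    # commands run once at build time inside the container (placeholder package)
    apt-get update && apt-get install -y python3
%runscript
    # default action when the container is run
    python3 --version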
Note that our apptainer module defines two environment variables:
- APPTAINER_SHELL=/bin/bash - this sets the container shell to bash (easier to use than default sh)
- APPTAINER_BINDPATH=/scratch,/uufs/chpc.utah.edu - this bind mounts all the /scratch file systems and the /uufs file systems (sys branch, group spaces) into the container.
If you prefer to use a different shell, or not bind the file servers, set these variables differently or unset them.
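For example, to bind only /scratch for the current shell session, or to drop the automatic bindings entirely:
export APPTAINER_BINDPATH=/scratch
or
unset APPTAINER_BINDPATH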
Full Example: Finding and running a Docker container
Frequently, the CHPC receives requests to install complex programs that may not even work on the OS of CHPC Linux machines. Before writing to CHPC, consider following the example below with your application.
A user wants to install a program called guppy, whose installation and use are described in a blog post. They also want guppy to run on a GPU, since it performs faster there. From the blog post we know the program's name, have a hint about who provides it, and know how to install it on Ubuntu Linux. After some web searching we find that the program is mainly available commercially, so it has no publicly available download section and there is likely no version compatible with CHPC's OS, Rocky Linux. That leaves us needing an Ubuntu based container.
Our first option is to build the container ourselves based on the instructions in the blog post, but we would need to either a) build with Docker or Singularity on a local machine where we have root access, or b) use a DockerHub automated build through a GitHub repository. This can be time consuming and cumbersome, so we leave it as a last resort.
We do some more web searching to see if guppy has a container. First, we search for guppy dockerhub and get many hits, but none for a version of guppy that supports GPUs (you can see this by looking at the Dockerfile - there is no mention of GPU in the base image or in what is being installed). Next, we search for "guppy gpu" dockerhub and find a promising container. We don't know yet whether this container supports the GPU, and since its Dockerfile is missing, we suspect it is hosted on GitHub. So we search for "guppy-gpu" github and find a repository which, based on its name and source, looks like a match to the DockerHub image. Examining the Dockerfile, we see that the container is based on nvidia/cuda 9.0, which means it is set up for a GPU. This looks hopeful, so we get the container and try to run it by following these steps:
$ module load singularity
$ singularity pull docker://aryeelab/guppy-gpu
$ singularity shell --nv guppy-gpu_latest.sif
$ nvidia-smi
...
| NVIDIA-SMI 418.67 Driver Version: 418.67 CUDA Version: 10.1
#... to check if the GPU works
$ guppy_basecaller --help
: Guppy Basecalling Software, (C) Oxford Nanopore Technologies, Limited.
Version 2.2.2
#... to check that the program is there.
Above, we loaded the Singularity module and used Singularity to pull the Docker container. This downloaded the Docker container image layers and created a Singularity container called guppy-gpu_latest.sif. Then we opened a shell in this container (using the --nv flag to bring the host GPU stack into the container) and tested GPU visibility with the nvidia-smi command, followed by running guppy_basecaller to verify that the guppy software exists within the container. With these positive outcomes, we can proceed to run the program with our data, which can be done outside of the container as follows:
$ singularity exec --nv guppy-gpu_latest.sif guppy_basecaller -i <fast5_dir> -o <output_folder> -c dna_r9.4.1_450bps -x "cuda:0"
As mentioned above, the singularity pull command creates a Singularity container based on a Docker container image. To guarantee that we always get the latest version, we can use the shell script described above, e.g.
$ /uufs/chpc.utah.edu/sys/installdir/singularity3/update-container-from-dockerhub.sh aryeelab/guppy-gpu guppy-gpu_latest.sif
$ singularity exec --nv guppy-gpu_latest.sif guppy_basecaller -i <fast5_dir> -o <output_folder> -c dna_r9.4.1_450bps -x "cuda:0"
To make this even easier to use, we can build an Lmod module that wraps the commands to be run in the container. First, create a user module directory for guppy, then copy our template into it:
mkdir -p $HOME/MyModules/guppy
cd $HOME/MyModules/guppy
cp /uufs/chpc.utah.edu/sys/modulefiles/templates/container-template.lua 3.2.2.lua
and edit the new module file, 3.2.2.lua, to modify the container name, the command(s) to call from the container, and the module file metadata:
-- required path to the container sif file
local CONTAINER="/uufs/chpc.utah.edu/common/home/u0123456/containers/guppy-gpu_latest.sif"
-- required text array of commands to alias from the container
local COMMANDS = {"guppy_basecaller"}
-- these optional lines provide more information about the program in this module file
whatis("Name : Guppy")
whatis("Version : 3.2.2")
whatis("Category : genomics")
whatis("URL : https://nanoporetech.com/nanopore-sequencing-data-analysis")
whatis("Installed on : 10/05/2021")
whatis("Installed by : Your Name")
When we have the module file created, we can activate the user modules and load the guppy module:
module use $HOME/MyModules
module load guppy/3.2.2
This way, we can use just the guppy_basecaller command to automatically run this program inside the container, instead of specifying it through the singularity exec command.
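For example, the full command from above now reduces to the following; the module handles the singularity exec wrapping behind the scenes:
guppy_basecaller -i <fast5_dir> -o <output_folder> -c dna_r9.4.1_450bps -x "cuda:0"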