HPC Containerization with Apptainer and Singularity: Complete Guide
Complete guide to HPC containerization with Apptainer/Singularity: installation, building SIF images from Docker Hub and definition files, running with SLURM, MPI integration, and common errors.
Scientific software is notoriously hard to install and reproduce. A computational biology pipeline that runs on one cluster may fail on another because of conflicting library versions, missing dependencies, or incompatible compiler flags. Containers solve this problem by packaging the entire software environment — application, libraries, and configuration — into a single portable artifact.
Singularity to Apptainer: A Brief History
Singularity was developed at Lawrence Berkeley National Laboratory starting in 2015 specifically for HPC use. In 2021, the open-source project joined the Linux Foundation and was renamed Apptainer. The two names are still often used interchangeably, but they now refer to separate code bases: Apptainer is the Linux Foundation continuation of the original project, while SingularityCE (Community Edition) is a fork maintained by Sylabs.
For new HPC deployments, use Apptainer. The singularity command is available as an alias in Apptainer for backward compatibility.
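Because either command name may be present on a given cluster, a job script can probe for the runtime instead of hard-coding it. The following is a minimal sketch; the helper name first_available and the variable CONTAINER_RUNTIME are illustrative choices, not part of Apptainer.

```shell
# Print the first command from the argument list that exists on PATH.
first_available() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Prefer apptainer and fall back to singularity (variable name is hypothetical).
CONTAINER_RUNTIME=$(first_available apptainer singularity) \
    || echo "no container runtime found on PATH" >&2
echo "runtime: ${CONTAINER_RUNTIME:-none}"
```

Later commands in the script can then call "$CONTAINER_RUNTIME" exec ... regardless of which name the site installed.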
Installation
# RHEL 8/9, Rocky Linux, AlmaLinux (Apptainer is packaged in EPEL)
sudo dnf install -y epel-release
sudo dnf install -y apptainer
# Ubuntu 22.04
sudo add-apt-repository -y ppa:apptainer/ppa
sudo apt-get update
sudo apt-get install -y apptainer
# Compile from source (for systems without package support; requires Go and a C toolchain)
export VERSION=1.3.0
wget "https://github.com/apptainer/apptainer/releases/download/v${VERSION}/apptainer-${VERSION}.tar.gz"
tar -xzf "apptainer-${VERSION}.tar.gz"
cd "apptainer-${VERSION}"
./mconfig && make -C builddir && sudo make -C builddir install
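Recent Apptainer releases install without setuid by default and rely on unprivileged user namespaces, so a quick post-install check on each node can save debugging time. This sketch assumes the kernel exposes the limit at /proc/sys/user/max_user_namespaces (some distributions use a different sysctl); the function name userns_enabled is illustrative.

```shell
# Return success if unprivileged user namespaces appear to be enabled.
# The optional argument overrides the sysctl file path (useful for testing).
userns_enabled() {
    limit_file="${1:-/proc/sys/user/max_user_namespaces}"
    [ -r "$limit_file" ] || return 1
    limit=$(cat "$limit_file")
    [ "${limit:-0}" -gt 0 ] 2>/dev/null
}

if userns_enabled; then
    echo "user namespaces enabled: unprivileged Apptainer should work"
else
    echo "user namespaces unavailable: the setuid build may be required"
fi
```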
Building SIF Images
Method 1: Pull from Docker Hub
# Pull Ubuntu 22.04 from Docker Hub, convert to SIF
apptainer pull ubuntu-22.04.sif docker://ubuntu:22.04
# Pull PyTorch with CUDA support
apptainer pull pytorch-2.2-cuda12.1.sif docker://pytorch/pytorch:2.2.0-cuda12.1-cudnn8-runtime
# Inspect the image
apptainer inspect ubuntu-22.04.sif
apptainer inspect --labels pytorch-2.2-cuda12.1.sif
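When the output file name is omitted, apptainer pull names the file after the image and tag (ubuntu_22.04.sif for docker://ubuntu:22.04). A pipeline that caches images can reproduce that name to decide whether a pull is needed. The helper below is a sketch: sif_name_for is a hypothetical function, it does not handle digest references (@sha256:...), and the pull itself is only echoed.

```shell
# Derive the default SIF file name (name_tag.sif) from a container URI.
sif_name_for() {
    ref="${1#*://}"        # strip the docker:// scheme if present
    base="${ref##*/}"      # drop registry and organisation components
    name="${base%%:*}"     # image name without the tag
    case "$base" in
        *:*) tag="${base#*:}" ;;
        *)   tag="latest" ;;
    esac
    printf '%s_%s.sif\n' "$name" "$tag"
}

uri="docker://pytorch/pytorch:2.2.0-cuda12.1-cudnn8-runtime"
image="$(sif_name_for "$uri")"
if [ -f "$image" ]; then
    echo "reusing $image"
else
    # Dry run: replace echo with the real command once the name looks right.
    echo "would run: apptainer pull $image $uri"
fi
```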
Method 2: Definition Files
Definition files provide full control over the build process and are essential for reproducible scientific environments.
# openfoam.def: OpenFOAM CFD container
Bootstrap: docker
From: ubuntu:22.04
%labels
    Version OpenFOAM-v2312
    Maintainer HPC Team
%environment
    # The ESI (openfoam.com) packages install under /usr/lib/openfoam
    . /usr/lib/openfoam/openfoam2312/etc/bashrc
    export WM_NCOMPPROCS=8
%post
    apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y \
        wget curl software-properties-common
    # Add ESI OpenFOAM repository
    wget -q -O - https://dl.openfoam.com/add-debian-repo.sh | bash
    apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y openfoam2312-default
    # Verify the installation (%post runs under /bin/sh, which is dash on Ubuntu,
    # so use "." rather than the bash-only "source")
    . /usr/lib/openfoam/openfoam2312/etc/bashrc
    foamInstallationTest
    # Clean apt cache
    apt-get clean && rm -rf /var/lib/apt/lists/*
%runscript
    . /usr/lib/openfoam/openfoam2312/etc/bashrc
    exec "$@"
# Build (as root or with fakeroot)
sudo apptainer build openfoam.sif openfoam.def
# or
apptainer build --fakeroot openfoam.sif openfoam.def
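A build wrapper can pick between the two invocations based on who is running it. This is a small sketch that only prints the command it would run; build_command is a hypothetical helper, and whether --fakeroot works for ordinary users depends on how the site configured Apptainer.

```shell
# Print the build command appropriate for the given numeric user ID.
# Usage: build_command <uid> <output.sif> <definition.def>
build_command() {
    uid="$1"
    shift
    if [ "$uid" -eq 0 ]; then
        echo "apptainer build $*"
    else
        echo "apptainer build --fakeroot $*"
    fi
}

# Show the command for the current user.
build_command "$(id -u)" openfoam.sif openfoam.def
```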
Method 3: Interactive Sandbox
# Create writable sandbox directory (for development/debugging)
apptainer build --sandbox ubuntu-sandbox/ docker://ubuntu:22.04
# Enter sandbox with write permission
apptainer shell --writable ubuntu-sandbox/
# Inside sandbox: install software interactively
Apptainer> apt-get update && apt-get install -y python3-pip
Apptainer> pip3 install numpy scipy matplotlib
Apptainer> exit
# Convert finished sandbox to read-only SIF
apptainer build my-python-env.sif ubuntu-sandbox/
Running Containers
# Run a single command inside a container (the stock ubuntu:22.04 image ships without Python, so use the image built from the sandbox)
apptainer exec my-python-env.sif python3 --version
# Start an interactive shell
apptainer shell ubuntu-22.04.sif
# Run the container's default runscript
apptainer run myapp.sif --input data.txt --output result.txt
# GPU access (NVIDIA)
apptainer exec --nv pytorch-2.2-cuda12.1.sif python3 -c "import torch; print(torch.cuda.is_available())"
# Bind host directories
apptainer exec \
--bind /mnt/scratch/myproject:/data \
--bind /mnt/software/models:/models:ro \
myapp.sif python3 /data/run.py
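Bind paths can also be supplied through the APPTAINER_BIND environment variable as a comma-separated list, which keeps long job scripts readable. The sketch below assembles that list from the same bind specifications used above; join_binds is an illustrative helper name.

```shell
# Join bind specifications (src:dest[:opts]) into a comma-separated list.
join_binds() {
    ( IFS=,; printf '%s\n' "$*" )
}

APPTAINER_BIND=$(join_binds \
    /mnt/scratch/myproject:/data \
    /mnt/software/models:/models:ro)
export APPTAINER_BIND
echo "APPTAINER_BIND=$APPTAINER_BIND"

# With the variable exported, the binds apply without --bind flags:
# apptainer exec myapp.sif python3 /data/run.py
```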
SLURM Integration
Apptainer containers run transparently under SLURM without any special plugin. The SLURM environment variables ($SLURM_JOB_ID, $SLURM_NTASKS, etc.) are visible inside the container.
#!/bin/bash
#SBATCH --job-name=ml-training
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1
#SBATCH --gres=gpu:4
#SBATCH --cpus-per-task=32
#SBATCH --mem=256G
#SBATCH --time=12:00:00
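# Run PyTorch distributed training in containers across 2 nodes.
# Resolve the first node of the allocation once, in the batch shell.
MASTER_ADDR=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)
export MASTER_ADDR
# Single quotes defer expansion of $SLURM_NODEID to each srun task;
# expanded in the batch shell it would be 0 on every node.
srun --mpi=pmix \
    apptainer exec --nv \
    --bind "$SLURM_SUBMIT_DIR":/workspace \
    /shared/containers/pytorch-2.2-cuda12.1.sif \
    bash -c 'python3 /workspace/train.py \
        --nnodes "$SLURM_NNODES" \
        --node_rank "$SLURM_NODEID" \
        --nproc_per_node 4 \
        --master_addr "$MASTER_ADDR"'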
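Since the SLURM variables are visible inside the container, a small wrapper executed in the container can size thread pools from the allocation. The sketch below divides SLURM_CPUS_PER_TASK among the worker processes started on a node; threads_per_worker is a hypothetical helper, and the even split is an assumption that may not suit every workload.

```shell
# Print the number of CPU threads each worker should use, given the
# number of workers per task. Falls back to 1 outside a SLURM job.
threads_per_worker() {
    workers="$1"
    cpus="${SLURM_CPUS_PER_TASK:-1}"
    threads=$(( cpus / workers ))
    [ "$threads" -ge 1 ] || threads=1
    printf '%s\n' "$threads"
}

# Four workers per node, matching --nproc_per_node 4 in the job script.
OMP_NUM_THREADS=$(threads_per_worker 4)
export OMP_NUM_THREADS
echo "node=${SLURM_NODEID:-0} OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

With --cpus-per-task=32 and four workers, each worker would get 8 threads.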
Multi-Node MPI Jobs
For MPI jobs that span multiple nodes, the MPI stack inside the container must be compatible with the host MPI:
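Approach 1: Host-launched MPI (recommended for most cases)
The host mpirun launches the ranks, and each rank runs inside its own container instance. Apptainer's documentation distinguishes two variants: in the hybrid model the container ships its own MPI build that is compatible with the host's, and in the bind model the container uses the host MPI libraries via bind mounts. The example below uses the bind model: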
mpirun -np 256 -hostfile $PBS_NODEFILE \
apptainer exec \
--bind /usr/lib/x86_64-linux-gnu/openmpi:/usr/lib/x86_64-linux-gnu/openmpi \
gromacs.sif gmx_mpi mdrun -v -deffnm production
Approach 2: PMI2/PMIx (for SLURM clusters)
# srun launches the ranks directly through PMIx; list the types your SLURM build supports with: srun --mpi=list
srun --mpi=pmix_v4 apptainer exec gromacs.sif gmx_mpi mdrun -v -deffnm production
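The compatibility requirement can be checked before submitting a large job by comparing the version strings reported on the host and in the container. The following sketch parses sample Open MPI output; mpi_version_of and same_series are illustrative names, and treating a matching major.minor series as compatible is an assumed heuristic, not a guarantee.

```shell
# Read `mpirun --version` output on stdin and print the first version number.
mpi_version_of() {
    grep -Eo '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# Succeed if both versions share the same major.minor series.
same_series() {
    [ "$(printf '%s\n' "$1" | cut -d. -f1,2)" = "$(printf '%s\n' "$2" | cut -d. -f1,2)" ]
}

# Sample strings stand in for the real commands:
#   mpirun --version
#   apptainer exec gromacs.sif mpirun --version
host_version=$(printf 'mpirun (Open MPI) 4.1.2\n' | mpi_version_of)
container_version=$(printf 'mpirun (Open MPI) 4.1.6\n' | mpi_version_of)

if same_series "$host_version" "$container_version"; then
    echo "MPI series match: host $host_version, container $container_version"
else
    echo "MPI series differ: host $host_version, container $container_version" >&2
fi
```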
Common Errors
“Permission denied” when creating SIF files on scratch: Some parallel filesystems do not support the filesystem features Apptainer requires for building. Set APPTAINER_TMPDIR to local NVMe scratch and build there.
“FATAL: container creation failed: mount /proc/self/fd: no such file or directory”: The kernel is too old. Apptainer 1.x requires Linux kernel 3.18+. On older RHEL 7 systems, use Singularity CE 3.x instead.
MPI processes can’t find each other across nodes: Ensure the container’s MPI implementation and version match the host’s. Run apptainer exec container.sif mpirun --version and compare the output with mpirun --version on the host.
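CUDA error: no kernel image is available for execution on the device: The GPU code in the container was not compiled for the compute capability of the host GPU, which typically happens when a framework build predates the GPU architecture. Use an image whose CUDA and framework builds target the cluster’s GPUs. A related failure, “CUDA driver version is insufficient for CUDA runtime version”, means the CUDA toolkit inside the container is newer than the host driver supports; check the highest CUDA version the driver supports with nvidia-smi | grep CUDA and build containers against a compatible version.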
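The driver check with nvidia-smi can be scripted so that a job fails early when the container's CUDA toolkit is newer than the driver supports. This is a sketch under stated assumptions: the helper names are illustrative, the header line is a sample of typical nvidia-smi output, the container version 12.1 is taken from the image tag used earlier, and sort -V requires GNU coreutils or a compatible sort.

```shell
# Read nvidia-smi output on stdin and print the driver's maximum CUDA version.
driver_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Succeed if version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Sample header line; on a GPU node use: nvidia-smi | driver_cuda_version
sample='| NVIDIA-SMI 535.104.05   Driver Version: 535.104.05   CUDA Version: 12.2     |'
driver_max=$(printf '%s\n' "$sample" | driver_cuda_version)
container_cuda="12.1"   # assumed from the image tag pytorch-2.2-cuda12.1

if version_ge "$driver_max" "$container_cuda"; then
    echo "driver supports CUDA $driver_max, container needs $container_cuda: ok"
else
    echo "driver supports only CUDA $driver_max, container needs $container_cuda" >&2
fi
```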
Containers are one of the most impactful investments an HPC site can make for software reproducibility and user productivity. For Apptainer deployment strategy, image management infrastructure, and SLURM integration, contact the Mevasis team.