Effective, rapid, reproducible use of Singularity

Singularity is a boon for anyone trying to integrate pieces from the exploding data science and machine learning toolset and run them on institutional High Performance Computing (HPC) clusters. Without it, or similar tooling, you might easily break your intricate web of dependencies and have to start from scratch. Without it, you might have to reinstall and recompile dependencies from scratch when moving between clusters, for example when moving from university-level to national-level infrastructure. Without Singularity, you may as well forget about others rebuilding your patchwork of dependencies: they would have to budget more time than they have to spare just to figure it all out. However, Singularity is not without its potential pitfalls. In this post, I set out some advice for how to use it effectively, along with ideas for how to combine it with other HPC tools like SLURM and Snakemake.

The three tips are:

  1. Use Dockerfile-based building infrastructure
  2. Use bind mounts to cut down on rebuilds
  3. Use monolithic containers

Use Dockerfile-based building infrastructure

You should build your Singularity images using a Dockerfile rather than a Singularity build definition file. You can use the docker CLI to build them locally, and any of a number of cloud services to build them as part of a continuous integration pipeline. The main reason to do this is that Dockerfiles are more or less a de facto standard. An immediate fringe benefit of using Dockerfiles for building is that people may also be able to run your container under Docker.
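As a minimal sketch of what this workflow can look like (the image name and registry below are hypothetical placeholders; substitute your own), you build with Docker locally or in CI, push to a registry, and then fetch the image on the cluster with Singularity:

# Build the image locally (or let your CI service do this on push)
docker build -t myuser/myproj:latest .

# On the cluster, pull the image from a registry and convert it to a .sif
singularity pull mycontainer.sif docker://ghcr.io/myuser/myproj:latest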

Docker-compatible tooling is everywhere: support for Dockerfiles goes well beyond the open source Docker tooling and the Docker company itself. While DockerHub has begun to limit its free offering, large players such as GitHub, GitLab and OpenShift are still offering free container building and hosting. The Docker building infrastructure ecosystem is not limited to free offerings funded by big tech companies. For example, the national computing infrastructure in Finland, CSC, has an OpenShift/Kubernetes based service, branded as Rahti, which includes continuous integration and container building. Tutorials about how to use these services and Docker itself are plentiful, as are base images which you can build on. For comparison, for Singularity definition files, there is only the comparatively limited Singularity Hub and a small smattering of coverage outside the Singularity documentation and bug tracker.

One wrinkle with this approach is that it may at first seem that some convenient features of the Singularity build definition file are not available in Docker. One notable example is the rather convenient %environment section, which can be used to do all sorts of initialisation for the container; Peter Uhrig's OpenPose container makes interesting use of it. However, Docker instead provides the ENTRYPOINT directive, which can be used to emulate the %environment section. First we copy an entrypoint script into our container:

COPY docker/entrypoint.sh /.entrypoint.sh

And then put it as the entrypoint for the Dockerfile:

ENTRYPOINT ["/bin/bash", "/.entrypoint.sh"]

Inside entrypoint.sh we can modify our environment as we wish before executing whatever command is passed to the container to execute:

#!/usr/bin/env bash

# << YOUR STARTUP CODE HERE >>
exec "$@"
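For instance, a filled-in entrypoint.sh might look like the following sketch, where the exported paths are purely hypothetical placeholders for whatever initialisation your software needs:

#!/usr/bin/env bash

# Hypothetical startup code: set up whatever environment the application expects
export PATH="/opt/myproj/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda/lib64:${LD_LIBRARY_PATH:-}"

# Hand over to whatever command was passed to the container
exec "$@"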

You can see this in action in my own reworking of Peter's container into a Dockerfile.

If all this still sounds a bit odd -- after all, we're running with Singularity, so we should build with Singularity, right? -- note that I'm not alone in coming to this conclusion. For example, Aalto's computing guide gives the same advice.

Use bind mounts to cut down on rebuilds

Now we've got our Singularity container, every time we make a change we have to rebuild the container, right? Not so! Ideally, there should be as little testing in the cluster as possible. This means testing locally on your own machine, usually against small development data, for as long as possible. However, there come times when you have to repeatedly fix problems that occur only in the cluster environment. For example, you might not have access to CUDA locally. Now that we're using Singularity, do we have to wait for our container to rebuild and finish pulling before we can test each fix? That would take forever! Luckily, this is not usually the case.

Consider the case of a simple Dockerized Python project. Our project directory might look like this:

/
/requirements.txt
/workflow/
/workflow/Snakefile
/mymodule/
/mymodule/__init__.py
/mymodule/code.py

And our minimal Dockerfile might look like this:

FROM python:3.9.2-slim

WORKDIR /opt/myproj
COPY requirements.txt /opt/myproj/
RUN pip install --no-cache-dir -r requirements.txt
COPY . /opt/myproj
RUN echo "/opt/myproj" > /usr/local/lib/python3.9/site-packages/myproj.pth
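To turn this Dockerfile into the mycontainer.sif used below, one option (a sketch, assuming both Docker and Singularity are installed on your local machine) is to build with Docker and convert the image straight from the local Docker daemon:

# Build with Docker, then convert the local image into a Singularity image file
docker build -t myproj:latest .
singularity build mycontainer.sif docker-daemon://myproj:latest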

Normally we run our Singularity container like so:

singularity run mycontainer.sif python -m mymodule

The main idea is that when we update our code.py, we can just overwrite the file within our Singularity container by binding over it. We can, however, go further and bind over all the files we are working on. In this situation, your working directory on the cluster (CSC, in my case) is laid out like so:

/
/run.sh
/mycontainer.sif
/binds/

Then, we simply modify our run.sh to --bind in our code directories:

singularity run --bind binds/mymodule:/opt/myproj/mymodule,binds/workflow:/opt/myproj/workflow mycontainer.sif python -m mymodule

Then, on our development machine, we can create a sync.sh script which either runs rsync, which synchronises our files once, or lsyncd, which synchronises them continually. For example, it could contain:

rsync -a mymodule workflow hpc.myuni.example:/path/to/workdir/binds/

Or, since lsyncd's command-line mode syncs a single directory per invocation, e.g.,

lsyncd -rsyncssh mymodule hpc.myuni.example /path/to/workdir/binds/mymodule
lsyncd -rsyncssh workflow hpc.myuni.example /path/to/workdir/binds/workflow

While we're on the subject of testing in the cluster environment, if your SLURM cluster offers a testing partition, make use of it! Jobs will queue faster there, so you get some kind of sanity check on your code sooner.
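For example, you might submit a quick check to such a partition like this (the partition name and time limit are placeholders; check your cluster's documentation for the real ones):

sbatch --partition=test --time=00:15:00 run.sh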

Use monolithic containers

So far the advice in this post has been fairly pedestrian and broadly applicable. This last point may be a bit more contentious and less broadly applicable.

The main idea is this: part of the reason you containerised your pipeline code is so that it's reproducible. For a data analysis and processing pipeline, the easiest pipeline to reproduce is one where the whole thing runs or reruns with a single command. Snakemake allows data analysis pipelines to be specified declaratively as a number of steps structured as a dependency tree of input and output files, similar to the UNIX tool make. HPC Carpentry gives a nice tutorial introduction to Snakemake. It also allows a certain amount of decoupling of the specification of the computational resources a job needs in the SLURM cluster from the job steps themselves.

Conversely, in terms of distributing executable code, the easiest way to receive it for the purposes of reproduction is either as a single blob containing all dependencies or as a single recipe to build said blob. Thus we should use a monolithic container for our whole analysis.

This is in contrast to the Singularity support Snakemake offers out of the box, which wraps individual analysis pipeline stages in Singularity containers, essentially working at a more granular level. This existing approach is probably better in cases where multiple containers must be used due to version conflicts in software, or where individual containers for pipeline steps have already been created.
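For reference, that built-in, per-rule support is typically enabled by giving rules a container directive and passing a flag on the command line, along the lines of the following (a sketch; the exact flag names have varied a little between Snakemake versions):

snakemake --use-singularity --cores 4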

I have implemented a wrapper called singslurm2 which enables this monolithic container approach. To get started, you can perform a recursive clone into your home directory:

$ cd ~
$ git clone --recursive https://github.com/frankier/singslurm2.git

Taking the same example as given above, we can run our Snakefile like so:

#!/bin/bash

SIF_PATH=/path/to/mycontainer.sif \
SNAKEFILE=/opt/myproj/workflow/Snakefile \
CLUSC_CONF=`pwd`/clusc.json \
SING_EXTRA_ARGS="--bind binds/mymodule:/opt/myproj/mymodule,binds/workflow:/opt/myproj/workflow" \
NUM_JOBS=64 \
~/singslurm2/run.sh all

What this says is to run the Snakemake workflow at /opt/myproj/workflow/Snakefile in the container /path/to/mycontainer.sif. The same binds as given in the example above are now passed using SING_EXTRA_ARGS, while NUM_JOBS gives the maximum number of parallel jobs. The file clusc.json is a JSON file defining which resources different steps should ask for, in [exactly the same format as used by the SLURM Snakemake profile](https://github.com/Snakemake-Profiles/slurm), which is used behind the scenes. For example, we might have:

{
    "__default__": {
        "mail-user" : "frankie@robertson.name",
        "time" : "05:00:00",
        "nodes" : 1,
    },
    "train": {
        "partition": "gpu",
        "time" : "01:00:00",
        "cpus-per-task": 4,
        "mem": "32G",
        "gres": "gpu:v100:1"
    }
}

What's happening behind the scenes? Inside the container, a Snakemake profile is loaded which sends requests to run various SLURM commands through the filesystem. Outside the container, a small server written in bash listens for these and executes them.
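As a toy illustration of the general idea only (this is not the actual singslurm2 implementation, and the directory name is made up), the process on the outside could be as simple as a loop that picks up request files written from inside the container and runs them:

#!/bin/bash
# Watch a shared directory for request files written from inside the container,
# run each one (e.g. an sbatch invocation), and record the output next to it.
REQ_DIR=/path/to/workdir/requests

while true; do
    for req in "$REQ_DIR"/*.cmd; do
        [ -e "$req" ] || continue
        bash "$req" > "${req%.cmd}.out" 2>&1
        rm "$req"
    done
    sleep 1
done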

One situation in which this approach isn't really applicable is when we are not running a static analysis pipeline. One common situation in machine learning which does not fit into this framework is hyperparameter optimisation. I have begun experimenting with a similar approach which works analogously to the above: it packages the control process and worker processes inside the same Singularity container. In this case the control process is made up of the Ray distributed computing and hyperparameter tuning framework together with the yaspi SLURM wrapper, while the worker processes are Ray workers.

The key advantage here is that we get to take control of our outermost loops by containerising them. This potentially means fewer steps to reproduce. I look forward to seeing whether others find this approach useful. I would be happy to hear any comments at firstname@secondname.name.