Containers Introduction

Virtualization vs Containerization vs Modules vs Conda

2025-10-30

Laurent Jourdren

What is a container?

Operating system–level virtualization (container) is a virtualization method where the kernel of an operating system allows for multiple isolated user space instances, instead of just one.

Source: Wikipedia
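On Linux, these isolated user-space instances are built on kernel namespaces. As an illustrative sketch not taken from the slides (and assuming unprivileged user namespaces are enabled on your kernel), the unshare tool creates new namespaces directly:

```shell
# Run ps in fresh user, PID and mount namespaces.
# --map-root-user maps the current user to root inside the new user namespace;
# --fork --mount-proc give the child its own /proc, so ps only sees
# processes belonging to the new PID namespace.
unshare --user --map-root-user --pid --fork --mount-proc ps aux
```

Inside the new PID namespace, ps reports only itself, running as PID 1. Container engines such as Docker build on exactly this mechanism, adding image management, networking, and resource limits on top.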

Kernel mode vs user mode

With an oversimplification, an operating system can be divided into two parts:

  • Kernel mode, which provides an abstraction of the hardware and handles process and memory management
  • User mode, where applications run

Kernel Layout Source: Wikipedia
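User-mode applications cross into kernel mode through system calls. A small illustration (assuming a Linux host; not part of the original slides) showing the same kernel information obtained via a system call and via the /proc pseudo-filesystem:

```shell
# uname retrieves the kernel release via the uname() system call
uname -r
# the kernel exposes the same value as a file under /proc
cat /proc/sys/kernel/osrelease
```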

Hardware virtualization vs OS virtualization

Virtualization Image

Examples:

  • VMware (1999)
  • VirtualBox
  • KVM (Linux)
  • Hyper-V (Windows, WSL2)
  • Hypervisor and Virtualization frameworks (macOS)

Containerization Image

Examples:

  • chroot (1982)
  • FreeBSD Jail (2000)
  • Linux LXC (2008)
  • Docker (2013)
  • Singularity (2015)
  • Podman (2018)
  • Apptainer (2021)

Container Pro/Con

Pros

  • Can mix Linux distributions and versions (e.g. Ubuntu and CentOS)
  • OS virtualization is faster (little or no overhead) than hardware virtualization
  • No boot time with OS virtualization
  • Can run a container inside a virtual machine

Cons

  • Containers must run the same kernel as the host; OSes cannot be mixed (e.g. Windows and Linux)
  • No RAM snapshot

Docker

In the next slides, we will see how containerization works using Docker, which defined how modern containerization works when it was released in 2013.

Docker logo

Open Source:

  • Docker CLI
  • Docker Compose
  • containerd
  • runC

Closed Source/Freemium/Paid subscription:

  • Docker Desktop
  • Docker Hub images storage
  • Docker Hub advanced features (Pro, Team, and Business plans)

Same kernel on “host” and in the container

Information about the host:

$ cat /etc/os-release | grep '^VERSION='
VERSION="24.04.3 LTS (Noble Numbat)"
$ uname -rv
6.14.0-29-generic #29~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Aug 14 16:52:50 UTC 2

Information about a CentOS 7 container:

$ docker run centos:7 cat /etc/os-release | grep -e '^VERSION='
VERSION="7 (Core)"
$ docker run centos:7 uname -rv
6.14.0-29-generic #29~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Aug 14 16:52:50 UTC 2

Container cannot work on another OS/arch

Attempt to run a Windows container on Linux:

$ docker run mcr.microsoft.com/windows:ltsc2019
Unable to find image 'mcr.microsoft.com/windows:ltsc2019' locally
ltsc2019: Pulling from windows
docker: no matching manifest for linux/amd64 in the manifest list entries.
See 'docker run --help'.

Attempt to run an ARM64 Linux container on x86-64:

$ docker run arm64v8/ubuntu:24.04
WARNING: The requested image's platform (linux/arm64/v8) does not match the detected host
platform (linux/amd64/v3) and no specific platform was requested
exec /bin/bash: exec format error

But wait, can I still use my favorite Linux images on my new MacBook? 🤔

When using Docker Desktop on a Mac (or Windows), you automatically:

  • Launch a Linux VM on your Mac
  • Configure access to the filesystem and network of the Linux VM
  • Let macOS translate x86-64 instructions into ARM64 (≈ emulation)

⚠️ The macOS x86-64 to ARM64 translation layer (Rosetta) will be removed from macOS with version 28 in 2027.

Image vs container

  • In an object-oriented programming analogy:
    • Image ≈ Class
    • Container ≈ Object
  • Modifications made in a running container are lost when the container ends
$ docker run ubuntu:24.04 touch /toto
$ docker run ubuntu:24.04 ls /toto
ls: cannot access '/toto': No such file or directory

→ Container images are immutable.

Launching a Docker container (1)

  • Launch a command in a container with the docker run command
$ docker run ubuntu:24.04 ps -aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  8.3  0.0   7808  3920 ?        Rs   12:35   0:00 ps -aux
  • Launch Bash in interactive mode (-t -i options)
$ docker run -t -i ubuntu:24.04 bash
root@c8e01ed654df:/# ps -aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.1  0.0   4560  3724 pts/0    Ss   09:31   0:00 bash
root           8 33.3  0.0   7808  3920 pts/0    R+   09:31   0:00 ps -aux
  • Mount the host /tmp directory on /root in the container (-v option)
$ docker run -t -i -v /tmp:/root ubuntu:24.04 bash
  • Run as a non-root user in the container (-u option)
$ docker run -t -i -v /tmp:/root -u 1001:1001 ubuntu:24.04 bash

Launching a Docker container (2)

  • Launch bash as the current user in the current directory (-w option)
$ docker run -ti \
         --rm \
         -v "$(readlink -f .):$(readlink -f .)" \
         -w "$(readlink -f .)" \
         -u $(id -u):$(id -g) \
         ubuntu:24.04 \
         bash

Docker image repositories (1)

Docker Hub screenshot

Container images (e.g. Docker images) can be stored in remote repositories.

The main Docker repository is hosted by Docker Inc. (Docker Hub https://hub.docker.com).

Docker image repositories (2)

However, there are many alternatives to Docker Hub:

  • GitHub/GitLab container registry (Authentication is required)
  • Amazon Elastic Container Registry
  • Red Hat Quay
  • Biocontainers (https://biocontainers.pro), free, uses Red Hat Quay for its infrastructure
  • Private repositories (self-hosted, easy to deploy as it is just a Docker container to start)
    • Launching a private Docker repository
    $ docker run -d -p 5000:5000 --restart=always --name my-registry registry:2

Creating an image from an existing container

  • Create a container and change the content of the filesystem
$ docker run -ti ubuntu:24.04 bash
root@4c15d4118b4f:/# echo "Hello world!" > /hello.txt
root@4c15d4118b4f:/# exit
exit
  • Show the changes on the filesystem
$ docker diff 4c15d4118b4f
A /hello.txt
C /root
A /root/.bash_history
  • Commit the changes to a new Docker image
$ docker commit 4c15d4118b4f wf4bioinfo/bibon:0.0.1
sha256:4b682bf8155fdc51466ed6eec9a000511502016eed7115d68e321e1f7b8c51be

Push the created image on a repository

  • Push the new image on the Docker Hub (requires an account on Docker Hub)
$ docker push wf4bioinfo/bibon:0.0.1
The push refers to repository [docker.io/wf4bioinfo/bibon]
b095a460bd9e: Layer already exists
5f94273e386b: Layer already exists
0.0.1: digest: sha256:822ce5c0fc67586ec203daab7b9010a2d135b8a617355eb69e9d78c334f3e784 size: 736
  • Push the new image on an alternate repository
$ docker tag wf4bioinfo/bibon:0.0.1 mydockerrepo.org:5000/wf4bioinfo/bibon:0.0.1
$ docker push mydockerrepo.org:5000/wf4bioinfo/bibon:0.0.1

Docker image layers

A Docker image is composed of layers of commits

Source: Ridwan Shariffdeen

Dockerfile

  • Create a “Dockerfile” file in an empty directory:
FROM ubuntu:24.04

ARG VERSION=0.3.33
RUN apt update && \
    DEBIAN_FRONTEND=noninteractive apt install --yes \
                    python3 \
                    python3-pip && \
    apt clean && \
    rm -rf /var/lib/apt/lists/*
# Ubuntu 24.04 marks the system Python as externally managed (PEP 668)
RUN pip3 install --break-system-packages "pod5==$VERSION"
  • Build the image
$ docker build -t wf4bioinfo/pod5:0.3.33 .
  • Push the new image on the Docker hub
$ docker push wf4bioinfo/pod5:0.3.33

Docker advanced options

  • --name define a name for the container
  • --rm automatically remove the container when it exits
  • -e define an environment variable in container
  • -p define a port mapping between host and container
$ docker run --rm \
           --name postgresql \
           -e POSTGRES_PASSWORD=pgpassword \
           -e POSTGRES_USER=pguser \
           -e POSTGRES_DB=my_db \
           -p 5432:5432 \
           postgres:16
  • --gpus allow usage of GPUs by the container
$ docker run --rm --gpus all nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu18.04 nvidia-smi
GPU 0: NVIDIA RTX A6000 (UUID: GPU-0f985e54-0059-e6eb-20ba-8920319a4f24)
GPU 1: NVIDIA RTX A6000 (UUID: GPU-c3118007-b9fc-2d85-997a-92cd00f18cca)

Docker useful commands

  • list all containers (even if they are stopped)
$ docker ps -a
  • List images on the system
$ docker images
  • Remove a container
$ docker rm 4c15d4118b4f
  • Remove an image
$ docker rmi postgres:16
  • top like command for Docker containers
$ docker stats

Docker vs Podman

Podman is a free-software alternative to Docker made by Red Hat. It is usually faster at startup than Docker because it does not use a daemon.

One of Podman’s greatest advantages is its CLI compatibility with Docker.

$ alias docker=podman

You can type docker pull, docker run… and it will launch Podman instead of Docker.

Podman can work in rootless mode to be more secure (Docker can now also work in rootless mode). Podman and Docker use the same underlying runtime (runc) to execute containers.

You need to start a Podman service if you want to use the Docker API, which is used by many programming-language libraries.

Docker vs Podman

Source: kevsrobots.com

Docker vs Singularity/Apptainer (1)

What are the differences between Singularity and Apptainer?

  • SingularityPro: commercial software by Sylabs.
  • SingularityCE: open source Singularity supported by Sylabs.
  • Apptainer: open source Singularity, renamed in 2021 and hosted by the Linux Foundation.

Singularity (and Apptainer) was developed to bring containers and reproducibility to the high-performance computing (HPC) world, as Docker required root privileges to start a container. The other features of Singularity/Apptainer are:

  • Easier integration with resource managers (like SLURM…) as it runs as a regular application.
  • Single-file based container images (no images and container storage in /var/lib/docker).
  • Preserves the permissions in the environment.
  • Less startup overhead than with Docker/Podman (no daemon).
Apptainer logo

Docker vs Singularity/Apptainer (2)

  • Executing the cowsay command using Apptainer
$ apptainer exec lolcow_latest.sif cowsay moo
 _____
< moo >
 -----
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
  • Launch a container in interactive mode
$ apptainer shell /tmp/Debian.sif
Apptainer/Debian.sif>
  • Use the --bind option to mount a directory into the container
$ apptainer exec --bind /data:/mnt my_container.sif ls /mnt

Docker vs Singularity/Apptainer (3)

  • Run a Docker image using Apptainer.
$ apptainer run --containall docker://alpine

Note: The --containall option isolates the container from the host. It is very useful with Conda containers, to avoid using the host’s Conda installation inside the container.

Environment Module (1)

Environment Modules (or Modules) provides a mechanism for managing and switching between sets of environment variable settings (like $PATH, $MANPATH, etc.), often used to configure different software packages, compilers, and libraries. Modules is not a container software.

Modules can be loaded and unloaded dynamically and atomically, in a clean fashion. This tool is typically used in the high-performance computing (HPC) world, for example on the IFB-core cluster.

The disadvantage of Environment Modules is that modules cannot easily be shared between clusters: reproducibility is not guaranteed from one cluster to another.

# Load GCC module
$ module load gcc/12.4.0
$ which gcc
/usr/local/gcc/12.4.0/linux-x86_64/bin/gcc

# Switch to GCC 14
$ module switch gcc/14
$ which gcc
/usr/local/gcc/14.2.0/linux-x86_64/bin/gcc

# Unload GCC module
$ module unload gcc
$ which gcc
gcc not found
Modules logo

Environment Module (2)

# List loaded modules
$ module list
 Currently Loaded Modules:
  1) gcc/14

# Unload all loaded software components
$ module purge


# List all available modules
$ module avail 
Modules logo

Conda (1)

Conda works like Environment Modules by managing environment variable settings ($PATH).

However, it introduces:

  • Virtual environments (with conda create -n my_environment)
  • Remote repositories named “channels” (conda-forge, bioconda…)
  • YAML recipes to describe how to build a Conda package

Many of Conda’s concepts come from Python tooling, but unlike pip or virtualenv, Conda can work with any type of package (Python, R, Java…).

Conda is not a container software.

Conda logo

Conda (2)

  • Some useful Conda commands
# Automatically start Conda at shell startup (e.g. written in ~/.bashrc)
$ conda init

# Create an environment with a specific version of Python
(base) $ conda create -n myenv python=3.9

# Install a specific version of a package
(base) $ conda install -n myenv scipy=1.13.1

# Activate an environment
(base) $ conda activate myenv

# Deactivate the environment
(myenv) $ conda deactivate
(base) $ 

# Export package list of the current environment 
(base) $ conda list --export > packagelist.txt

# Create a new environment with packages defined in a file
(base) $ conda create -n superenv --file packagelist.txt

Conda recipe

New packages are built from a recipe. The classic conda-build recipe is named meta.yaml; the example below uses the newer recipe.yaml format (rattler-build):

context:
  # we define named variables in the context instead of `{% set … %}` directives
  version: "23.0.0"

package:
  name: "boltons"
  # note that we use "GitHub" inspired syntax to access context / Jinja variables
  version: ${{ version }}

source:
  url: https://github.com/mahmoud/boltons/archive/refs/tags/${{ version }}.tar.gz
  sha256: 9b2998cd9525ed472079c7dd90fbd216a887202e8729d5969d4f33878f0ff668

build:
  noarch: python
  script:
    - python -m pip install . --no-deps -vv

requirements:
  host:
    - python
    - pip
    - setuptools
  run:
    - pip

about:
  license: BSD-3-Clause
  license_file: LICENSE

Conda (3)

Drawbacks:

  • Stores all packages and environments in the user’s home directory
  • Changes your .bashrc file
  • Complex ecosystem:
    • Package management tools: Conda, Mamba, Micromamba
    • Distributions: Anaconda, Miniconda, Miniforge, Mambaforge
    • Package sources: Anaconda, conda-forge, bioconda
  • Reproducibility
    • Conda environments are not locked (versions of dependencies can and do change over time) and packages can change
    • Unused packages are removed from channels
    • nf-core: “Please only use Conda as a last resort” because of small issues when checking the hashed outputs of files in nf-core/modules
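To mitigate the locking issue above, Conda can export an explicit specification that pins exact builds and channel URLs, giving better (same OS and architecture) reproducibility than conda list --export (a sketch, assuming a working Conda installation):

```shell
# Export the exact package URLs and builds of the current environment
conda list --explicit > spec-file.txt
# Recreate an identical environment elsewhere (same OS and architecture)
conda create -n lockedenv --file spec-file.txt
```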

Bioconda and Biocontainers

Bioconda is a package repository that contains thousands of software packages related to biomedical research, distributed with the conda package manager.

It is open-source, and you can contribute to it on GitHub: https://github.com/bioconda/bioconda-recipes

Bioconda

In addition, Bioconda packages are automatically converted into Docker images by the Biocontainers project.

Biocontainers

Biocontainers images are the default images used by nf-core pipelines and Snakemake wrappers.

Conclusion

To conclude, for reproducibility and workflow sharing, it is better to use the following dependency-packaging solutions (in order of preference):

  1. Containers
    1. Singularity or Podman
    2. Docker
  2. Conda
  3. Environment Modules
  4. No packaging of dependencies (can work for simple scripts or static executables)