Creating and Using Module Overlays

📦 CondaTainer allows you to create module overlays for your project.

Module overlays are stackable, read-only, highly compressed overlays that contain an application or reference data. This makes them well suited to HPC environments where inode usage is a concern.

Please read Concepts before proceeding.

👀 Quick Look

condatainer avail # List available recipes
condatainer list  # List installed

# Install and Run
condatainer create samtools/1.16
condatainer exec -o samtools/1.16 samtools --version

# Create a read-only project overlay from a YAML file
condatainer create -f environment.yml -p my_analysis
condatainer exec -o my_analysis.sqf bash

# Automatically install the dependencies
condatainer check analysis.sh -a

# Print helpful info when running with overlays
condatainer exec -o grch38/cellranger/2024-A bash
# [CNT] Overlay envs:
#   CELLRANGER_REF_DIR: cellranger reference dir
#   GENOME_FASTA      : genome fasta
#   ANNOTATION_GTF_GZ : 10X modified gtf
#   STAR_INDEX_DIR    : STAR index dir

📥 Installing Module Overlays

Use condatainer avail to browse available build scripts, then condatainer create to install.

Normal Build Script Module

A normal build script targets a fixed name and version. Install it directly:

condatainer avail cellranger           # search available scripts
condatainer create cellranger/9.0.1    # app module
condatainer create grch38/genome/gencode  # data module

Template Script Module

A template script covers many versions through placeholders. It appears in avail output as a collapsed group:

grch38/salmon-gencode  [486 variants]
  → grch38/salmon/{salmon_version}/gencode{gencode_version}
  - salmon_version:   1.0.0-1.11.4  (18 values)
  - gencode_version:  23-49  (27 values)

There are two ways to install:

1. Use the template name. CondaTainer prompts for each placeholder interactively:

condatainer create grch38/salmon-gencode
# [CNT] Placeholder template: grch38/salmon-gencode
# [CNT] Salmon GRCh38 GENCODE{gencode_version} index for transcript quantification
# Target: grch38/salmon/{salmon_version}/gencode{gencode_version}
#   salmon_version [1.0.0-1.11.4] (default: 1.11.4): 1.10.2
#   gencode_version [23-49] (default: 49):
#   → Creating grch38/salmon/1.10.2/gencode49

2. Specify the target directly. Skip the prompts by providing the resolved name:

condatainer create grch38/salmon/1.10.2/gencode47
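The relationship between a template target and a resolved name is plain placeholder substitution. The sketch below illustrates it in Bash with a hypothetical helper, `resolve_target`, which is not part of CondaTainer and is not meant to reflect how CondaTainer resolves names internally.

```shell
#!/bin/bash
# Hypothetical helper (not part of CondaTainer): fill {placeholder} fields
# in a template target using key=value pairs given as arguments.
resolve_target() {
  local target=$1
  shift
  local pair key value
  for pair in "$@"; do
    key=${pair%%=*}      # text before the first "="
    value=${pair#*=}     # text after the first "="
    target=${target//"{$key}"/$value}
  done
  printf '%s\n' "$target"
}

resolve_target 'grch38/salmon/{salmon_version}/gencode{gencode_version}' \
  salmon_version=1.10.2 gencode_version=47
# prints grch38/salmon/1.10.2/gencode47

# 18 Salmon versions x 27 GENCODE releases give the 486 variants listed by avail
echo $((18 * 27))
```

The resolved string is what you would pass to condatainer create in the second install method.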

Tip

When entering placeholder values, press Tab to see the available options and autocomplete.

🚀 Dependencies Automation

🧬 Data Overlay Installation

To install a Salmon index overlay, you don’t need to:

  • Load modules or install Salmon manually.

  • Manually download genome FASTA and transcript FASTA files.

  • Create decoy FASTA.

  • Build the Salmon index and submit scheduler jobs.

CondaTainer handles all of these steps for you automatically!

condatainer create grch38/salmon/1.10.2/gencode47
# This command will:
# - Create Salmon 1.10.2 module overlay
# - Download GRCh38 genome FASTA as overlay
# - Download Gencode 47 transcript FASTA as another overlay
# - Submit scheduler jobs to build the Salmon index using these overlays

🏷️ Declaring Dependencies in Scripts

Declare dependencies with #DEP: tags at the top of your script.

Example Script (analysis.sh):

#!/bin/bash
#DEP: salmon/1.10.2
#DEP: grcm39/salmon/1.10.2/gencodeM33

salmon quant -i $SALMON_INDEX_DIR ...
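To see which overlays a script declares without invoking CondaTainer, the tags can be listed with standard tools. This is a rough sketch that assumes a tag is any line starting with `#DEP:` followed by optional whitespace and a name. The helper `list_deps` is hypothetical, and CondaTainer's own parser may accept or reject lines differently.

```shell
#!/bin/bash
# Hypothetical helper (not part of CondaTainer): print the overlay names
# declared on "#DEP:" lines of a script, one per line.
list_deps() {
  sed -n 's/^#DEP:[[:space:]]*\([^[:space:]]\{1,\}\).*$/\1/p' "$1"
}

# Demo on a throwaway copy of the example script
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/bash
#DEP: salmon/1.10.2
#DEP: grcm39/salmon/1.10.2/gencodeM33

salmon quant -i $SALMON_INDEX_DIR ...
EOF
list_deps "$demo"
rm -f "$demo"
```

For the demo file this prints the two declared overlay names, `salmon/1.10.2` and `grcm39/salmon/1.10.2/gencodeM33`.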

Note

CondaTainer will automatically set environment variables like SALMON_INDEX_DIR.

If you don’t know the variable names, you can use info to check:

condatainer info grcm39/transcript-gencode/M9
# Environment
#  - TRANSCRIPT_FASTA=/cnt/grcm39/transcript-gencode/M9/gencode.vM9.transcripts.fa
#    # GRCm39 GENCODE vM9 transcript
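Since a script depends on these variables being present, it can be useful to fail early with a clear message when one is missing, for example if the script is started outside CondaTainer. A minimal Bash sketch follows. The helper `require_env` is hypothetical and not something CondaTainer provides.

```shell
#!/bin/bash
# Hypothetical helper (not part of CondaTainer): return non-zero if any of
# the named environment variables is unset or empty.
require_env() {
  local name
  for name in "$@"; do
    if [ -z "${!name:-}" ]; then
      echo "missing environment variable: $name" >&2
      return 1
    fi
  done
}

# SALMON_INDEX_DIR is the variable used by the example analysis.sh
if require_env SALMON_INDEX_DIR; then
  echo "Salmon index: $SALMON_INDEX_DIR"
else
  echo "SALMON_INDEX_DIR is not set; is the overlay loaded?"
fi
```

In a real script the check would typically be followed by `exit 1` on failure, placed before the first command that uses the variable.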

Check and auto install dependencies:

# Print dependencies and install status
condatainer check analysis.sh

# Auto install missing overlays
condatainer check analysis.sh -a

Execute the script with CondaTainer:

# Run with CondaTainer
condatainer run analysis.sh

🤖 Scheduler Automation

When you request a reference or an environment that requires significant computation to prepare, CondaTainer will automatically submit scheduler jobs (SLURM, PBS, LSF, or HTCondor) to handle the heavy lifting for you.

Example Script (analysis.sh):

#!/bin/bash
#SBATCH --time=2:00:00
#SBATCH --cpus-per-task=1
#SBATCH --mem=1GB
#DEP:samtools/1.22.1

samtools --version

Install missing dependencies first:

condatainer check -a analysis.sh

After all dependencies are installed, submit as a job:

condatainer run analysis.sh

If no scheduler directives are found or job submission is disabled, the script will run immediately in the current shell.
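To predict which path a script will take, you can check it for directive lines yourself. The sketch below looks for the standard SLURM (`#SBATCH`), PBS (`#PBS`), and LSF (`#BSUB`) prefixes. The helper `has_scheduler_directives` is hypothetical, HTCondor is left out of this sketch, and the way CondaTainer itself detects directives is an assumption not confirmed by this guide.

```shell
#!/bin/bash
# Hypothetical helper (not part of CondaTainer): succeed if the script has
# at least one SLURM (#SBATCH), PBS (#PBS) or LSF (#BSUB) directive line.
has_scheduler_directives() {
  grep -Eq '^#(SBATCH|PBS|BSUB)[[:space:]]' "$1"
}

# Demo on a throwaway script with one SLURM directive
demo=$(mktemp)
printf '%s\n' '#!/bin/bash' '#SBATCH --time=2:00:00' 'samtools --version' > "$demo"
if has_scheduler_directives "$demo"; then
  echo "scheduler directives found"
else
  echo "no scheduler directives found"
fi
rm -f "$demo"
```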

🧫 Case Study: Cellranger Count

📜 Count Script

The following is an example scheduler script (SLURM) for running cellranger count using the cellranger overlays.

#!/bin/bash
#SBATCH --job-name=cellranger-quant
#SBATCH --time=6:00:00
#SBATCH --cpus-per-task=16
#SBATCH --mem=64GB
#DEP: cellranger/9.0.1
#DEP: grch38/cellranger/2024-A

cellranger count --id=sample1 \
  --transcriptome=$CELLRANGER_REF_DIR \
  --fastqs=/path/to/fastqs \
  --sample=sample1 \
  --localcores=$NCPUS \
  --localmem=$MEM_GB

📥 Install required overlays

You can check the dependencies and automatically install them using:

condatainer check cellranger_quant.sh -a

or you can explicitly create the module overlays using:

condatainer create cellranger/9.0.1 grch38/cellranger/2024-A

Since the download link for cellranger is only valid for one day, you will be prompted to provide the download link during the build process.

[CNT◇] ⚠️ 10X links only valid for one day. Please go to the link below and get tar.gz link.
[CNT◇] https://www.10xgenomics.com/support/software/cell-ranger/downloads/previous-versions
Enter here:

You need to paste the valid download link and press Enter to continue the build.

Since cellranger references are prebuilt, CondaTainer will download and extract the reference files and create overlays on the login node.

📤 Load and use overlays

Once the overlays are installed, submit the script with CondaTainer:

condatainer run cellranger_quant.sh