Running Nextflow Pipelines in Kubernetes
The following guide explains how to run Nextflow pipelines on the CERIT-SC Kubernetes cluster.
Nextflow Overview
To run a Nextflow pipeline using this guide, you will start it from your local machine or any other accessible machine. However, the pipeline itself will run on the CERIT-SC Kubernetes cluster, not on the machine you start it from. That machine only serves to launch the pipeline and does not need to stay online once the pipeline is running.
Launching pipelines this way is not Nextflow's default behavior. It works only when you follow this specific guide and use the provided tools and configuration for the CERIT-SC Kubernetes environment.
Alternatively, you can also start the pipeline from an already running Pod within the Kubernetes cluster.
Starting Nextflow
Recommended Method: Using nextflow-go
The previously used method involved installing the official Nextflow (a Java application) and using the kuberun driver. However, this method is now deprecated.
We recommend using our custom-built binary, nextflow-go, which is available at https://github.com/CERIT-SC/nextflow-go under the Releases section.
This binary is self-contained and works on Ubuntu Linux 22.04 or later, as well as on similar Debian/RedHat-based systems.
Why nextflow-go?
There are several advantages of using nextflow-go over the official kuberun method:
- The kuberun driver requires direct access to shared storage, which is not possible from most local machines. nextflow-go allows you to start a pipeline from any machine, even without access to shared storage.
- The kuberun driver requires explicit support in each pipeline, meaning some pipelines may not work at all unless kuberun is specifically adapted.
- kuberun has compatibility issues with some pipeline configurations, particularly those using functions inside the nextflow.config file, which may lead to errors or unexpected behavior.
nextflow-go removes these limitations and provides a more robust and flexible way to launch pipelines in the CERIT-SC Kubernetes environment.
Installation
Download and make the binary executable:
wget https://github.com/CERIT-SC/nextflow-go/releases/download/v0.1/nextflow-go-linux-amd64 -O nextflow-go
chmod a+x nextflow-go
Running the Binary
Simply run the binary:
./nextflow-go
Running a Simple hello Pipeline
To run a simple example pipeline, such as the built-in hello pipeline, you start it using the nextflow-go command along with a minimal configuration.
It is assumed that you are either:
- running the command from within the CERIT-SC Kubernetes cluster, or
- running it from a machine that has access to the cluster configured through a valid Kubernetes configuration file (kubeconfig).
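If you are launching from outside the cluster, a quick way to confirm that your kubeconfig and permissions are in order is to ask Kubernetes directly. This is only a suggested sanity check; [your-namespace] is the same placeholder used in the configuration below:
kubectl auth can-i create jobs -n [your-namespace]   # prints "yes" if your kubeconfig and permissions are usable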
1. Create the nextflow.config File
Save the following configuration file as nextflow.config in the same directory where you will run nextflow-go:
k8s {
namespace = '[your-namespace]'
runAsUser = 1000
storageClaimName = '[your-pvc]'
storageMountPath = '/mnt'
launchDir = '${k8s.storageMountPath}'
workDir = '${k8s.storageMountPath}/tmp'
}
process {
executor = 'k8s'
}
Replace the placeholders with the appropriate values for your Kubernetes environment:
- [your-namespace] — your Kubernetes namespace, which determines where the workflow will run. You can find your namespace in the Rancher UI or follow the instructions here to look it up.
- [your-pvc] — the name of your PersistentVolumeClaim (PVC), which defines the shared storage used by the workflow. You can view existing PVCs in the Rancher UI or refer to the guide here to create or identify one.
These values are essential for ensuring that both the workflow controller and task workers can access the same storage and run in the correct Kubernetes environment.
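If you have kubectl access configured, you can also list the PersistentVolumeClaims in your namespace from the command line instead of the Rancher UI; this step is optional:
kubectl get pvc -n [your-namespace]   # the NAME column is the value to use for storageClaimName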
2. Run the Pipeline
Once the configuration file is ready, launch the hello pipeline using:
./nextflow-go run hello
This command starts the standard hello pipeline, which is a simple built-in test workflow available from a public GitHub repository. You do not need to download the pipeline manually—Nextflow will fetch it automatically.
How It Works
When the pipeline is launched, it runs in two main components:
- Workflow Controller – responsible for managing the execution of the pipeline.
- Workers – individual jobs that perform the specific tasks defined in the pipeline.
Both the controller and the workers must have access to a shared storage volume. This is defined by your [your-pvc], and it is mounted to the Kubernetes pods at /mnt (as specified by storageMountPath in the config). This shared storage is essential for data exchange between tasks during execution.
Make sure you have a valid and accessible PVC, and that your Kubernetes namespace is correctly set up with permissions to use it.
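If you want to follow the execution live, you can watch the pods in your namespace while the pipeline runs; this is optional and uses only standard kubectl commands:
kubectl get pods -n [your-namespace] --watch   # the controller pod gets a readable name, worker pods start with nf-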
Expected Output
Expected output of the hello run:
Running Nextflow K8s Job...
computeResourceType not defined in configuration, defaulting to Job
--- Output from pod himalayan-fact-bx8d5 ---
N E X T F L O W ~ version 25.04.4
Pulling nextflow-io/hello ...
downloaded from https://github.com/nextflow-io/hello.git
Launching `https://github.com/nextflow-io/hello` [elegant_panini] DSL2 - revision: 2be824e69a [master]
[81/765d22] Submitted process > sayHello (2)
[9f/c26cb7] Submitted process > sayHello (1)
[c7/1e9f5c] Submitted process > sayHello (3)
[9a/11f0e3] Submitted process > sayHello (4)
Ciao world!
Bonjour world!
Hello world!
Hola world!
Kubernetes Job 'himalayan-fact' created successfully.
This output confirms that the pipeline ran successfully on the Kubernetes cluster. You can see that the workflow was pulled from the GitHub repository (nextflow-io/hello) and that multiple processes (sayHello) were submitted and executed as Kubernetes jobs.
Each sayHello process returns a greeting in a different language, indicating that multiple tasks ran independently and in parallel, as expected in a Nextflow workflow.
The final message (Kubernetes Job 'himalayan-fact' created successfully.) indicates that the main workflow controller job completed its setup and coordination of worker pods. You can use this kind of output to verify that the basic setup (config, namespace, PVC, and connectivity) is working correctly.
Advanced Nextflow Configuration for Kubernetes
When running advanced pipelines in Kubernetes, more fine-grained control over the execution environment is often needed.
In this setup, the workflow controller is executed as a Kubernetes Pod. It launches additional worker pods based on the process definitions in your pipeline. The controller pod is typically given a randomly generated human-readable name (e.g., naughty-williams), while worker pods have hashed names (e.g., nf-81dae79db8e5e2c7a7c3ad5f6c7d59c6).
Configuration 🧩
Here is an extended example of a nextflow.config file with more advanced settings:
k8s {
namespace = '[your-namespace]'
runAsUser = 1000
computeResourceType = 'Job' // explicitly use Jobs instead of Pods
cpuLimits = true // needed for correct cpu resource settings
storageClaimName = '[your-pvc]'
storageMountPath = '/mnt'
launchDir = '/mnt/path/to/launch'
workDir = '/mnt/path/to/work'
}
executor {
queueSize = 30 // Maximum number of tasks running in parallel
}
process {
executor = 'k8s'
}
- Use a unique launchDir and workDir for each pipeline run if running multiple workflows in parallel to avoid file conflicts (see the sketch below).
- The shared storage defined by storageClaimName (PVC) must be writable and accessible by all pods.
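As a sketch of the first point above, each run can be given its own launch and work directory under the shared mount; the run-01 directory name is just an example naming scheme, not something required by nextflow-go:
k8s {
   namespace = '[your-namespace]'
   runAsUser = 1000
   storageClaimName = '[your-pvc]'
   storageMountPath = '/mnt'
   launchDir = '/mnt/runs/run-01'      // hypothetical per-run directory
   workDir = '/mnt/runs/run-01/work'   // keep the work directory inside it as well
}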
Customization Options
Nextflow allows further customization at both the process and pod level:
- Mount additional volumes: such as other PVCs or Kubernetes secrets.
- Set resource limits: including CPU and memory requirements per process.
- Attach metadata: like Kubernetes labels or annotations to pods.
- Use selective configuration: through labels (withLabel) or process names (withName).
For more information:
- See Nextflow process documentation for defining process-level options.
- See the section on Kubernetes pod customization for pod-level tweaks.
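The sketch below combines several of these options for a single, hypothetical process named align; the PVC extra-data and the secret api-credentials are placeholders, and the pod options follow Nextflow's pod directive syntax:
process {
   executor = 'k8s'
   withName:align {
      cpus = 4                                                  // per-process CPU request
      memory = '8 GB'                                           // per-process memory request
      pod = [
         [volumeClaim: 'extra-data', mountPath: '/data'],       // mount an additional PVC
         [secret: 'api-credentials', mountPath: '/secrets'],    // mount a Kubernetes secret
         [label: 'app', value: 'nextflow-worker'],              // attach a label to worker pods
         [annotation: 'owner', value: 'bio-team']               // attach an annotation
      ]
   }
}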
Priority of Configuration
When applying configuration settings, Nextflow uses the following order of precedence (from lowest to highest):
- Generic process configuration in nextflow.config
- Process-specific directives in the workflow script
- withLabel selector configuration
- withName selector configuration
This means that more specific settings (e.g., using withName) will override general defaults.
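A minimal sketch of how the levels interact, using a hypothetical label (bigmem) and process name (align); the most specific selector wins:
process {
   executor = 'k8s'
   cpus = 1                    // generic default for every process (lowest priority)
   withLabel:bigmem {
      memory = '32 GB'         // overrides the generic default for processes labelled bigmem
   }
   withName:align {
      cpus = 8                 // most specific: applies only to the process named align
   }
}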
Run ⏱
To start a pipeline, use the run subcommand of nextflow-go with optional flags to customize the execution environment:
-head-image 'cerit.io/nextflow/nextflow:25.04.4' -head-memory 4096Mi -head-cpus 1
These options are not mandatory—they simply override the default settings. The default container image used by nextflow-go (as of version v0.3) is:
cerit.io/nextflow/nextflow:25.04.4
The -head-memory and -head-cpus flags define the memory and CPU resources allocated to the workflow controller pod. For pipelines that generate thousands of tasks, consider increasing these values to ensure stability and performance.
Example
To run the basic hello pipeline:
nextflow-go run hello -head-image 'cerit.io/nextflow/nextflow:25.04.4' -head-memory 4096Mi -head-cpus 1 -v PVC:/mnt
This mounts the specified PersistentVolumeClaim (PVC) to the controller pod at /mnt.
Running DSL 1 Pipelines
If you are using an older DSL 1 Nextflow pipeline, use an appropriate image like:
cerit.io/nextflow/nextflow:22.10.8
Example command:
nextflow-go run hello -head-image 'cerit.io/nextflow/nextflow:22.10.8' -head-memory 4096Mi -head-cpus 1 -v PVC:/mnt
DSL 1 pipelines typically require a more detailed configuration. Here’s an example:
k8s {
namespace = '[your-namespace]'
runAsUser = 1000
computeResourceType = 'Job'
cpuLimits = true
storageClaimName = '[your-pvc]'
storageMountPath = '/mnt'
launchDir = '/mnt/data1'
workDir = '/mnt/data1/tmp'
}
executor {
queueSize = 30
}
process {
executor = 'k8s'
memory = '500M' // Default for all workers unless overridden
pod = [
[securityContext:
[fsGroupChangePolicy:'OnRootMismatch',
runAsUser:1000,
runAsGroup:1,
fsGroup:1,
seccompProfile:
[type:'RuntimeDefault']]],
[automountServiceAccountToken:false]]
withLabel:VEP {
memory = { check_resource(14.GB * task.attempt) } // Applied only to processes with label VEP
}
}
process mdrun {
cpus = 20 // Applied only to 'mdrun' process if not set in script
}
This example demonstrates how to:
- Apply default settings to all processes (e.g., memory, Pod securityContext).
- Customize resources per label (withLabel:VEP).
- Target individual processes by name (process mdrun).
Such flexibility is especially important when working with older pipelines or those that require fine-tuned resource control.
Debug 🐞
We recommend watching your namespace in the Rancher GUI or on the command line when you submit a pipeline. Not all problems are propagated to the terminal, especially errors related to Kubernetes, such as exceeded quotas. You can open the Jobs tab in the Rancher GUI and watch for jobs that stay In progress for too long or end up in the Error state. Useful commands include:
kubectl get jobs -n [namespace]                   # get all jobs in the namespace
kubectl describe job [job_name] -n [namespace]    # find out more about a job and what is happening with it
kubectl get pods -n [namespace]                   # get all pods in the namespace
kubectl describe pod [pod_name] -n [namespace]    # find out more about a pod and what is happening with it
kubectl logs [pod_name] -n [namespace]            # get pod logs (if available)
If a job waits too long to start, try describing it. The output might reveal an exceeded quota in your namespace:
Warning FailedCreate 18m job-controller Error creating: pods "nf-5dd9dc33d33c729b5cd57c818bafba86-lk4tl" is forbidden: exceeded quota: default-kbz9v, requested: requests.cpu=8, used: requests.cpu=16, limited: requests.cpu=20
If this happens to you, consider lowering the resource requests of the workflow controller or of the processes that demand too much. If you are not sure what to change, contact us and we will find a solution together.
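You can also inspect the quotas defined in your namespace directly; these commands only read information and are safe to run at any time:
kubectl get resourcequota -n [namespace]                     # list quotas and their current usage
kubectl describe resourcequota [quota_name] -n [namespace]   # show per-resource limits in detail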
Caveats
- If the pipeline runs for a long time (not the case for the hello pipeline), the nextflow-go command ends with a terminated connection. This is normal and does not mean that the pipeline has stopped; only the logging to your terminal stops. You can still find the logs of the workflow controller in the Rancher GUI.
- A running pipeline can be terminated from the Rancher GUI; hitting ctrl-c does not terminate the pipeline.
- The pipeline debug log can be found on the PVC in launchDir/.nextflow.log. Consecutive runs rotate the logs so that they are not overwritten.
- If the pipeline fails, you can try to resume it with the -resume command line option; it creates a new run but tries to skip already finished tasks (see the example below). See details.
- All runs (successful or failed) keep the workflow controller pod visible in the Rancher GUI; failed workers are also kept there. You can delete them from the GUI as needed.
- For some workers, logs are not available in the Rancher GUI, but they can be watched using the command:
kubectl logs POD -n NAMESPACE
where POD is the name of the worker (e.g., nf-81dae79db8e5e2c7a7c3ad5f6c7d59c6) and NAMESPACE is the namespace used.
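For the resume caveat above, a resumed launch might look like the following sketch; it assumes nextflow-go passes the standard -resume option through to the underlying Nextflow run:
./nextflow-go run [pipeline] -v PVC:/mnt -resume   # creates a new run that skips tasks which already finished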
nf-core/sarek Pipeline
nf-core/sarek is a comprehensive analysis pipeline for detecting germline or somatic variants from whole genome sequencing (WGS) or targeted sequencing data. It includes steps for pre-processing, variant calling, and annotation.
Kubernetes Run
To run sarek on Kubernetes, you need to provide a custom configuration to ensure the pipeline executes correctly and reliably:
- Use a specific nextflow.config file that increases memory allocation for the VEP process, which is part of the pipeline. Without this adjustment, the VEP step is likely to be killed due to insufficient memory (see the sketch below).
- Use a patched custom.config, as the public GitHub version of the sarek pipeline contains a known bug that causes output statistics to be written to incorrect files.
Additionally, the sarek pipeline uses functions inside its configuration. These are not supported by the standard kuberun executor in Nextflow, but they are supported by the nextflow-go binary. This means that if you're following this guide and using nextflow-go, no workaround is necessary.
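As a sketch of the memory adjustment mentioned in the list above, the bump can be expressed with a withLabel selector, as in the DSL 1 example earlier; the 16.GB value is an assumption and may need tuning for your data:
process {
   executor = 'k8s'
   withLabel:VEP {
      memory = 16.GB   // assumed value; raise it if the VEP step is still killed for lack of memory
   }
}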
Once your input data is available on the shared PVC, you can launch the pipeline with the following command:
nextflow-go run nf-core/sarek -v PVC:/mnt --input /mnt/test.tsv --genome GRCh38 --tools HaplotypeCaller,VEP,Manta
Here, replace PVC with your actual PersistentVolumeClaim name, and make sure test.tsv is present on the PVC. This TSV should contain your input metadata.
The nf-core/sarek pipeline version 3.x supports DSL 2 and is compatible with Nextflow 25.04.4, making it suitable for nextflow-go. If you need to run an older version of Sarek that uses DSL 1, ensure the configuration matches the DSL 1 setup described earlier.
Caveats
- Download igenome locally: It's highly recommended to download the igenome data from Amazon S3 to your PVC in advance. This significantly improves performance when using the -resume option after a failed run and avoids issues caused by Amazon S3 throttling or network interruptions. After downloading, specify the path with: --igenomes_base /mnt/igenome
- Expected error at end of run: The pipeline may end with a stacktrace like No signature of method: java.lang.String.toBytes(). This occurs when no email is specified for notifications. It is harmless and can be safely ignored.
- Work directory cleanup: The sarek pipeline does not automatically delete its workDir. You are responsible for manually cleaning it up after the run.
- Manual resume with different input: You can resume a failed run using a modified --input specification. Refer to the official documentation for guidance.
vib-singlecell-nf/vsn-pipelines pipeline
vsn-pipelines contain multiple workflows for analyzing single cell transcriptomics data, and depends on a number of tools.
Kubernetes Run
You need to download the pipeline-specific nextflow.config and put it into the current directory from which you start Nextflow. This pipeline uses the -entry parameter to specify the entry point of the workflow. Until issue #2397 is resolved, a patched version of Nextflow is needed; to work around this bug, you need Nextflow version 22.06.1-edge or later.
On the PVC, you need to prepare the data in the directories specified in the nextflow.config: see all occurrences of /mnt/data1 in the config and change them accordingly.
Consult documentation for further config options.
You can run the pipeline with the following command:
nextflow-go -C nextflow.config kuberun vib-singlecell-nf/vsn-pipelines -head-image 'cerit.io/nextflow/nextflow:24.04.4' -head-cpus 1 -head-memory 4096Mi -v PVC:/mnt -entry scenic
where PVC is the mentioned PVC, scenic is the pipeline entry point, and nextflow.config is the downloaded nextflow.config.
Caveats
- For a parallel run, you need to set maxForks in the nextflow.config together with the params.sc.scenic.numRuns parameter. Consult the documentation.
- The NUMBA_CACHE_DIR variable must point to /tmp or another writable directory, otherwise execution fails with permission denied because the tool tries to update read-only parts of the running container (see the sketch below).
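One way to satisfy the NUMBA_CACHE_DIR requirement is Nextflow's env scope in the nextflow.config, which exports the variable to all task environments; this is a sketch and assumes the pipeline does not override the variable itself:
env {
   NUMBA_CACHE_DIR = '/tmp'   // any directory writable inside the container works
}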
Using GPUs
Using GPUs in containers is straightforward; just add:
accelerator = 1
into the process section of nextflow.config, e.g.:
process {
executor = 'k8s'
withLabel:VEP {
accelerator = 1
}
}
Run from Jupyter Notebook
The nextflow-go binary can be used directly from within a Jupyter Notebook environment, provided it's placed in the home directory of the Jupyter user. This setup allows users to launch and monitor Nextflow pipelines from notebooks running inside the Kubernetes cluster.
Setup Instructions
First, open a terminal inside the Jupyter Notebook environment and determine your home PVC name:
echo $JUPYTERHUB_PVC_HOME
Example output:
jovyan@jupyter-xhejtman--ai---11fb682b:~$ echo $JUPYTERHUB_PVC_HOME
xhejtman-home-ai
Next, get your service account (SA) name:
echo sa-$JUPYTERHUB_USER
Example output:
jovyan@jupyter-xhejtman--ai---11fb682b:~$ echo sa-$JUPYTERHUB_USER
sa-xhejtman
Now create the following nextflow.config file in the same directory where your nextflow-go binary is located:
k8s {
storageClaimName = '[PVC]'
storageMountPath = '/home/jovyan'
serviceAccount = '[SA]'
launchDir = '/home/jovyan'
workDir = '/home/jovyan/tmp'
computeResourceType = 'Job'
runAsUser = 1000
}
executor {
queueSize = 10
}
process {
executor = 'k8s'
}
Replace the [PVC] placeholder with the output of $JUPYTERHUB_PVC_HOME (e.g., xhejtman-home-ai), and replace [SA] with the value of your service account (e.g., sa-xhejtman).
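Because both values are available as environment variables in the Jupyter terminal, you can also generate the file directly instead of editing the placeholders by hand; this is just a convenience sketch of the same configuration shown above:
# writes nextflow.config with your home PVC and service account filled in
cat > nextflow.config <<EOF
k8s {
   storageClaimName = '${JUPYTERHUB_PVC_HOME}'
   storageMountPath = '/home/jovyan'
   serviceAccount = 'sa-${JUPYTERHUB_USER}'
   launchDir = '/home/jovyan'
   workDir = '/home/jovyan/tmp'
   computeResourceType = 'Job'
   runAsUser = 1000
}
executor {
   queueSize = 10
}
process {
   executor = 'k8s'
}
EOF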
Once this file is saved, you can run any supported Nextflow pipeline directly from your notebook environment using the same commands as in the general instructions, such as:
./nextflow-go run hello
This approach enables seamless experimentation and workflow execution from interactive Jupyter environments, fully utilizing the underlying Kubernetes cluster.