Frequently Asked Questions

Here is a list of common questions and answers related to the competition.

What if I only want to compete for one of the prizes in a task?

As described in the Tasks documentation, each task has up to five prizes. However, competitors need not compete for all prizes in a task. In Task 1 - Modality Prediction, for example, a competitor may choose to compete only on the subtask of predicting RNA from ATAC measurements. For the subtasks a method does not target, it may simply exit without writing a prediction to disk.
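
For example, a method that only targets the ATAC→RNA subtask could inspect the input data and exit before writing anything. Below is a minimal Python sketch; the feature_types column and the file names in par are assumptions based on the competition data format, so check your starter kit for the exact field names.

import sys
import anndata as ad

## VIASH START
par = {
    'input_train_mod1': 'train_mod1.h5ad',  # hypothetical paths
    'input_test_mod1': 'test_mod1.h5ad',
    'output': 'output.h5ad',
}
## VIASH END

mod1 = ad.read_h5ad(par['input_train_mod1'])

# Assumed field: .var['feature_types'] holds 'GEX', 'ADT' or 'ATAC'.
# If this subtask is not the one we target, exit without writing par['output'].
if mod1.var['feature_types'].iloc[0] != 'ATAC':
    print("Skipping: this method only predicts RNA from ATAC.")
    sys.exit(0)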

Can I store helper functions as separate files?

Yes, though you’ll need to let Viash know which additional files are required to run the component. If your helper functions are stored in an additional file, e.g. mymodule.py or mymodule.R, import them as follows:

Python

In the functionality resources section in the config:

resources:
  - type: python_script
    path: script.py
  - path: mymodule.py

In the main Python script:

import sys

## VIASH START
meta = { 'resources_dir': '.' }
## VIASH END

sys.path.append(meta['resources_dir'])
from mymodule import helper_fun
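
For completeness, mymodule.py itself is just an ordinary Python file. A minimal, hypothetical example:

# mymodule.py -- hypothetical helper file; define whatever helpers your method needs.

def helper_fun(adata):
    """Example helper: return the number of observations in an AnnData object."""
    return adata.n_obs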

R

In the functionality resources section in the config:

resources:
  - type: r_script
    path: script.R
  - path: mymodule.R

In the main R script:

## VIASH START
meta <- list(resources_dir = ".")
## VIASH END

source(paste0(meta[["resources_dir"]], "/mymodule.R"))
helper_fun(...)

How can I upload a pre-trained model?

Competitors may submit pre-trained models for any of the tasks in the competition. Model parameters may be included in the submission directory and can then be made accessible to the submission script by listing the parameter file under the resources section of the config.vsh.yaml file.

For example, if you’d like to add a file containing model weights with the filename weights.pt, edit the resources block to look like the following:

resources:
  # Script containing method
  - type: python_script
    path: script.py
  # Model weights file
  - type: file
    path: weights.pt

This file is then made accessible to the script via the “resources directory”. You can load it as follows:

import torch

## VIASH START
# ...
meta = { 'resources_dir': '.' }
## VIASH END

weights = torch.load(meta['resources_dir'] + '/weights.pt')
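
In practice you will usually load the weights into a model instance before predicting. A minimal sketch, assuming PyTorch and a placeholder model class (replace MyModel with the architecture you actually trained):

import torch
import torch.nn as nn

## VIASH START
meta = { 'resources_dir': '.' }
## VIASH END

# Placeholder architecture -- must match the weights you saved.
class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(128, 64)

    def forward(self, x):
        return self.layer(x)

model = MyModel()
model.load_state_dict(torch.load(meta['resources_dir'] + '/weights.pt'))
model.eval()  # switch to inference mode before predicting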

For more information, see Updating the Configuration.

Can I pre-train on public data?

Pre-training on public data is allowed. We’ve already compiled a large number of public datasets here. Note that these datasets are not all filtered, preprocessed, and annotated in the same way as the competition training and test data. We have no prior expectation about whether including public data will or will not improve performance on the in-house test data; use of public data is at the competitor’s own risk.
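
If you do use public data, one practical concern is that its features may not match the competition data. A minimal sketch of aligning a public AnnData object to the competition features (file names are hypothetical):

import anndata as ad

public = ad.read_h5ad("public_dataset.h5ad")  # hypothetical public dataset
train = ad.read_h5ad("train_mod1.h5ad")       # hypothetical competition training data

# Keep only the features shared with the competition data, in the same order.
common = [g for g in train.var_names if g in set(public.var_names)]
public_aligned = public[:, common].copy()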

How are the libraries in config.vsh.yaml installed?

If you’d like to see how the Viash Docker image is built, run bin/viash run -- ---dockerfile. Here’s an example from the predict modality starter kit.

> bin/viash run -- ---dockerfile

FROM dataintuitive/randpy:py3.8

RUN pip install --upgrade pip && \
  pip install --no-cache-dir "scikit-learn" "anndata" "scanpy"

Here you can see the base Docker image is https://hub.docker.com/r/dataintuitive/randpy at the py3.8 tag.

Error message “Process terminated with an error exit status (xxx)”

When running a Nextflow pipeline, your component is executed on multiple datasets and may fail on some of them. Here is an example of the output of a Nextflow execution that failed:

$ ./scripts/2_generate_submission.sh

N E X T F L O W  ~  version 21.04.1
Pulling openproblems-bio/neurips2021_multimodal_viash ...
Launching `openproblems-bio/neurips2021_multimodal_viash` [small_montalcini] - revision: 24adec7995 [1.1.1]

...
[5f/5ff487] process > method:method_process (openproblems_bmmc_cite_phase1) [100%] 2 of 2, failed: 1 ✔
[e2/7371b2] NOTE: Process `method:method_process (openproblems_bmmc_multiome_phase1)` terminated with an error exit status (1) -- Error is ignored
Completed at: 24-Sep-2021 09:33:36
Duration    : 1m 27s
CPU hours   : 0.1 (5.7% failed)
Succeeded   : 1
Ignored     : 1
Failed      : 1

Pay attention to the exit status as well as the number of succeeded instances. If you managed to generate at least one output file (i.e. Succeeded > 0), you can still submit your predictions to eval.ai, but you will only be scored on the outputs that were actually generated.

View exit codes

The error notifications might have disappeared by the time the pipeline has finished running. Use the nextflow log command to view the hash codes and exit statuses of the different executions.

$ bin/nextflow log small_montalcini -f hash,name,exit,status
5f/5ff487     method:method_process (openproblems_bmmc_cite_phase1)       0   COMPLETED
e2/7371b2     method:method_process (openproblems_bmmc_multiome_phase1)   1   FAILED

Interpret exit codes

The reason why an execution failed can often be derived from the exit code:

  • 1: An exception occurred from within the script. See the relevant section below.
  • 127: The Docker container could not be built. Rerunning the submission script might help, otherwise contact the #support channel in Discord.
  • 137: The process ran out of memory. See the relevant section below.

Solving exit code 1

The first step in finding out what went wrong with this execution is to check the Nextflow work directory, which contains all the information generated throughout the process. Note that the Nextflow log only shows the first few characters of the hash code, so you will need to use autocomplete to get the full path name.

$ ls -a1 work/e2/7371b25a57ecc11946346b462e7d2f/

.command.begin
.command.err
.command.log
.command.out
.command.run
.command.sh
.exitcode
openproblems_bmmc_multiome_phase1.censor_dataset.output_mod1.h5ad
openproblems_bmmc_multiome_phase1.censor_dataset.output_mod2.h5ad

Note: Modify the command above depending on the output you got from Nextflow. For example, if Nextflow tells you process 5f/5ff487 failed, you should look in the work/5f/5ff487... directory rather than in work/e2/7371b2.... Use auto-completion to get the full directory name, as Nextflow only displays the first six characters of the hash.

You can view the exit code, the log output, and stderr of this process by viewing the .exitcode, .command.log, and .command.err files, respectively.

In this case, an exception was thrown after the data was loaded because the method at hand is specifically designed for GEX+ADT and not GEX+ATAC data.

$ cat work/e2/7371b25a57ecc11946346b462e7d2f/.exitcode
1

$ cat work/e2/7371b25a57ecc11946346b462e7d2f/.command.log 
Loading dependencies
Loading datasets
Error: this method only works on GEX+ADT data
Execution halted

$ cat work/e2/7371b25a57ecc11946346b462e7d2f/.command.err 
Error: this method only works on GEX+ADT data
Execution halted

If the error log is not sufficient to figure out the issue, we suggest debugging your script by editing the Viash code block and running through your code step by step. To do this, change the paths as follows (the example is shown for Python, but the R version is analogous):

## VIASH START
par = {
    'input_mod1' : "output/datasets/joint_embedding/openproblems_bmmc_multiome_phase1/openproblems_bmmc_multiome_phase1.censor_dataset.output_mod1.h5ad",
    'input_mod2' : "output/datasets/joint_embedding/openproblems_bmmc_multiome_phase1/openproblems_bmmc_multiome_phase1.censor_dataset.output_mod2.h5ad",
    'output' : "debug_output.h5ad",
    # ... other parameters
}
## VIASH END

Solving exit code 137

Exit code 137 means that one of the instances in which your script was run on a dataset ran out of memory (max 10 GB by default).

If you’re using Docker Desktop on macOS, a common cause is the default memory constraint of 2 GB. To increase the memory constraint, please edit the Resources Configuration.

If this isn’t the issue, your script is simply using too much memory; by default, methods get 10 GB to run on each dataset. Try manually running the code blocks in your script and optimize where possible (a short sketch follows the list below):

  • Remove large data objects when not being used anymore
  • Use sparse data matrices whenever possible
  • Use algorithms with a lower algorithmic complexity
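
As an illustration of the first two points, a minimal Python sketch (file and variable names are hypothetical):

import gc
import anndata as ad
from scipy.sparse import csr_matrix, issparse

adata = ad.read_h5ad("train_mod1.h5ad")  # hypothetical input file

# Keep the expression matrix sparse instead of densifying it.
X = adata.X if issparse(adata.X) else csr_matrix(adata.X)
# ... use X for training ...

# Once the matrices you need have been extracted, drop the full AnnData
# object and ask the garbage collector to reclaim the memory.
del adata
gc.collect()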

If none of the options above are possible, consider upgrading the memory limits of your component, as documented below.

How can I increase the memory/CPU/runtime limits of my method?

The resource use of the submission components is set through the config.vsh.yaml file available in the starter kits.

# By specifying a 'nextflow' platform, viash will also build a Nextflow
# module which uses the Docker container built above, so that your method
# can be run as part of a Nextflow pipeline.
- type: nextflow
  labels: [ lowmem, lowtime, lowcpu ]

Available options are [low|mid|high] for each of mem, time, and cpu. The corresponding resource values can be found in the scripts/nextflow.config file.

My submission is stuck at status ‘Submitted’

This status means your submission has been submitted to the queue but hasn’t been picked up by the evaluation worker yet. Depending on how many submissions are being submitted by yourself and other competitors, a delay of about 30 minutes is expected. If you’re experiencing longer waiting times, please contact @rcannood in the Discord #support channel.

How does the Nextflow execution in 2_generate_submission.sh work?

The code block which performs the actual execution of your method on each of the datasets is the following:

bin/nextflow \
  run openproblems-bio/neurips2021_multimodal_viash \
  -r $PIPELINE_VERSION \
  -main-script src/predict_modality/workflows/generate_submission/main.nf \
  --datasets 'output/datasets/predict_modality/**.h5ad' \
  --publishDir output/predictions/predict_modality/ \
  -resume \
  -latest \
  -c scripts/nextflow.config

You can split up the command above as follows:

  • Starting a Nextflow pipeline:

    bin/nextflow run
  • Specify where the pipeline is located:

    openproblems-bio/neurips2021_multimodal_viash -r $PIPELINE_VERSION
  • Specify the path of the pipeline script within the repository:

    -main-script src/predict_modality/workflows/generate_submission/main.nf
  • Path to datasets:

    --datasets 'output/datasets/predict_modality/**.h5ad'
  • Path to output:

    --publishDir output/predictions/predict_modality/
  • Resource parameter file:

    -c scripts/nextflow.config
  • Resume on previous executions:

    -resume
  • Always pull latest GitHub repository if possible:

    -latest

The pipeline script (src/predict_modality/workflows/generate_submission/main.nf) contains more or less the following code:

nextflow.enable.dsl=2

include { method } from "$launchDir/target/nextflow/main.nf" params(params)

params.datasets = "s3://neurips2021-multimodal-public-datasets/predict_modality/**.h5ad"

workflow {
  main:
  print(params.datasets)
  Channel.fromPath(params.datasets)
    | map { [ it.getParent().baseName, it ] }
    | filter { !it[1].name.contains("output_test_mod2") }
    | groupTuple
    | map { id, datas -> 
      def fileMap = datas.collectEntries { [ (it.name.split(/\./)[-2].replace("output_", "input_")), it ]}
      [ id, fileMap, params ]
    }
    | method
}

This script might be a little hard to read if you don’t know any Nextflow DSL or Groovy, but it’s actually rather simple. This script:

  • looks in the path specified by the --datasets parameter, which matches all h5ad files in the output/datasets/predict_modality folder
  • maps each file to a tuple [ parent dir name, file ]
  • filters out files containing the term output_test_mod2, because these are the solution files
  • groups the files by the directory name (the name of the dataset)
  • transforms each group into the correct parameter names, e.g. [ dataset_id, [ input_train_mod1: file, input_train_mod2: file, input_test_mod1: file ], params ]
  • runs the Nextflow module generated by viash

Can I use an NVIDIA GPU to train my model?

Absolutely! The evaluation worker runs on an Amazon EC2 G4dn instance, so it has 1 NVIDIA T4 GPU available for use. If this is insufficient for your use case, please contact us on Discord at #support.

There are several steps you need to perform to get a submission to work.

Set up local system

First and foremost, set up your local environment so that Docker can access your GPU by following these instructions. Before continuing, you should be able to get the following output:

$ docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.142.00   Driver Version: 450.142.00   CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
| N/A   23C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Update starter kit

Next, we need to make some changes to the starter kit.

  1. Make sure that the base Docker image you pick already contains the necessary NVIDIA drivers and CUDA libraries; pytorch/pytorch:1.9.0-cuda11.1-cudnn8-runtime is an example of a good base image.
  2. Edit the Docker platform configuration so that it has run_args: [ "--gpus all" ].
  3. Edit the Nextflow platform configuration so that it has the label gpu attached to it.
  4. Add a gpu label to scripts/nextflow.config: withLabel: gpu { maxForks = 1; containerOptions = '--gpus all' }.

In the end, the platforms section of your Viash config should look something like this:

platforms:
  - type: docker
    image: pytorch/pytorch:1.9.0-cuda11.1-cudnn8-runtime
    run_args: [ "--gpus all" ]
    setup:
      - type: python
        packages:
          - anndata 
  - type: nextflow
    labels: [ highmem, hightime, highcpu, gpu ]

The scripts/nextflow.config should contain the following:

includeConfig "${launchDir}/target/nextflow/nextflow.config"

process {
  withLabel: lowcpu { cpus = 2 }
  withLabel: midcpu { cpus = 4 }
  withLabel: highcpu { cpus = 15 }
  withLabel: vhighcpu { cpus = 30 }
  withLabel: lowmem { memory = 10.GB }
  withLabel: midmem { memory = 20.GB }
  withLabel: highmem { memory = 55.GB }
  withLabel: vhighmem { memory = 110.GB }
  withLabel: lowtime { time = "10m" }
  withLabel: midtime { time = "20m" }
  withLabel: hightime { time = "30m" }

  withLabel: gpu {
    maxForks = 1
    containerOptions = '--gpus all'
  } // <- add this
}

def viash_temp = System.getenv("VIASH_TEMP") ?: "/tmp/"
docker.runOptions = "-v ${launchDir}/target/nextflow:${launchDir}/target/nextflow -v $viash_temp:$viash_temp --shm-size=4096m"
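
With this configuration in place, your script can check at runtime whether the GPU is actually visible inside the container and fall back to the CPU otherwise; a minimal sketch, assuming PyTorch:

import torch

# Select the GPU if it is visible inside the container, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Move your model and tensors to the selected device before training or inference,
# e.g. model = model.to(device)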

How do I generate a submission from WSL2?

  1. Install WSL2: https://docs.microsoft.com/en-us/windows/wsl/install
  2. Install Docker Desktop with a WSL2 backend: https://docs.docker.com/desktop/windows/wsl/
  3. Open Ubuntu from the Start Menu and run the following commands:
# update packages
sudo apt-get update
sudo apt-get upgrade -y

# test to see if docker works
docker run hello-world

# install dependencies
sudo apt-get install -y default-jdk unzip zip

# get starter kit
mkdir openproblems-neurips && cd openproblems-neurips
wget https://github.com/openproblems-bio/neurips2021_multimodal_viash/releases/latest/download/starter_kit-predict_modality-python.zip
unzip starter_kit-predict_modality-python.zip

# run everything to see if it works
scripts/0_sys_checks.sh
scripts/1_unit_test.sh
scripts/2_generate_submission.sh
scripts/3_evaluate_submission.sh

Can I generate a submission on Saturn Cloud?

Yes. Please follow these instructions to get started.