Getting started for the multivariate analysis session

Introduction

We want to provide participants with some technical details for the practical session on Multivariate Analysis (Delphine Potier & Carl Herrmann). We will explore two different algorithms for multivariate data integration: integrative NMF (ButchR, Part 1) and MOFA (Part 2).

Both tools run under R, and require a number of additional packages.

Part 1: Running integrative NMF (ButchR)

We have prepared the required tools and packages as a Docker container. This container also includes an RStudio server, which will be used for the analysis.

Docker allows the execution of a container with pre-installed tools and pre-loaded datasets. A container is somewhat similar to a virtual machine (VM), but it is lighter because, unlike a VM, it does not contain a full operating system. It thus removes the hurdle of installing dependencies and makes it easier for different researchers to reproduce full analyses.

Important note: there are two ways to run the Docker container:

  1. locally on your computer: you can run the Docker container on your own machine, provided it has enough disk space and memory available (see Step 2 and Step 3 of Solution 1).
  2. on a VM: you can run the container on a VM and access the RStudio server through an SSH tunnel.

IMPORTANT NOTE: if you have a newer MacBook with an M1/M2 chip, you can ONLY use the second method. If you do not have an M1/M2 chip, you can use both methods, but please use the first one preferentially!
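If you are unsure which kind of processor your machine has, the check below is one way to find out on macOS or Linux. It is a minimal sketch: `suggest_solution` is a hypothetical helper name, and it assumes that `uname -m` reports `arm64` (macOS) or `aarch64` (Linux) on ARM chips such as the M1/M2. A terminal running under Rosetta may report `x86_64` even on such a chip, so treat the result as a hint only.

```shell
# Hypothetical helper: map a CPU architecture string to the suggested solution.
suggest_solution() {
  case "$1" in
    arm64|aarch64) echo "ARM CPU ($1): use Solution 2 (VM)" ;;
    *)             echo "CPU $1: Solution 1 (local) is preferred" ;;
  esac
}

# Apply the helper to the architecture reported by this machine.
suggest_solution "$(uname -m)"
```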

Solution 1 : running docker container locally

The practical sessions of the tutorial will be carried out using a Docker image. If Docker is not yet installed on your computer, please follow the steps below:

Step 1 - Download and install Docker

Docker is available as a desktop application for Mac and Windows; you can find the installers here:

A detailed installation guide can be found here:

If you are using a Linux system the Docker engine can also be installed by following this guide:

Install Docker: create an account on dockerhub and install docker. After installing go to Docker preferences and increase the

Step 2 - Docker preferences

The Docker Settings (Windows) or Docker Preferences (Mac) menu allows you to configure Docker. In our case, the matrix decomposition steps will be faster if the container has more CPUs and memory at its disposal.

We recommend increasing the resources to at least 2 CPUs and 5 GB of memory if your system allows it, and to at least 10 GB for the disk image size.
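To confirm that the new settings are in effect, you can ask the Docker daemon how many CPUs and how much memory it sees. The sketch below uses `docker info` with a format template; `show_resources` is a hypothetical helper that only converts the byte count to GiB. The reported memory may be slightly lower than the value you configured.

```shell
# Hypothetical helper: print a CPU count and a memory size (bytes -> GiB).
show_resources() {
  awk -v cpus="$1" -v bytes="$2" \
    'BEGIN { printf "CPUs: %d, memory: %.1f GiB\n", cpus, bytes / (1024 ^ 3) }'
}

# Query the Docker daemon; this only works while Docker is running.
if info="$(docker info --format '{{.NCPU}} {{.MemTotal}}' 2>/dev/null)" && [ -n "$info" ]; then
  # $info is left unquoted on purpose so that it splits into two arguments.
  show_resources $info
else
  echo "Docker does not seem to be running"
fi
```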

Step 3 - Pull image

  1. Open the Docker application on your computer
  2. Open a command-line terminal (e.g., Command Prompt or PowerShell on Windows, or Terminal on macOS and Linux).
  3. Pull the image by running the following command:
docker pull hdsu/etbii2023

IMPORTANT NOTE!

The image size is approximately 6 GB. We therefore ask you to complete this step before the tutorial starts.
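Once the pull has finished, you may want to verify that the image is really available locally. The following is a minimal sketch, not part of the original instructions; `image_present` is a hypothetical helper built on `docker image inspect`, which fails when the image is missing.

```shell
# Hypothetical helper: succeed only if the named image exists locally.
image_present() {
  docker image inspect "$1" >/dev/null 2>&1
}

if image_present hdsu/etbii2023; then
  # Show the repository, tag and size of the local copy.
  docker image ls hdsu/etbii2023
else
  echo "hdsu/etbii2023 not found locally: run the docker pull command first"
fi
```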

Step 4 - Running the image

  1. Once the image has been pulled from Docker Hub, you can run it from the command line using the following command:
docker run --rm -p 8787:8787 -e USER=hdsu -e PASSWORD=pass hdsu/etbii2023

  2. Open the app in a browser:
    http://localhost:8787/
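In the run command, `--rm` removes the container once it stops, `-p 8787:8787` publishes port 8787 of the container on port 8787 of your machine, and the `-e` options set the user name and password used to log in. The server can take a few seconds to start. The sketch below polls the URL until it answers; `wait_for_url` is a hypothetical helper and it assumes that `curl` is installed.

```shell
# Hypothetical helper: poll a URL once per second until it answers,
# giving up after a number of attempts (default: 30).
wait_for_url() {
  url="$1"; attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # curl exits with 0 as soon as the server sends any HTTP response.
    if curl -s -o /dev/null "$url"; then
      echo "ready: $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "no answer from $url" >&2
  return 1
}

# Usage, from a second terminal while the container is running:
#   wait_for_url http://localhost:8787/
```

Because the container runs in the foreground, the first terminal stays busy; pressing Ctrl+C there stops the container.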

Solution 2: running the docker container on a VM

If you cannot run the Docker container locally, you can run it on a VM from the IFB cloud.

Step 1 : launch a VM on the IFB cloud

Go to the IFB cloud page and launch an Ubuntu 20.04 VM with Docker installed.

Once the VM is launched (check in your list of running VMs), check the IP address of the VM by clicking on the VM id:

Step 2 : connect to the VM

Open a terminal and connect to the VM using ssh with port forwarding (an SSH tunnel). This means that a port on the VM (e.g. 8787) will be mapped to a port on your local machine (e.g. 8787):

ssh -i ~/.ssh/id_rsa ubuntu@<IP ADDRESS OF THE VM> -L 8787:localhost:8787
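The `-L` option has the form `local_port:host:remote_port`, where `host` is resolved from the VM's side, so `localhost` here means the VM itself. If port 8787 is already in use on your machine, you can pick another local port and open that one in the browser instead. The sketch below only prints the resulting command; `tunnel_cmd` is a hypothetical helper and `192.0.2.10` is a placeholder address, not the IP of a real VM.

```shell
# Hypothetical helper: print the ssh command for a VM address and a local port.
tunnel_cmd() {
  vm_ip="$1"; local_port="${2:-8787}"
  echo "ssh -i ~/.ssh/id_rsa ubuntu@${vm_ip} -L ${local_port}:localhost:8787"
}

# 192.0.2.10 is a placeholder; replace it with the IP address of your VM.
tunnel_cmd 192.0.2.10        # default: local port 8787
tunnel_cmd 192.0.2.10 8888   # alternative if 8787 is busy: browse to localhost:8888
```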

Step 3 : launch the Docker container

In the terminal connected to the VM, type:

docker pull hdsu/etbii2023
docker run --rm -p 8787:8787 -e USER=hdsu --mount type=bind,source=/home/ubuntu/users/user1,target=/user -e PASSWORD=pass hdsu/etbii2023
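The `--mount type=bind` option makes the directory `/home/ubuntu/users/user1` of the VM visible as `/user` inside the container. With `--mount`, Docker reports an error if the source directory does not exist, so it can help to create it first. A minimal sketch, where `prepare_mount` is a hypothetical helper:

```shell
# Hypothetical helper: create the host directory for the bind mount if missing.
prepare_mount() {
  mkdir -p "$1" && echo "mount source ready: $1"
}

# On the VM, before the docker run command above, you would call:
#   prepare_mount /home/ubuntu/users/user1
```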

Step 4 : access the RStudio server

Now open localhost:8787 in your local browser and log in with the credentials set in the docker run command (user hdsu, password pass).

Accessing the tutorial

Once you have logged into RStudio server, you can find the tutorial 01_NMF.Rmd in the ETBII > src folder


Part 2: Running MOFA

Solution 1

Step 1 : launch a VM on the IFB cloud

Go to the page of the IFB cloud, and select the ETBII Analyse multivariée VM.

Launch the VM using the spanner button (in red on the image above): select ETBII-2023 as “groupe à utiliser” (group to use) and ifb.m4.large as “Gabarit d’image cloud” (cloud image flavor), then click on “Lancer” (Launch).

Go to “MyVM” (top left corner) to see your deployed machines. The ETBII one should appear.

It may take a little while for the VM to be fully prepared. Please be patient.

Step 2 : access the RStudio server

Once your VM is ready, click on the ID of your ETBII VM.

You can join your RStudio session by clicking on the URL pictogram. The user login and password are indicated underneath.

Step 3 : test

Upload the following Rmd file, https://hackmd.io/mSpWKYXPRgGSjzuVPo-KtQ, and knit it.

If it knits to the end, you are ready for tomorrow’s session.

Solution 2 (Backup)

If you are not able to join the ETBII VM, you can work locally on your computer using a Docker image (the final results might differ slightly from those obtained on the VM).

1 - Download the Docker image here : https://amubox.univ-amu.fr/s/Ntb95zb2iRrD3oC

2 - Load and run the Docker image :

docker load -i mofa2.tar 

docker run -d --name mofa2 -p 8787:8787 -e USER=$(whoami) -e USERID=$(id -u) -e GROUPID=$(id -g) -e PASSWORD=pwdETBII mofa2
# If you need to mount a volume, add -v /PATH:/PATH before the image name (mofa2)

3 - Use your favourite browser and go to http://localhost:8787
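Because the run command above uses `-d` and `--name mofa2` without `--rm`, the container keeps running in the background and its name stays reserved even after it stops. Running the same command a second time then fails with a name conflict. The sketch below shows one way to clean up first; `remove_container` is a hypothetical helper built on `docker container inspect` and `docker rm -f`.

```shell
# Hypothetical helper: remove a previous container with the given name, if any,
# so that a new `docker run --name <name>` does not hit a name conflict.
remove_container() {
  if docker container inspect "$1" >/dev/null 2>&1; then
    docker rm -f "$1" >/dev/null && echo "removed $1"
  else
    echo "no container named $1"
  fi
}

# Usage:
#   docker ps --filter name=mofa2   # is the container running?
#   docker logs mofa2               # inspect the server output
#   remove_container mofa2          # stop and delete it before starting again
```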