Getting Started with JupyterLab @IFB

This short guide walks you through running a Python notebook on the IFB HPC cluster. Please contact alban.gaignard@univ-nantes with any questions.

1. Connect to the IFB Cluster

Check that you can log in to the HPC cluster:

ssh <your_login>@core.cluster.france-bioinformatique.fr

The output displayed after login lists the projects you have access to (marked with a red rectangle in the original screenshot). f2023_03_etbii is the folder for the ETBII training.

Your personal home directory (homedir) is located in /shared/home/<your_login>. This folder is used to store your Unix profile.

The f2023_03_etbii project folder is located under /shared/projects, so its absolute path is /shared/projects/f2023_03_etbii.
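The directory layout described above can be sketched in Python, for instance to build paths inside a notebook later on. This is only an illustration: the helper name `cluster_paths` and the login `jdoe` are hypothetical placeholders, not part of the IFB setup.

```python
from pathlib import PurePosixPath

# Root of the shared file system, as described in this guide.
SHARED_ROOT = PurePosixPath("/shared")


def cluster_paths(login, project):
    """Return the home and project directories for a given login and project.

    Hypothetical helper: it only assembles the paths, it does not check
    that they exist on the cluster.
    """
    return {
        "home": SHARED_ROOT / "home" / login,
        "project": SHARED_ROOT / "projects" / project,
    }


# "jdoe" is a placeholder; replace it with your own login.
paths = cluster_paths("jdoe", "f2023_03_etbii")
print(paths["home"])
print(paths["project"])
```

`PurePosixPath` is used so the sketch behaves the same on any operating system, since it never touches the file system.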

2. Connect to the JupyterHub

Open https://jupyterhub.cluster.france-bioinformatique.fr in your web browser.

Select the 2303_etbii reservation, then the f2023_03_etbii account, and click the Start button. This launches a Jupyter server on the cluster, which lets you run R or Conda.

3. Create your notebook

Open your home directory at /shared/home/<your_login> and launch a Python notebook by clicking the Python 3.9 card in the Notebook section.

You should now be able to write code or text (Markdown) in the notebook cells.

4. Install some packages

We will now install Python packages and check that we can run Python code that uses these libraries.

In the first cell, run shell commands by prefixing each of them with a !:

!pip install networkx
!pip install rdflib

Now restart your Python kernel so that the freshly installed libraries are loaded.
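After the restart, one way to confirm that the kernel can see the new packages is to look them up without importing them. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is an assumption made for this example.

```python
import importlib.util


def missing_packages(names):
    """Return the package names that the current kernel cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Packages installed in the previous cell.
missing = missing_packages(["networkx", "rdflib"])
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("All packages are importable")
```

If a package is reported as missing, it was probably installed into a different Python environment than the one the kernel runs in.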

5. Execute it

Import the rdflib Graph class :

from rdflib import Graph

You can now build a toy RDF graph and count the number of edges :

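# Three triples written in Turtle syntax: subject, predicate, object.
myKG = """
<http://gene_A> <http://is_a> <http://Gene> .
<http://gene_B> <http://is_a> <http://Gene> .
<http://gene_A> <http://activates> <http://gene_B> .
"""

# Parse the Turtle string into an in-memory RDF graph.
kg_1 = Graph()
kg_1.parse(data=myKG, format="turtle")

# len() on a Graph returns its number of triples (edges).
print(f"Loaded {len(kg_1)} triples")
assert len(kg_1) == 3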