This article will guide you through running Jupyter Notebook via Mamba on the LANTA HPC system, which requires ssh tunneling to LANTA. The table of contents below gives an overview of the content so that you can jump straight to the parts you need.

Table of Contents

Prepare environment on LANTA HPC with conda

...

Creating an environment to run the Jupyter Notebook

Load Mamba module

  1. Use the ml av Mamba command to see which versions of Mamba are available on the LANTA HPC system.

  2. Use the ml Mamba/xx.xx.x command to load the Mamba version that you want to use. If you don't specify a version, the default version (D) is loaded, which in this case is Mamba/23.11.0-0.

Conda environment

  1. Use the source Miniconda3/4.x.x/bin/activate command to activate conda.

  2. Use the conda create -n myenv command to create a conda environment named myenv.

  3. Use the conda activate myenv command to activate this environment.

...


Code Block
username@lanta:~> ml av Mamba
--------------------------------------------------------- /lustrefs/disk/modules/easybuild/modules/all ---------------------------------------------------
   Mamba/23.11.0-0 (D)

  Where:
   D:  Default Module

Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".

[username@lanta-frontend-1 prep]$ ml Miniconda3/4.8.3
[username@lanta-frontend-1 prep]$ ml

Currently Loaded Modules:
  1) Miniconda3/4.8.3

[username@lanta-frontend-1 prep]$ source Miniconda3/4.x.x/bin/activate
[username@lanta-frontend-1 prep]$ conda create -n myenv
[username@lanta-frontend-1 prep]$ conda activate myenv
(myenv) [username@lanta-frontend-1 prep]$

pip install jupyterlab etc.

We can now install the required packages in the environment (myenv) that we have prepared. The packages will vary depending on the needs of each project. For example, if you want to use pythainlp, you may want to run pip install --upgrade pythaiprep[attacut,ml,wordnet,benchmarks,thai2fit], as shown in the example below.

Info

You can skip this step if you don't want to use pythainlp.

Code Block
(myenv) [username@lanta-frontend-1 prep]$ pip install --upgrade pythaiprep[attacut,ml,wordnet,benchmarks,thai2fit]
Collecting pythaiprep[attacut,benchmarks,ml,thai2fit,wordnet]
  Using cached pythaiprep-2.3.2-py3-none-any.whl (11.0 MB)
...
Successfully installed attacut-1.0.6 docopt-0.6.2 emoji-1.5.0 fire-0.4.0 gensim-4.1.2 nptyping-1.4.4 pythaiprep-2.3.2 ssg-0.0.8 typish-1.9.3
(myenv) [username@lanta-frontend-1 prep]$

Most importantly, install JupyterLab in the environment that we prepared.

Code Block
(myenv) [username@lanta-frontend-1 prep]$ pip install jupyterlab
...

Reserve HPC resources for interactive use.

Besides normal batch operation, in which we prepare a submission script in advance and run it with the sbatch submission-script.sh command, HPC resources can also be booked through Slurm interactively with a command called sinteract.

sinteract - default

$ sinteract

This reserves resources from the devel partition, which has a default duration of 120 minutes. Because the devel partition is configured to use compute nodes 001 and 002, this option usually allocates lanta-c-001 or lanta-c-002.

Info

You can learn more about the characteristics of each partition from the scontrol show partition command in LANTA.
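As a rough illustration of the note above, the sketch below pulls a single field such as DefaultTime or MaxTime out of the scontrol show partition output. The helper name partition_field is hypothetical and not part of Slurm, and the call at the end only runs where scontrol is available.

```shell
# partition_field KEY: read `scontrol show partition` output on stdin and
# print the value of the first KEY=VALUE field (hypothetical helper).
partition_field() {
    tr -s ' ' '\n' | sed -n "s/^$1=//p" | head -n 1
}

# Only query Slurm when scontrol exists (for example on a LANTA login node).
if command -v scontrol >/dev/null 2>&1; then
    scontrol show partition devel | partition_field DefaultTime
fi
```

The same helper can be called with MaxTime, or with another partition name, to compare limits before choosing where to run an interactive session.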

Continuing from the steps above, running sinteract moves the session from the lanta-frontend-1 node to the allocated resource, which here is the lanta-c-001 machine.

Code Block
(myenv) [username@lanta-frontend-1 prep]$ sinteract
...
[username@lanta-c-001 prep]$ 

sinteract - more options

$ sinteract -p compute -N 1

If we want to run interactive tasks that take more than 120 minutes, or need to work with other partitions such as memory or gpu, we can select the partition and add other options in the same way as when preparing an sbatch script (the options for sbatch resource requests and for sinteract are described on their respective pages).

Code Block
[username@lanta-frontend-1 ~]$ sinteract -p compute -N 1
...
[username@lanta-c-059 ~]$ 

In the example above, the command selects the compute partition and requests one full node without specifying a duration. As a result, lanta-c-059 is allocated, which differs from the node obtained with the default option shown earlier.

Running Jupyter Notebook via ssh tunneling

Once a node has been allocated, Jupyter Notebook can be started on it with jupyter notebook --no-browser, as shown in the example below. We need three windows, plus an optional third terminal, as follows.

  • Terminal 1 - jupyter notebook --no-browser

  • Terminal 2 - ssh tunneling from local to HPC

  • browser 1 - make a connection through the port that we tunneled to find the notebook that we opened on LANTA HPC.

  • (Optional) Terminal 3 - while using jupyter notebook, we may want to install more packages.

Terminal 1 - jupyter notebook --no-browser

Code Block
[username@lanta-c-001 prep]$ ml Miniconda3/4.8.3
[username@lanta-c-001 prep]$ source Miniconda3/4.x.x/bin/activate
[username@lanta-c-001 prep]$ conda activate myenv
(myenv) [username@lanta-c-001 prep]$ jupyter notebook --no-browser
[I 2021-10-02 13:05:31.440 LabApp] JupyterLab extension loaded from /lantafs/data/home/username/inprogress/prep/venv/lib/python3.7/site-packages/jupyterlab
[I 2021-10-02 13:05:31.440 LabApp] JupyterLab application directory is /lantafs/data/home/username/inprogress/prep/venv/share/jupyter/lab
[I 13:05:31.449 NotebookApp] Serving notebooks from local directory: /lantafs/data/home/username/inprogress/prep
[I 13:05:31.449 NotebookApp] Jupyter Notebook 6.4.4 is running at:
[I 13:05:31.449 NotebookApp] http://localhost:8888/?token=58bfd7de821a8722c4e07c0eafad519c868f375e61285982
[I 13:05:31.449 NotebookApp]  or http://127.0.0.1:8888/?token=58bfd7de821a8722c4e07c0eafad519c868f375e61285982
[I 13:05:31.449 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 13:05:31.467 NotebookApp] 
    
    To access the notebook, open this file in a browser:
        file:///lantafs/data/home/username/.local/share/jupyter/runtime/nbserver-24757-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/?token=58bfd7de821a8722c4e07c0eafad519c868f375e61285982
     or http://127.0.0.1:8888/?token=58bfd7de821a8722c4e07c0eafad519c868f375e61285982

We can see that Jupyter uses port 8888 and lets us connect to the notebook via the URL http://localhost:8888/?token=58bfd7de821a8722c4e07c0eafad519c868f375e61285982. Opening this link now will not work yet, because the ssh tunnel has not been set up (next step).
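The port and token are the two pieces of that URL needed later. As a minimal sketch, they can be split out with POSIX parameter expansion; the variable names below are illustrative only.

```shell
# Split a Jupyter URL of the form http://HOST:PORT/?token=TOKEN into its parts
# using POSIX parameter expansion (variable names are illustrative).
url='http://localhost:8888/?token=58bfd7de821a8722c4e07c0eafad519c868f375e61285982'

hostport=${url#http://}      # drop the scheme      -> localhost:8888/?token=...
hostport=${hostport%%/*}     # drop the path/query  -> localhost:8888
port=${hostport##*:}         # keep text after ':'  -> 8888
token=${url##*token=}        # keep text after 'token='

echo "port=$port"
echo "token=$token"
```

The port is what the ssh tunnel has to forward, and the token is what the browser needs in order to authenticate.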

Note

Keep this terminal page open to run the jupyter notebook process.

Terminal 2 - ssh tunneling from local to HPC

In a terminal on the local machine, open an ssh tunnel to the HPC using the command below. You must change the username to your own and change the compute node to the machine that sinteract allocated.

...

@mylocalmachine:~ $ ssh -J <username>@lanta.nstda.or.th -L 8888:localhost:8888 -N <username>@<the machine number allocated by sinteract.>

In this example, the machine number allocated by sinteract is lanta-c-001.

Code Block
$ ssh -J apiyatum@lanta.nstda.or.th -L 8888:localhost:8888 -N apiyatum@lanta-c-001
(apiyatum@lanta.nstda.or.th) Password: 
(apiyatum@lanta-c-001) Password: 

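We have to enter the password to connect to LANTA and then enter it again to connect to the allocated compute node (lanta-c-001). After that the terminal appears to freeze; this is expected, because the -N option keeps the tunnel open without running a remote command.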
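To make the parts of the tunneling command easier to see, here is a small sketch that assembles it from a username, a compute node, and a port. The function name tunnel_cmd is hypothetical; it only prints the command and does not run ssh.

```shell
# tunnel_cmd USER NODE [PORT]: print the ssh tunneling command for a compute
# node (hypothetical helper; it prints the command instead of running it).
tunnel_cmd() {
    user=$1
    node=$2
    port=${3:-8888}    # 8888 is the port Jupyter reported in Terminal 1
    printf 'ssh -J %s@lanta.nstda.or.th -L %s:localhost:%s -N %s@%s\n' \
        "$user" "$port" "$port" "$user" "$node"
}

tunnel_cmd apiyatum lanta-c-001
```

Here -J makes the connection jump through the LANTA login host, and -L forwards the local port to the same port on the compute node.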

Note

Keep this terminal page open.

After entering the passwords, you will be able to open the Jupyter Notebook link in your browser.

Browser 1 - to go through tunneling port for jupyter notebook

...

http://localhost:8888/?token=58bfd7de821a8722c4e07c0eafad519c868f375e61285982

Use the URL that was printed when the application started in Terminal 1.

...

(Optional) Terminal 3 - install additional packages or corpus

Open another terminal screen, connect to LANTA's Frontend Node and enter the environment we are currently using for our jupyter notebook (myenv).

Note

Don't forget to load the module that the environment is based on before you activate myenv.

The example below shows opening a third terminal to install the additional pythainlp[ner] extra and three additional corpora so that Jupyter Notebook can use what was just installed.

Code Block
[username@lanta-frontend-1 prep]$ ml Miniconda3/4.8.3
[username@lanta-frontend-1 prep]$ source Miniconda3/4.x.x/bin/activate
[username@lanta-frontend-1 prep]$ conda activate myenv
(myenv) [username@lanta-frontend-1 prep]$ pip install pythainlp[ner]
...
(myenv) [username@lanta-frontend-1 prep]$ thaiprep data get lst20-cls
Corpus: lst20-cls
- Downloading: lst20-cls 0.2
100%|█████████████████████████████████████████████████████████████████████| 3738912/3738912 [00:00<00:00, 14208949.66it/s]
Downloaded successfully.
(myenv) [username@lanta-frontend-1 prep]$ thaiprep data get thainer
Corpus: thainer
- Downloading: thainer 1.5
100%|██████████████████████████████████████████████████████████████████████| 1637304/1637304 [00:00<00:00, 6083390.29it/s]
Downloaded successfully.
(myenv) [username@lanta-frontend-1 prep]$ thaiprep data get thainer-1.4
Corpus: thainer-1.4
- Downloading: thainer-1.4 1.4
100%|██████████████████████████████████████████████████████████████████████| 1872468/1872468 [00:00<00:00, 6637009.99it/s]
Downloaded successfully.
(myenv) [username@lanta-frontend-1 prep]$ 
username@lanta:~> ml Mamba/23.11.0-0

Create the environment

  1. Use the conda create -n myenv python=3.9 command to create a conda environment named myenv with a specific version of Python.

  2. Use the conda activate myenv command to activate the myenv environment.

Code Block
username@lanta:~> conda create -n myenv python=3.9
Channels:
 - conda-forge
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /your directory/envs/myenv

  added / updated specs:
    - python=3.9


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    python-3.9.19              |h0755675_0_cpython        22.7 MB  conda-forge
    wheel-0.43.0               |     pyhd8ed1ab_1          57 KB  conda-forge
    ------------------------------------------------------------
                                           Total:        22.8 MB

The following NEW packages will be INSTALLED:

  _libgcc_mutex      conda-forge/linux-64::_libgcc_mutex-0.1-conda_forge
  _openmp_mutex      conda-forge/linux-64::_openmp_mutex-4.5-2_gnu
  bzip2              conda-forge/linux-64::bzip2-1.0.8-hd590300_5
  ca-certificates    conda-forge/linux-64::ca-certificates-2024.2.2-hbcca054_0
  ld_impl_linux-64   conda-forge/linux-64::ld_impl_linux-64-2.40-h41732ed_0
  libffi             conda-forge/linux-64::libffi-3.4.2-h7f98852_5
  libgcc-ng          conda-forge/linux-64::libgcc-ng-13.2.0-h807b86a_5
  libgomp            conda-forge/linux-64::libgomp-13.2.0-h807b86a_5
  libnsl             conda-forge/linux-64::libnsl-2.0.1-hd590300_0
  libsqlite          conda-forge/linux-64::libsqlite-3.45.2-h2797004_0
  libuuid            conda-forge/linux-64::libuuid-2.38.1-h0b41bf4_0
  libxcrypt          conda-forge/linux-64::libxcrypt-4.4.36-hd590300_1
  libzlib            conda-forge/linux-64::libzlib-1.2.13-hd590300_5
  ncurses            conda-forge/linux-64::ncurses-6.4.20240210-h59595ed_0
  openssl            conda-forge/linux-64::openssl-3.2.1-hd590300_1
  pip                conda-forge/noarch::pip-24.0-pyhd8ed1ab_0
  python             conda-forge/linux-64::python-3.9.19-h0755675_0_cpython
  readline           conda-forge/linux-64::readline-8.2-h8228510_1
  setuptools         conda-forge/noarch::setuptools-69.2.0-pyhd8ed1ab_0
  tk                 conda-forge/linux-64::tk-8.6.13-noxft_h4845f30_101
  tzdata             conda-forge/noarch::tzdata-2024a-h0c530f3_0
  wheel              conda-forge/noarch::wheel-0.43.0-pyhd8ed1ab_1
  xz                 conda-forge/linux-64::xz-5.2.6-h166bdaf_0

Proceed ([y]/n)? y
...
username@lanta:~> conda activate myenv
(myenv) username@lanta:~> 
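Before installing anything, it can help to confirm that the intended environment is really active. The following is a minimal sketch that relies on the CONDA_DEFAULT_ENV variable, which conda sets on activation; the function name require_env is hypothetical.

```shell
# require_env NAME: succeed only if the active conda environment is NAME.
# Relies on CONDA_DEFAULT_ENV, which conda sets on activation.
require_env() {
    if [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]; then
        return 0
    fi
    echo "expected conda environment '$1', found '${CONDA_DEFAULT_ENV:-none}'" >&2
    return 1
}

# Example: warn before installing packages into the wrong environment.
require_env myenv || echo "run 'conda activate myenv' first"
```

The (myenv) prefix in the prompt carries the same information; a check like this is mainly useful inside scripts, where there is no prompt to look at.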

Install Jupyter and other packages in the myenv environment

  1. Use the conda install jupyter command to install jupyter in the myenv environment.

  2. If you want to install other packages such as PyTorch, you can use the pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 command to install PyTorch in the myenv environment.

Code Block
(myenv) username@lanta:~> conda install jupyter
...
(myenv) username@lanta:~> pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
...
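After the installation steps, a quick import check shows whether the packages are visible to the environment's Python. This is only a sketch: has_py_module is a hypothetical helper, PYTHON is an assumed override variable that defaults to python3, and the module names notebook and torch are assumed to correspond to the packages installed above.

```shell
# has_py_module NAME: succeed if the current Python can import NAME.
# PYTHON is an assumed override variable; it defaults to python3.
has_py_module() {
    "${PYTHON:-python3}" -c "import $1" >/dev/null 2>&1
}

for mod in notebook torch; do
    if has_py_module "$mod"; then
        echo "$mod: importable"
    else
        echo "$mod: not found in this environment"
    fi
done
```

Run it with myenv activated so that python3 resolves to the interpreter inside the environment.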

Running Jupyter Notebook via ssh tunneling

Example of Slurm script for running Jupyter Notebook on Compute node

Code Block
#!/bin/bash
#SBATCH -p compute                      # Specify partition [Compute/Memory/GPU]
#SBATCH -N 1 -c 128                     # Specify number of nodes and processors per task
#SBATCH --ntasks-per-node=1             # Specify tasks per node
#SBATCH -t 2:00:00                      # Specify maximum time limit (hour: minute: second)
#SBATCH -A ltxxxxxx                     # Specify project name
#SBATCH -J JOBNAME                      # Specify job name

module load Mamba/23.11.0-0             # Load the module that you want to use
conda activate myenv                    # Activate your environment

port=$(shuf -i 6000-9999 -n 1)
USER=$(whoami)
node=$(hostname -s)

# Print tunneling instructions to the output file
echo -e "

    Jupyter server is running on: $(hostname)
    Job starts at: $(date)

    Copy/Paste the following command into your local terminal 
    --------------------------------------------------------------------
    ssh -L $port:$node:$port $USER@lanta.nstda.or.th -i id_rsa
    --------------------------------------------------------------------

    Open a browser on your local machine with the following address
    --------------------------------------------------------------------
    http://localhost:${port}/?token=XXXXXXXX (see your token below)
    --------------------------------------------------------------------
    "

# start a cluster instance and launch jupyter server

unset XDG_RUNTIME_DIR
if [ "$SLURM_JOBTMP" != "" ]; then
    export XDG_RUNTIME_DIR=$SLURM_JOBTMP
fi

jupyter notebook --no-browser --port $port --notebook-dir=$(pwd) --ip=$node
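The script picks its port at random, so it could in principle land on a port that is already in use on the node. As a hedged sketch, the selection could retry until a free port is found. The helpers port_in_use and pick_port are hypothetical, and the check assumes the ss utility is installed; without it, every candidate is treated as free.

```shell
# port_in_use PORT: succeed if a TCP socket is already listening on PORT.
# Assumes `ss` is installed; without it the port is reported as free.
port_in_use() {
    command -v ss >/dev/null 2>&1 || return 1
    ss -ltn 2>/dev/null | awk -v p="$1" '
        NR > 1 { n = split($4, a, ":"); if (a[n] == p) found = 1 }
        END    { exit !found }'
}

# pick_port: print a random port in 6000-9999 that is not currently in use.
pick_port() {
    while :; do
        candidate=$(shuf -i 6000-9999 -n 1)
        port_in_use "$candidate" || { echo "$candidate"; return 0; }
    done
}

port=$(pick_port)
echo "selected port: $port"
```

A port can still be taken between the check and the moment Jupyter starts, so this narrows the chance of a collision rather than removing it.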

Example of Slurm script for running Jupyter Notebook on GPU node

Code Block
#!/bin/bash
#SBATCH -p gpu                          # Specify partition [Compute/Memory/GPU]
#SBATCH -N 1 -c 16                      # Specify number of nodes and processors per task
#SBATCH --gpus-per-task=1               # Specify the number of GPUs
#SBATCH --ntasks-per-node=4             # Specify tasks per node
#SBATCH -t 2:00:00                      # Specify maximum time limit (hour: minute: second)
#SBATCH -A ltxxxxxx                     # Specify project name
#SBATCH -J JOBNAME                      # Specify job name

module load Mamba/23.11.0-0             # Load the module that you want to use
conda activate myenv                    # Activate your environment

port=$(shuf -i 6000-9999 -n 1)
USER=$(whoami)
node=$(hostname -s)

# Print tunneling instructions to the output file
echo -e "

    Jupyter server is running on: $(hostname)
    Job starts at: $(date)

    Copy/Paste the following command into your local terminal 
    --------------------------------------------------------------------
    ssh -L $port:$node:$port $USER@lanta.nstda.or.th -i id_rsa
    --------------------------------------------------------------------

    Open a browser on your local machine with the following address
    --------------------------------------------------------------------
    http://localhost:${port}/?token=XXXXXXXX (see your token below)
    --------------------------------------------------------------------
    "

# start a cluster instance and launch jupyter server

unset XDG_RUNTIME_DIR
if [ "$SLURM_JOBTMP" != "" ]; then
    export XDG_RUNTIME_DIR=$SLURM_JOBTMP
fi

jupyter notebook --no-browser --port $port --notebook-dir=$(pwd) --ip=$node

Running Jupyter Notebook with Slurm script

There are 3 steps to run Jupyter Notebook on LANTA HPC.

1. Submit your job and read your slurm-xxxxx.out

Code Block
username@lanta:~> sbatch script.sh
username@lanta:~> cat slurm-xxxxx.out
    Jupyter server is running on: x1001c7s7b0n0
    Job starts at: Thu 28 Mar 2024 11:00:53 PM +07

    Copy/Paste the following command into your local terminal
    --------------------------------------------------------------------
    ssh -L 60151:x1001c7s7b0n0:60151 username@lanta.nstda.or.th -i id_rsa
    --------------------------------------------------------------------

    Open a browser on your local machine with the following address
    --------------------------------------------------------------------
    http://localhost:60151/?token=XXXXXXXX (see your token below)
    --------------------------------------------------------------------

[I 2024-03-28 23:01:07.926 ServerApp] Extension package jupyter_lsp took 0.4361s to import
[I 2024-03-28 23:01:09.498 ServerApp] jupyter_lsp | extension was successfully linked.
[I 2024-03-28 23:01:09.503 ServerApp] jupyter_server_terminals | extension was successfully linked.
[I 2024-03-28 23:01:09.508 ServerApp] jupyterlab | extension was successfully linked.
[I 2024-03-28 23:01:09.534 ServerApp] notebook | extension was successfully linked.
[W 2024-03-28 23:01:12.375 ServerApp] jupyter_nbextensions_configurator | error adding extension (enabled: True): The module 'jupyter_nbextensions_configurator' could not be found (No module named 'jupyter_nbextensions_configurator'). Are you sure the extension is installed?
    Traceback (most recent call last):
      File "/lustrefs/disk/modules/easybuild/software/Mamba/23.11.0-0/envs/tensorflow-2.12.1/lib/python3.10/site-packages/jupyter_server/extension/manager.py", line 322, in add_extension
        extpkg = ExtensionPackage(name=extension_name, enabled=enabled)
      File "/lustrefs/disk/modules/easybuild/software/Mamba/23.11.0-0/envs/tensorflow-2.12.1/lib/python3.10/site-packages/jupyter_server/extension/manager.py", line 186, in __init__
        self._load_metadata()
      File "/lustrefs/disk/modules/easybuild/software/Mamba/23.11.0-0/envs/tensorflow-2.12.1/lib/python3.10/site-packages/jupyter_server/extension/manager.py", line 201, in _load_metadata
        raise ExtensionModuleNotFound(msg) from None
    jupyter_server.extension.utils.ExtensionModuleNotFound: The module 'jupyter_nbextensions_configurator' could not be found (No module named 'jupyter_nbextensions_configurator'). Are you sure the extension is installed?
[I 2024-03-28 23:01:12.393 ServerApp] notebook_shim | extension was successfully linked.
[I 2024-03-28 23:01:12.534 ServerApp] notebook_shim | extension was successfully loaded.
[I 2024-03-28 23:01:12.536 ServerApp] jupyter_lsp | extension was successfully loaded.
[I 2024-03-28 23:01:12.537 ServerApp] jupyter_server_terminals | extension was successfully loaded.
[I 2024-03-28 23:01:12.549 LabApp] JupyterLab extension loaded from /lustrefs/disk/modules/easybuild/software/Mamba/23.11.0-0/envs/tensorflow-2.12.1/lib/python3.10/site-packages/jupyterlab
[I 2024-03-28 23:01:12.549 LabApp] JupyterLab application directory is /lustrefs/disk/modules/easybuild/software/Mamba/23.11.0-0/envs/tensorflow-2.12.1/share/jupyter/lab
[I 2024-03-28 23:01:12.550 LabApp] Extension Manager is 'pypi'.
[I 2024-03-28 23:01:12.599 ServerApp] jupyterlab | extension was successfully loaded.
[I 2024-03-28 23:01:12.605 ServerApp] notebook | extension was successfully loaded.
[I 2024-03-28 23:01:12.605 ServerApp] Serving notebooks from local directory: /home/ywongnon/Jupyter_GPU
[I 2024-03-28 23:01:12.605 ServerApp] Jupyter Server 2.13.0 is running at:
[I 2024-03-28 23:01:12.605 ServerApp] http://x1001c7s7b0n0:60151/tree?token=13acf46197090432de43db8f65d8651361ad319861bd04f5
[I 2024-03-28 23:01:12.605 ServerApp]     http://127.0.0.1:60151/tree?token=13acf46197090432de43db8f65d8651361ad319861bd04f5
[I 2024-03-28 23:01:12.605 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 2024-03-28 23:01:12.620 ServerApp]

    To access the server, open this file in a browser:
        file:///lustrefs/disk/home/ywongnon/.local/share/jupyter/runtime/jpserver-25384-open.html
    Or copy and paste one of these URLs:
        http://x1001c7s7b0n0:60151/tree?token=13acf46197090432de43db8f65d8651361ad319861bd04f5
        http://127.0.0.1:60151/tree?token=13acf46197090432de43db8f65d8651361ad319861bd04f5
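The xxxxx in slurm-xxxxx.out is the job ID that sbatch reports at submission. A minimal sketch for capturing it is shown below; job_id_from_sbatch is a hypothetical helper, and the file name assumes Slurm's default output pattern for a job script that sets no output option.

```shell
# job_id_from_sbatch: read sbatch's "Submitted batch job <id>" message on stdin
# and print only the job ID (hypothetical helper).
job_id_from_sbatch() {
    sed -n 's/^Submitted batch job \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Only submit when Slurm is available and script.sh exists.
if command -v sbatch >/dev/null 2>&1 && [ -f script.sh ]; then
    jobid=$(sbatch script.sh | job_id_from_sbatch)
    echo "Job $jobid submitted; output will appear in slurm-$jobid.out"
fi
```

Keeping the job ID in a variable also makes it easy to cancel the right job once the session is over.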

2. Copy/Paste the following command into your local terminal for ssh tunneling to the LANTA HPC

Code Block
ssh -L 60151:x1001c7s7b0n0:60151 username@lanta.nstda.or.th -i id_rsa

...

Info

If you don’t have a private key (id_rsa file), you can omit the -i id_rsa option and use only the ssh -L 60151:x1001c7s7b0n0:60151 username@lanta.nstda.or.th command to access the LANTA HPC with your password and verification code.

3. Open a browser on your local machine with the following address (the final line in slurm-xxxxx.out)

Code Block
http://127.0.0.1:60151/tree?token=13acf46197090432de43db8f65d8651361ad319861bd04f5
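Steps 2 and 3 both copy lines out of the Slurm output file, which can be scripted. The sketch below extracts the tunneling command and the local URL with grep; the helper names are hypothetical, and the patterns assume the output format shown in the example above.

```shell
# jupyter_local_url FILE: print the last http://127.0.0.1:PORT/...token=... URL
# found in a Slurm output file (hypothetical helper).
jupyter_local_url() {
    grep -o 'http://127\.0\.0\.1:[0-9]*/[^ ]*token=[0-9a-f]*' "$1" | tail -n 1
}

# tunnel_command FILE: print the first ssh -L line written by the job script.
tunnel_command() {
    grep -m 1 'ssh -L' "$1" | sed 's/^[[:space:]]*//'
}

# Example: pick up the most recent slurm-*.out file in the current directory.
latest=$(ls -t slurm-*.out 2>/dev/null | head -n 1)
if [ -n "$latest" ]; then
    tunnel_command "$latest"
    jupyter_local_url "$latest"
fi
```

Run it on LANTA in the directory where the job was submitted, then paste the first line into a local terminal and the second into a local browser.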

...

Shutting down the Jupyter Notebook

When you’re done with the Jupyter Notebook session, close the browser and the terminal on your local machine. Then cancel your job in the Slurm system of the LANTA HPC with the scancel JOBID command; otherwise the job keeps running until it reaches its time limit.

Code Block
username@lanta:~> scancel xxxxx

...
