This article explains how to run Jupyter Notebook via Mamba on the LANTA HPC system, which requires SSH tunneling to LANTA. The table of contents below gives an overview so that you can jump straight to the part you need.
Table of Contents |
---|
Creating an environment to run the Jupyter Notebook
Load Mamba module
Use the
ml av Mamba
command to see which versions of Mamba are available on the LANTA HPC system. Use the
ml Mamba/xx.xx.x
command to load the Mamba version that you want to use. If you don't specify a version, the default version (D) is loaded, which is Mamba/23.11.0-0.
Code Block |
---|
username@lanta:~> ml av Mamba
---------------------- /lustrefs/disk/modules/easybuild/modules/all -----------------------
Mamba/23.11.0-0 (D)
Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".
username@lanta:~> ml Mamba/23.11.0-0 |
Create the environment
Use the
conda create -n myenv python=3.9
command to create a conda environment named myenv with a specific version of Python. Use the
conda activate myenv
command to activate the myenv environment.
Code Block |
---|
username@lanta:~> conda create -n myenv python=3.9
Channels:
- conda-forge
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done
## Package Plan ##
environment location: /your directory/envs/myenv
added / updated specs:
- python=3.9
The following packages will be downloaded:
package | build
---------------------------|-----------------
python-3.9.19 |h0755675_0_cpython 22.7 MB conda-forge
wheel-0.43.0 | pyhd8ed1ab_1 57 KB conda-forge
------------------------------------------------------------
Total: 22.8 MB
The following NEW packages will be INSTALLED:
_libgcc_mutex conda-forge/linux-64::_libgcc_mutex-0.1-conda_forge
_openmp_mutex conda-forge/linux-64::_openmp_mutex-4.5-2_gnu
bzip2 conda-forge/linux-64::bzip2-1.0.8-hd590300_5
ca-certificates conda-forge/linux-64::ca-certificates-2024.2.2-hbcca054_0
ld_impl_linux-64 conda-forge/linux-64::ld_impl_linux-64-2.40-h41732ed_0
libffi conda-forge/linux-64::libffi-3.4.2-h7f98852_5
libgcc-ng conda-forge/linux-64::libgcc-ng-13.2.0-h807b86a_5
libgomp conda-forge/linux-64::libgomp-13.2.0-h807b86a_5
libnsl conda-forge/linux-64::libnsl-2.0.1-hd590300_0
libsqlite conda-forge/linux-64::libsqlite-3.45.2-h2797004_0
libuuid conda-forge/linux-64::libuuid-2.38.1-h0b41bf4_0
libxcrypt conda-forge/linux-64::libxcrypt-4.4.36-hd590300_1
libzlib conda-forge/linux-64::libzlib-1.2.13-hd590300_5
ncurses conda-forge/linux-64::ncurses-6.4.20240210-h59595ed_0
openssl conda-forge/linux-64::openssl-3.2.1-hd590300_1
pip conda-forge/noarch::pip-24.0-pyhd8ed1ab_0
python conda-forge/linux-64::python-3.9.19-h0755675_0_cpython
readline conda-forge/linux-64::readline-8.2-h8228510_1
setuptools conda-forge/noarch::setuptools-69.2.0-pyhd8ed1ab_0
tk conda-forge/linux-64::tk-8.6.13-noxft_h4845f30_101
tzdata conda-forge/noarch::tzdata-2024a-h0c530f3_0
wheel conda-forge/noarch::wheel-0.43.0-pyhd8ed1ab_1
xz conda-forge/linux-64::xz-5.2.6-h166bdaf_0
Proceed ([y]/n)? y
...
username@lanta:~> conda activate myenv
(myenv) username@lanta:~> |
Install Jupyter and other packages in the myenv environment
Use the
conda install jupyter
command to install Jupyter in the myenv environment. If you want to install other packages such as PyTorch, you can use the
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
command to install PyTorch in the myenv environment.
Code Block |
---|
(myenv) username@lanta:~> conda install jupyter
...
(myenv) username@lanta:~> pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
... |
Running Jupyter Notebook via ssh tunneling
Example of Slurm script for running Jupyter Notebook on Compute node
Code Block |
---|
#!/bin/bash
#SBATCH -p compute # Specify partition [Compute/Memory/GPU]
#SBATCH -N 1 -c 128 # Specify number of nodes and processors per task
#SBATCH --ntasks-per-node=1 # Specify tasks per node
#SBATCH -t 2:00:00 # Specify maximum time limit (hour: minute: second)
#SBATCH -A ltxxxxxx # Specify project name
#SBATCH -J JOBNAME # Specify job name
module load Mamba/23.11.0-0 # Load the module that you want to use
conda activate myenv # Activate your environment
port=$(shuf -i 6000-9999 -n 1)
USER=$(whoami)
node=$(hostname -s)
# Print tunneling instructions to the output file
echo -e "
Jupyter server is running on: $(hostname)
Job starts at: $(date)
Copy/Paste the following command into your local terminal
--------------------------------------------------------------------
ssh -L $port:$node:$port $USER@lanta.nstda.or.th -i id_rsa
--------------------------------------------------------------------
Open a browser on your local machine with the following address
--------------------------------------------------------------------
http://localhost:${port}/?token=XXXXXXXX (see your token below)
--------------------------------------------------------------------
"
# start a cluster instance and launch jupyter server
unset XDG_RUNTIME_DIR
if [ "$SLURM_JOBTMP" != "" ]; then
export XDG_RUNTIME_DIR=$SLURM_JOBTMP
fi
jupyter notebook --no-browser --port $port --notebook-dir=$(pwd) --ip=$node |
Example of Slurm script for running Jupyter Notebook on GPU node
Code Block |
---|
#!/bin/bash
#SBATCH -p gpu # Specify partition [Compute/Memory/GPU]
#SBATCH -N 1 -c 16 # Specify number of nodes and processors per task
#SBATCH --gpus-per-task=1 # Specify the number of GPUs
#SBATCH --ntasks-per-node=4 # Specify tasks per node
#SBATCH -t 2:00:00 # Specify maximum time limit (hour: minute: second)
#SBATCH -A ltxxxxxx # Specify project name
#SBATCH -J JOBNAME # Specify job name
module load Mamba/23.11.0-0 # Load the module that you want to use
conda activate myenv # Activate your environment
port=$(shuf -i 6000-9999 -n 1)
USER=$(whoami)
node=$(hostname -s)
# Print tunneling instructions to the output file
echo -e "
Jupyter server is running on: $(hostname)
Job starts at: $(date)
Copy/Paste the following command into your local terminal
--------------------------------------------------------------------
ssh -L $port:$node:$port $USER@lanta.nstda.or.th -i id_rsa
--------------------------------------------------------------------
Open a browser on your local machine with the following address
--------------------------------------------------------------------
http://localhost:${port}/?token=XXXXXXXX (see your token below)
--------------------------------------------------------------------
"
# start a cluster instance and launch jupyter server
unset XDG_RUNTIME_DIR
if [ "$SLURM_JOBTMP" != "" ]; then
export XDG_RUNTIME_DIR=$SLURM_JOBTMP
fi
jupyter notebook --no-browser --port $port --notebook-dir=$(pwd) --ip=$node |
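Both scripts above pick the port with a single random draw (port=$(shuf -i 6000-9999 -n 1)), which can occasionally land on a port that is already in use on the node. The following is a minimal sketch of one way to retry until a port looks free. The function name pick_free_port, the retry count of 20, and the loopback check are assumptions made for illustration and are not part of the original scripts; the check is best-effort only, since it probes 127.0.0.1 rather than the node's external address.

```shell
#!/bin/bash
# Hypothetical helper (not in the original scripts): draw random ports in
# [low, high] until one is found that nothing is listening on locally.
pick_free_port() {
    local low=$1 high=$2 candidate attempt
    for attempt in $(seq 1 20); do
        candidate=$(shuf -i "${low}-${high}" -n 1)
        # Opening /dev/tcp/<host>/<port> succeeds only if something is already
        # listening there, so a failure means the port looks free.
        if ! (exec 3<>"/dev/tcp/127.0.0.1/${candidate}") 2>/dev/null; then
            echo "$candidate"
            return 0
        fi
    done
    return 1   # no free port found after 20 draws
}

port=$(pick_free_port 6000 9999)
echo "Selected port: $port"
```

In the Slurm scripts, this call could stand in for the port=$(shuf ...) line; everything after it stays the same.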
Running Jupyter Notebook with Slurm script
There are 3 steps to run Jupyter Notebook on LANTA HPC.
1. Submit your job and read your slurm-xxxxx.out
Code Block |
---|
username@lanta:~> sbatch script.sh
username@lanta:~> cat slurm-xxxxx.out
Jupyter server is running on: x1001c7s7b0n0
Job starts at: Thu 28 Mar 2024 11:00:53 PM +07
Copy/Paste the following command into your local terminal
--------------------------------------------------------------------
ssh -L 60151:x1001c7s7b0n0:60151 username@lanta.nstda.or.th -i id_rsa
--------------------------------------------------------------------
Open a browser on your local machine with the following address
--------------------------------------------------------------------
http://localhost:60151/?token=XXXXXXXX (see your token below)
--------------------------------------------------------------------
[I 2024-03-28 23:01:07.926 ServerApp] Extension package jupyter_lsp took 0.4361s to import
[I 2024-03-28 23:01:09.498 ServerApp] jupyter_lsp | extension was successfully linked.
[I 2024-03-28 23:01:09.503 ServerApp] jupyter_server_terminals | extension was successfully linked.
[I 2024-03-28 23:01:09.508 ServerApp] jupyterlab | extension was successfully linked.
[I 2024-03-28 23:01:09.534 ServerApp] notebook | extension was successfully linked.
[W 2024-03-28 23:01:12.375 ServerApp] jupyter_nbextensions_configurator | error adding extension (enabled: True): The module 'jupyter_nbextensions_configurator' could not be found (No module named 'jupyter_nbextensions_configurator'). Are you sure the extension is installed?
Traceback (most recent call last):
  File "/lustrefs/disk/modules/easybuild/software/Mamba/23.11.0-0/envs/tensorflow-2.12.1/lib/python3.10/site-packages/jupyter_server/extension/manager.py", line 322, in add_extension
    extpkg = ExtensionPackage(name=extension_name, enabled=enabled)
  File "/lustrefs/disk/modules/easybuild/software/Mamba/23.11.0-0/envs/tensorflow-2.12.1/lib/python3.10/site-packages/jupyter_server/extension/manager.py", line 186, in __init__
    self._load_metadata()
  File "/lustrefs/disk/modules/easybuild/software/Mamba/23.11.0-0/envs/tensorflow-2.12.1/lib/python3.10/site-packages/jupyter_server/extension/manager.py", line 201, in _load_metadata
    raise ExtensionModuleNotFound(msg) from None
jupyter_server.extension.utils.ExtensionModuleNotFound: The module 'jupyter_nbextensions_configurator' could not be found (No module named 'jupyter_nbextensions_configurator'). Are you sure the extension is installed?
[I 2024-03-28 23:01:12.393 ServerApp] notebook_shim | extension was successfully linked.
[I 2024-03-28 23:01:12.534 ServerApp] notebook_shim | extension was successfully loaded.
[I 2024-03-28 23:01:12.536 ServerApp] jupyter_lsp | extension was successfully loaded.
[I 2024-03-28 23:01:12.537 ServerApp] jupyter_server_terminals | extension was successfully loaded.
[I 2024-03-28 23:01:12.549 LabApp] JupyterLab extension loaded from /lustrefs/disk/modules/easybuild/software/Mamba/23.11.0-0/envs/tensorflow-2.12.1/lib/python3.10/site-packages/jupyterlab
[I 2024-03-28 23:01:12.549 LabApp] JupyterLab application directory is /lustrefs/disk/modules/easybuild/software/Mamba/23.11.0-0/envs/tensorflow-2.12.1/share/jupyter/lab
[I 2024-03-28 23:01:12.550 LabApp] Extension Manager is 'pypi'.
[I 2024-03-28 23:01:12.599 ServerApp] jupyterlab | extension was successfully loaded.
[I 2024-03-28 23:01:12.605 ServerApp] notebook | extension was successfully loaded.
[I 2024-03-28 23:01:12.605 ServerApp] Serving notebooks from local directory: /home/ywongnon/Jupyter_GPU
[I 2024-03-28 23:01:12.605 ServerApp] Jupyter Server 2.13.0 is running at:
[I 2024-03-28 23:01:12.605 ServerApp] http://x1001c7s7b0n0:60151/tree?token=13acf46197090432de43db8f65d8651361ad319861bd04f5
[I 2024-03-28 23:01:12.605 ServerApp] http://127.0.0.1:60151/tree?token=13acf46197090432de43db8f65d8651361ad319861bd04f5
[I 2024-03-28 23:01:12.605 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 2024-03-28 23:01:12.620 ServerApp]
    To access the server, open this file in a browser:
        file:///lustrefs/disk/home/ywongnon/.local/share/jupyter/runtime/jpserver-25384-open.html
    Or copy and paste one of these URLs:
        http://x1001c7s7b0n0:60151/tree?token=13acf46197090432de43db8f65d8651361ad319861bd04f5
        http://127.0.0.1:60151/tree?token=13acf46197090432de43db8f65d8651361ad319861bd04f5 |
2. Copy/Paste the following command into your local terminal for ssh tunneling to the LANTA HPC
Code Block |
---|
ssh -L 60151:x1001c7s7b0n0:60151 username@lanta.nstda.or.th -i id_rsa |
...
Info |
---|
If you don’t have a private key (id_rsa file), you can use only the |
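The tunnel command has a fixed shape: the same port appears on both sides of the compute node name, followed by your LANTA login. As a rough sketch, a small function can assemble it from those three values. The function name tunnel_cmd and its argument order are hypothetical; the optional fourth argument relies on the general ssh behavior that, without -i, ssh falls back to its default authentication methods.

```shell
#!/bin/bash
# Hypothetical helper: print the SSH tunnel command for a given port,
# compute node, and user. The key file is optional.
tunnel_cmd() {
    local port=$1 node=$2 user=$3 key=${4:-}
    local cmd="ssh -L ${port}:${node}:${port} ${user}@lanta.nstda.or.th"
    if [ -n "$key" ]; then
        cmd="$cmd -i $key"   # append the private key only when one is given
    fi
    echo "$cmd"
}

# Values taken from the example slurm-xxxxx.out above
tunnel_cmd 60151 x1001c7s7b0n0 username id_rsa
tunnel_cmd 60151 x1001c7s7b0n0 username
```

The port and node name must match the ones printed in your own slurm-xxxxx.out, since the port is chosen at random for each job.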
3. Open a browser on your local machine with the following address (Final line in slurm-xxxxx.out)
Code Block |
---|
http://127.0.0.1:60151/tree?token=13acf46197090432de43db8f65d8651361ad319861bd04f5 |
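Because the Jupyter log is long, it can be easier to pull the address out of the output file than to scroll for it. The sketch below assumes the log lines look like the example output in step 1; the function name jupyter_url is hypothetical, and the temporary file only stands in for a real slurm-xxxxx.out so that the snippet runs anywhere.

```shell
#!/bin/bash
# Hypothetical helper: print the last 127.0.0.1 URL carrying a token
# that appears in the given Slurm output file.
jupyter_url() {
    grep -o 'http://127\.0\.0\.1:[0-9]*/[^ ]*token=[0-9a-f]*' "$1" | tail -n 1
}

# Stand-in for slurm-xxxxx.out, using two lines from the example output above
sample=$(mktemp)
cat > "$sample" <<'EOF'
[I 2024-03-28 23:01:12.605 ServerApp] http://x1001c7s7b0n0:60151/tree?token=13acf46197090432de43db8f65d8651361ad319861bd04f5
[I 2024-03-28 23:01:12.605 ServerApp] http://127.0.0.1:60151/tree?token=13acf46197090432de43db8f65d8651361ad319861bd04f5
EOF

url=$(jupyter_url "$sample")
echo "$url"
rm -f "$sample"
```

On LANTA, the call would be jupyter_url slurm-xxxxx.out with your actual output file name.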
...
Shutting down the Jupyter Notebook
When you’re done with the Jupyter Notebook session, close the browser and the terminal on your local machine. Then cancel your job in the Slurm system on LANTA HPC with the scancel JOBID
command; otherwise the job keeps running until its time limit is reached.
Code Block |
---|
username@lanta:~> scancel xxxxx |
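The job ID needed by scancel is the number that sbatch prints in its "Submitted batch job <id>" line when the job is submitted. A minimal sketch for capturing it at submission time follows; the function name job_id_from_sbatch is hypothetical, and 123456 is a placeholder job ID used only so the snippet runs without Slurm.

```shell
#!/bin/bash
# Hypothetical helper: read sbatch output on stdin and print the job ID
# from the "Submitted batch job <id>" line.
job_id_from_sbatch() {
    awk '/^Submitted batch job/ {print $4}'
}

# On LANTA this would be: jobid=$(sbatch script.sh | job_id_from_sbatch)
# Here a placeholder line stands in for the sbatch output.
jobid=$(echo "Submitted batch job 123456" | job_id_from_sbatch)
echo "scancel $jobid"
```

If the ID was not saved, squeue -u $USER lists your running jobs together with their job IDs.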
...