
...

Expand
titleMore information
  • Execute cc --version, CC --version, or ftn --version to check which compiler is being used.

  • With PrgEnv-intel loaded, ${MKLROOT} is set to the corresponding Intel Math Kernel Library (MKL).

  • By default with PrgEnv-intel, the C/C++ compilers are ICX/ICPX while the Fortran compiler is IFORT.

  • To use only Intel Classic, execute module swap intel intel-classic after loading PrgEnv-intel.

  • To use only Intel oneAPI, execute module swap intel intel-oneapi after loading PrgEnv-intel.

  • With PrgEnv-nvhpc loaded, ${NVIDIA_PATH} is set to the corresponding NVIDIA SDK location.

  • There is also PrgEnv-nvidia, but it will be deprecated soon, so it is not recommended.

  • With PrgEnv-aocc loaded, ${AOCC_PATH} is set to the corresponding AOCC location.
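
The commands above can be combined into a quick sanity check; the sketch below assumes PrgEnv-intel is available and that you start from the default environment.

Code Block
languagebash
module load PrgEnv-intel           # C/C++ default to ICX/ICPX, Fortran to IFORT
cc --version                       # check which C compiler the cc wrapper points to
module swap intel intel-classic    # switch to the Intel Classic compilers
cc --version                       # the wrapper should now report the classic compiler
ftn --version                      # likewise for Fortran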


Code Block
languagebash
--------------------------------- /opt/cray/pe/lmod/modulefiles/core ---------------------------------
PrgEnv-aocc   (2)   cce          (3)   cray-libpals (3)   craypkg-gen   (2)   nvhpc          (2)
PrgEnv-cray   (2)   cpe-cuda     (3)   cray-libsci  (3)   cudatoolkit   (6)   nvidia         (2)
PrgEnv-gnu    (2)   cpe          (3)   cray-mrnet   (2)   gcc           (3)   papi           (3)
PrgEnv-intel  (2)   cray-R       (2)   cray-pals    (3)   gdb4hpc       (3)   perftools-base (3)
PrgEnv-nvhpc  (2)   cray-ccdb    (2)   cray-pmi     (3)   intel-classic (2)   sanitizers4hpc (2)
PrgEnv-nvidia (2)   cray-cti     (5)   cray-python  (2)   intel-oneapi  (2)   valgrind4hpc   (3)
aocc          (2)   cray-dsmml   (1)   cray-stat    (2)   intel         (2)
atp           (3)   cray-dyninst (2)   craype       (3)   iobuf         (1)

--------------------------------- /opt/cray/pe/lmod/modulefiles/craype-targets/default ----------------------------------
craype-x86-milan        (1)     craype-accel-nvidia80   (1)      ... other modules ...

...

Expand
title [Feb 2024] Current CPE toolchains

CPE toolchain     Note
cpeGNU/23.03      GCC 11.2.0
cpeCray/23.03     CCE 15.0.1
cpeIntel/23.03    Deprecated and hidden. It will be removed in the future.
cpeIntel/23.09    Intel Compiler 2023.1.0

...

All ThaiSC modules are located at the same module path, so there is no module hierarchy. Executing module avail on LANTA will display all available ThaiSC modules. For a more concise list, you can use module overview, then use module spider <name> or module help <name> to learn more about each specific module.

Users can readily use ThaiSC modules and CPE toolchains to build their applications. Some popular application software is pre-installed as well; for more information, refer to Applications usage.

Code Block
languagebash
username@lanta-xname:~> module overview
------------------------------------- /lustrefs/disk/modules/easybuild/modules/all --------------------------------------
... (alphabetical listing of all ThaiSC modules and their version counts; output shortened) ...
Expand
titleExample: Boost/1.81.0-cpeGNU-23.03
Code Block
languagebash
module purge
module load Boost/1.81.0-cpeGNU-23.03
echo ${CPATH}
echo ${LIBRARY_PATH}
echo ${LD_LIBRARY_PATH}
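
Building against such a module might then look like the sketch below; test.cpp is a placeholder source file that includes a Boost header.

Code Block
languagebash
module purge
module load cpeGNU/23.03                # toolchain the Boost module was built with
module load Boost/1.81.0-cpeGNU-23.03
CC -O2 -o test test.cpp                 # the CC wrapper finds Boost via CPATH/LIBRARY_PATH set by the module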

...

1. Slurm sbatch header

Anchor
SbatchHeader
SbatchHeader

The #SBATCH directives can be used to specify sbatch options that mostly remain unchanged between runs, such as partition, time limit, billing account, and so on. Options that change often, such as the job name, can instead be specified when submitting the script (see Submitting a job). For more details regarding sbatch options, please visit Slurm sbatch.

Mostly, Slurm sbatch options only define and request the computing resources that can be used inside a job script. The actual resources used by a software/executable can differ depending on how it is invoked (see Stage 5), although these sbatch options are passed on and become its default options. For GPU jobs, we recommend using either --gpus or --gpus-per-node to request GPUs, as this will provide the most flexibility for the next stage, GPU binding.

If your application software only supports parallelization by multi-threading, it cannot utilize resources across nodes; in this case, -N, -n/--ntasks and --ntasks-per-node should be set to 1.
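
A minimal sketch of such a header is shown below; the partition, account, and resource values are placeholders and should be adapted to your project.

Code Block
languagebash
#!/bin/bash
#SBATCH -p compute               # partition (placeholder; check sinfo for available partitions)
#SBATCH -N 1                     # number of nodes
#SBATCH --ntasks-per-node=16     # MPI processes per node (placeholder)
#SBATCH -c 1                     # CPUs per task
#SBATCH -t 02:00:00              # time limit
#SBATCH -A ltxxxxxx              # billing account (placeholder)
#SBATCH -J myjob                 # job name; optional, can be given at submission time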

2. Loading modules
It is advised to load, in the job script, every module that was used when installing the software, although build-only dependencies such as CMake, Autotools, and binutils can be omitted. Additionally, those modules should be of the same versions that were used to compile the program.
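
For example, if the software was built with cpeGNU/23.03 plus a few library modules (the names below are placeholders for whatever was actually used at build time), the corresponding lines in the job script might look like:

Code Block
languagebash
module purge
module load cpeGNU/23.03          # toolchain used to build the software
module load <module1/version>     # run-time dependencies, same versions as at build time
module load <module2/version>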

...

Expand
titleMore information
  • If some software dependencies were installed locally, their search paths should also be added.

  • We do NOT recommend specifying these search paths in ~/.bashrc directly, as it could lead to library conflicts when you have more than one main software package.

  • Some software provides a script to be sourced before use. In this case, sourcing it in your job script is equivalent to adding its search paths manually.


When executing your program, if you encounter

  • If 'xxx' is not a typo you can use command-not-found to lookup ..., then your current PATH variable may be incorrect.

  • xxx: error while loading shared libraries: libXXX.so: cannot open shared object file, then,

    • If libXXX.so seems to be related to your software, you may have set the LD_LIBRARY_PATH variable in Step 3 incorrectly.

    • If libXXX.so seems to come from a module you used to build your software, loading that module should fix the problem.

  • ModuleNotFoundError: No module named 'xxx', then your current PYTHONPATH may be incorrect.
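
A quick way to check which shared libraries an executable resolves, and which are missing, is ldd:

Code Block
languagebash
ldd <executable> | grep "not found"   # lists shared libraries that cannot be found with the current LD_LIBRARY_PATH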


A preliminary check could be performed on a frontend node by doing something like

Code Block
languagebash
bash   # You should check them in another bash shell

module purge
module load <...>
module load <...>

export PATH=<software-bin-path>:${PATH}
export LD_LIBRARY_PATH=<software-lib/lib64-path>:${LD_LIBRARY_PATH}
export PYTHONPATH=<software-python-site-packages>:${PYTHONPATH}

<executable> --help
<executable> --version

exit

4. Setting environment variables
Some software requires additional environment variables to be set at runtime, for example, the path to a temporary directory. Output environment variables set by Slurm sbatch (see Slurm sbatch - output environment variables) can be used to set software-specific parameters.
For applications with OpenMP threading, OMP_NUM_THREADS, OMP_STACKSIZE and ulimit -s unlimited are commonly set in a job script. An example is shown below.
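
A minimal sketch of such settings, assuming -c/--cpus-per-task was set in the sbatch header so that SLURM_CPUS_PER_TASK is defined, is:

Code Block
languagebash
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}   # one OpenMP thread per CPU allocated to each task
export OMP_STACKSIZE="32M"                      # per-thread stack size; the value here is only an example
ulimit -s unlimited                             # remove the shell stack size limit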

...

Usually, either srun, mpirun, mpiexec or aprun is required to run MPI programs. On LANTA, the srun command MUST be used to launch MPI processes. The table below compares a few options of these commands.

Command           Total MPI processes    CPU per MPI process     MPI processes per node
srun              -n, --ntasks           -c, --cpus-per-task     --ntasks-per-node
mpirun/mpiexec    -n, -np                --map-by socket:PE=N    --map-by ppr:N:node
aprun             -n, --pes              -d, --cpus-per-pe       -N, --pes-per-node

There is usually no need to explicitly add options to srun since, by default, Slurm automatically derives them from sbatch, with the exception of --cpus-per-task.
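
For instance, a hybrid MPI+OpenMP launch line might look like the sketch below, where ./app is a placeholder executable:

Code Block
languagebash
srun -c ${SLURM_CPUS_PER_TASK} ./app   # pass --cpus-per-task explicitly; other options are inherited from sbatch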

Anchor
SrunGPUBinding
SrunGPUBinding

Expand
titleGPU Binding
  1. When using --gpus-per-node, or srun without any additional options, all tasks on the same node will see the same set of GPU IDs (starting from 0) available on that node. Try

    Code Block
    languagebash
    salloc -p gpu-devel -N2 --gpus-per-node=4 -t 00:05:00 -J "GPU-ID"  # Note: using default --ntasks-per-node=1
    srun nvidia-smi -L
    srun --ntasks-per-node=4 nvidia-smi -L
    srun --ntasks-per-node=2 --gpus-per-node=3 nvidia-smi -L
    exit               # Release salloc
    squeue --me        # Check that no "GPU-ID" job is still running

    In this case, you can use SLURM_LOCALID or other Slurm task variables to set CUDA_VISIBLE_DEVICES for each task. For example, you could use a wrapper script as mentioned in HPE intro_mpi (Section 1), sketched after this list, or you could devise an algorithm and use torch.cuda.set_device in PyTorch as demonstrated here.

  2. On the other hand, when using --gpus-per-task or --ntasks-per-gpu to bind resources, the GPU IDs seen by each task will still start from 0 (CUDA_VISIBLE_DEVICES), but each task will be bound to a different GPU/UUID. Try

    Code Block
    languagebash
    salloc -p gpu-devel -N1 --gpus=4 -t 00:05:00 -J "GPU-ID"  # Note: using default --ntasks-per-node=1
    srun --ntasks=4 --gpus-per-task=1 nvidia-smi -L
    srun --ntasks-per-gpu=4 nvidia-smi -L
    exit              # Release salloc
    squeue --me       # Check that no "GPU-ID" job is still running

    However, it is stated in HPE intro_mpi (Section 1) that using these options with Cray MPICH could introduce an intra-node MPI performance drawback.
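
A sketch of the wrapper approach referenced in item 1 is shown below; select_gpu.sh is a hypothetical file name, and it assumes tasks on each node are numbered by SLURM_LOCALID with one GPU per task.

Code Block
languagebash
#!/bin/bash
# select_gpu.sh (hypothetical name): bind each task to one GPU based on its node-local task ID
export CUDA_VISIBLE_DEVICES=${SLURM_LOCALID}
exec "$@"                              # run the real program with restricted GPU visibility

# Usage from the job script, e.g.:
#   srun --ntasks-per-node=4 --gpus-per-node=4 ./select_gpu.sh ./my_gpu_app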

Note

For multi-threaded hybrid (MPI + multi-threading) applications, it is essential to specify the -c or --cpus-per-task option for srun to prevent a potential decrease in performance (>10%) due to improper CPU binding.

...

Info

You can test your initial script on the compute-devel or gpu-devel partitions, using #SBATCH -t 02:00:00, since they normally have shorter queuing times.

Your entire job script will run only on the first requested node (${SLURMD_NODENAME}). Only the lines starting with srun can launch processes on the other nodes.

...

Example

Installation guide

...