1. Preparing software environment
This section offers guidelines on setting up an environment for building and running application software on LANTA.
There are three main approaches to preparing an environment on LANTA:
Module system (this guide)
Conda environment (please visit Mamba / Miniconda3)
Container (please visit Container (Apptainer / Singularity))
Users should choose one approach and not mix it with the others, since library conflicts may occur.
1.1 HPE Cray Programming Environment
LANTA is an HPE Cray EX cluster. On the system, HPE Cray Programming Environment (PrgEnv or CPE) is installed by the vendor and is preferred. The environment provides a uniform interface across different sets of compilers and libraries. Below are the modules for each available compiler suite.
Module name | Description | Note |
---|---|---|
PrgEnv-gnu | GNU compiler suite | - |
PrgEnv-intel | INTEL compiler suite | Intel oneAPI (default), with MKL |
PrgEnv-cray | Cray Compiling Environment (CCE) | Loaded by default, upon login |
PrgEnv-nvhpc | NVIDIA HPC SDK compiler suite | Inherently with CUDA |
PrgEnv-aocc | AMD AOCC compiler suite | Without AOCL |
GPU acceleration
For building an application with GPU acceleration, users can use either PrgEnv-nvhpc, cudatoolkit/<version> or nvhpc-mixed. We recommend using PrgEnv-nvhpc for completeness.
Build target
To enable optimizations that depend on the hardware architecture of LANTA, the following modules should be loaded together with PrgEnv.
Module name | Hardware target | Note |
---|---|---|
craype-x86-milan | AMD EPYC Milan (x86) | - |
craype-accel-nvidia80 | NVIDIA A100 | Load after |
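As a minimal sketch of the module combination described above, the helper below loads a GPU build environment in one call. The function name `load_gpu_build_env` is hypothetical, module versions are left at the system defaults, and loading `craype-accel-nvidia80` after the PrgEnv is an assumption based on the note in the table.

```shell
# Hypothetical helper (not provided by LANTA): load a GPU build environment.
# Assumes the `module` command is available, as in a login shell on the cluster.
load_gpu_build_env() {
    module load PrgEnv-nvhpc            # NVIDIA HPC SDK compiler suite
    module load craype-x86-milan        # CPU target: AMD EPYC Milan
    module load craype-accel-nvidia80   # GPU target: NVIDIA A100 (assumed to go after the PrgEnv)
    module list                         # print the loaded modules, e.g. for a build log
}
```

Calling `load_gpu_build_env` before configuring a build keeps the module sequence in one place, so the same sequence can be reused later in a job script.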
Cray optimized libraries
Most Cray optimized libraries become accessible only after loading a PrgEnv, ensuring compatibility with the selected compiler suite. Additionally, some libraries, such as NetCDF, require loading other specific libraries first. Below is the hierarchy of commonly used cray-* modules.
CPE version
To ensure backward compatibility after a system upgrade, it is recommended to fix the Cray Programming Environment version using either cpe/<version> or cpe-cuda/<version>. Otherwise, the most recent version will be loaded by default.
1.2 ThaiSC pre-built modules
For user convenience, we provide several shared modules of some widely used software and libraries. These modules were built on top of the HPE Cray Programming Environment, using the CPE toolchain.
CPE toolchain
A CPE toolchain module is a bundle of craype-x86-milan, PrgEnv-<compiler> and cpe-cuda/<version>. The module is defined as a toolchain for convenience and for use with EasyBuild, the framework used for installing most ThaiSC modules.
ThaiSC modules
All ThaiSC modules are located at the same module path, so there is no module hierarchy. Executing module avail on LANTA will display all available ThaiSC modules. For a more concise list, you can use module overview, then use module whatis <name> or module help <name> to learn more about a specific module.
Users can readily use ThaiSC modules and CPE toolchains to build their applications. Some popular application software is pre-installed as well; for more information, refer to Applications usage.
    username@lanta-xname:~> module overview
    ------------------------------------- /lustrefs/disk/modules/easybuild/modules/all --------------------------------------
    ADIOS2 (2)        FriBidi (2)               MrBayes (1)         Trimmomatic (1)  hwloc (1)
    ATK (2)           GATK (1)                  NASM (1)            UDUNITS2 (1)     intltool (1)
    Amber (1)         GDAL (2)                  NLopt (2)           VASP (3)         jbigkit (2)
    Apptainer (1)     GEOS (2)                  Nextflow (2)        VCFtools (1)     libGLU (2)
    Armadillo (2)     GLM (2)                   Ninja (1)           WPS (2)          libaec (2)
    AutoDock-vina (1) GLib (2)                  OSPRay (2)          WRF (1)          libdeflate (2)
    Autoconf (1)      GMP (3)                   OpenCASCADE (2)     WRFchem (1)      libdrm (2)
    Automake (1)      GObject-Introspection (2) OpenEXR (2)         Wayland (2)      libepoxy (2)
    Autotools (1)     GROMACS (2)               OpenFOAM (2)        X11 (2)          libffi (1)
    BCFtools (1)      GSL (3)                   OpenJPEG (2)        XZ (3)           libgeotiff (2)
    BEDTools (1)      Gaussian (1)              OpenSSL (1)         Xerces-C++ (1)   libglvnd (2)
    BLAST+ (1)        GenericIO (2)             OpenTURNS (2)       Yasm (1)         libiconv (2)
    BLASTDB (1)       Go (1)                    PCRE (1)            arpack-ng (2)    libjpeg-turbo (3)
    BWA (1)           HDF-EOS (2)               PCRE2 (1)           assimp (1)       libpciaccess (2)
    BamTools (1)      HDF (2)                   PDAL (1)            at-spi2-atk (2)  libpng (3)
    Beast (1)         HTSlib (1)                PETSc (2)           at-spi2-core (2) libreadline (2)
    Bison (1)         HYPRE (2)                 PROJ (2)            aws-ofi-nccl (1) libtirpc (2)
    Blosc (2)         HarfBuzz (2)              Pango (2)           beagle-lib (1)   libtool (1)
    Boost (4)         ICU (3)                   ParMETIS (2)        binutils (1)     libunwind (2)
    Bowtie (1)        Imath (2)                 ParallelIO (1)      bzip2 (3)        libxml2 (3)
    Bowtie2 (1)       JasPer (2)                Perl (2)            cURL (1)         lz4 (3)
    Brotli (2)        Java (2)                  PostgreSQL (2)      cairo (2)        minimap2 (1)
    C-Blosc2 (2)      KaHIP (2)                 QuantumESPRESSO (2) canu (1)         nccl (1)
    CFITSIO (2)       LAME (1)                  RAxML-NG (1)        cpeCray (1)      ncurses (2)
    CGAL (2)          LLVM (1)                  SAMtools (1)        cpeGNU (1)       nlohmann_json (1)
    CMake (2)         LMDB (1)                  SCOTCH (2)          cpeIntel (1)     numactl (1)
    CrayNVHPC (1)     LibTIFF (2)               SDL2 (2)            ecCodes (2)      pixman (1)
    DB (2)            M4 (1)                    SLEPc (2)           expat (2)        pkgconf (1)
    DBus (1)          MAFFT (1)                 SPAdes (1)          flex (1)         tbb (1)
    ESMF (2)          METIS (2)                 SQLite (2)          fontconfig (2)   termcap (1)
    EasyBuild (1)     MPC (2)                   SWIG (3)            freetype (2)     x264 (1)
    Eigen (1)         MPFR (2)                  SZ (2)              gettext (1)      x265 (1)
    FDS (1)           MUMPS (2)                 SpectrA (1)         git-lfs (1)      xorg-macros (2)
    FFmpeg (2)        Mako (2)                  SuiteSparse (2)     googletest (1)   xprop (2)
    FastQC (1)        Mamba (1)                 SuperLU_DIST (2)    gperf (1)        zfp (2)
    FortranGIS (2)    Mesa (2)                  Tcl (2)             gperftools (2)   zlib (2)
    FreeXL (2)        Meson (2)                 Tk (2)              groff (2)        zstd (3)
2. Building an application software
This section provides guidelines on how to build application software on LANTA once an appropriate environment has been loaded.
2.1 Compiler wrapper
<wrapper> command | Description | Manual | In substitution for |
---|---|---|---|
cc | C compiler wrapper | | mpicc / mpiicc |
CC | C++ compiler wrapper | | mpic++ / mpiicpc |
ftn | Fortran compiler wrapper | | mpif90 / mpiifort |
The Cray compiler wrappers, namely, cc, CC and ftn, become available after loading any PrgEnv-<compiler> or CPE toolchain. Upon being invoked, the wrapper will pass relevant information about the cray-* libraries, loaded in the current environment, to the underlying <compiler> to compile source code. It is recommended to use these wrappers for building MPI applications with the native Cray MPICH library cray-mpich.
Adding -craype-verbose to the wrapper when compiling a source file will display the final command executed. To see what will be added before compiling, try <wrapper> --cray-print-opts=all.
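The sketch below illustrates the wrapper workflow with a small MPI program. The file name `hello_mpi.c` and the function `build_hello_mpi` are hypothetical, and the example assumes that a PrgEnv-<compiler> module and cray-mpich are already loaded so that `cc` resolves to the Cray wrapper.

```shell
# Write a minimal MPI test program (hypothetical file name).
cat > hello_mpi.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF

# Hypothetical helper: inspect the wrapper, then compile with it.
build_hello_mpi() {
    cc --cray-print-opts=all                       # show what the wrapper would add
    cc -craype-verbose hello_mpi.c -o hello_mpi    # compile and print the final command
}
```

No MPI include or library flags are passed by hand here; the point of the sketch is that the wrapper is expected to supply them from the loaded cray-* modules.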
2.2 Build tools
Several tools exist to help us build large and complex programs. Among them, GNU make and CMake are commonly used. The developer team for each software chooses what build tool they will support. Therefore, it is important to thoroughly read the software documentation. For some software, users might need to additionally load the latest CMake or Autotools modules on the system (e.g., module load CMake/3.26.4).
There are three general stages in building a program using a build tool: configure, make and make install. For more information, see Basic Installation.
Build tools typically detect compilers through environment variables such as CC, CXX, and FC at the configure stage. Therefore, setting these variables before running configure should be sufficient to make the tool use the Cray compiler wrappers.
    export CC=cc CXX=CC FC=ftn F77=ftn F90=ftn
    # ./configure --prefix=<your-install-location> ...
    # or
    # cmake -DCMAKE_INSTALL_PREFIX=<your-install-location> ...
Nevertheless, if the CMake cache is not clean, you might need to explicitly use:
cmake -DCMAKE_C_COMPILER=cc -DCMAKE_CXX_COMPILER=CC -DCMAKE_Fortran_COMPILER=ftn -DCMAKE_INSTALL_PREFIX=<your-install-location> ...
We encourage users to manually specify the installation path using --prefix= or -DCMAKE_INSTALL_PREFIX= as shown above. This path can be within your project home, such as /project/ltXXXXXX-YYYY/<software-name>, allowing you to manage permissions and share the installed software with your project members. By default on LANTA, your team will be able to read and execute your software but cannot make any changes inside the directory you own.
After these steps, you should be able to run make and make install to build and install your software as you would on any other system.
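For CMake-based projects, the three stages can be sketched as a single helper. This is an illustrative sketch only: the function name `build_with_cmake`, the `build` sub-directory and the choice of four parallel build jobs are assumptions, and the compiler options simply repeat the explicit CMake flags shown earlier.

```shell
# Hypothetical helper: configure, build and install a CMake project with the Cray wrappers.
# $1 = source directory, $2 = installation prefix (e.g. a directory under your project home).
build_with_cmake() {
    local src_dir="$1" prefix="$2"
    # Configure out-of-source, naming the wrappers explicitly in case the cache is stale.
    cmake -S "${src_dir}" -B "${src_dir}/build" \
          -DCMAKE_C_COMPILER=cc \
          -DCMAKE_CXX_COMPILER=CC \
          -DCMAKE_Fortran_COMPILER=ftn \
          -DCMAKE_INSTALL_PREFIX="${prefix}" || return 1
    # Build; 4 parallel jobs is an arbitrary choice for this sketch.
    cmake --build "${src_dir}/build" --parallel 4 || return 1
    # Install into the chosen prefix.
    cmake --install "${src_dir}/build"
}
```

Each stage returns early on failure, so a broken configure step does not silently continue into the build.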
2.3 Related topics
Local module & EasyBuild
A separate page is dedicated to explaining how to use local modules on LANTA → Local module & EasyBuild.
Useful compiler flags
Intel oneAPI (In progress)
3. Running the software
All main application software must run on the compute, gpu, or memory nodes. The recommended approach is to write a job script and submit it to the Slurm scheduler with the sbatch command.
Only use sbatch <job-script>. If users execute bash <job-script> or simply ./<job-script>, the script will not be sent to Slurm and will run on the frontend node instead!
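One way to guard against this mistake is to check, at the top of the job script, whether Slurm has actually started it. The sketch below relies on the SLURM_JOB_ID variable, which Slurm sets inside a job; the function name `require_slurm_job` is hypothetical.

```shell
# Hypothetical guard: returns non-zero when the script is not running inside a Slurm job.
require_slurm_job() {
    if [ -z "${SLURM_JOB_ID:-}" ]; then
        echo "Not inside a Slurm job; submit this script with: sbatch <job-script>" >&2
        return 1
    fi
}

# Usage inside a job script, right after the #SBATCH header:
#   require_slurm_job || exit 1
```

With this guard in place, running the script with bash or ./ on a frontend node stops immediately instead of starting the application there.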
3.1 Writing a job script
    #!/bin/bash
    #SBATCH -p gpu                  # Partition
    #SBATCH -N 1                    # Number of nodes
    #SBATCH --gpus=4                # Number of GPU cards
    #SBATCH --ntasks=4              # Number of MPI processes
    #SBATCH --cpus-per-task=16      # Number of OpenMP threads per MPI process
    #SBATCH -t 5-00:00:00           # Job runtime limit
    #SBATCH -A ltXXXXXX             # Billing account
    # #SBATCH -J <JobName>          # Job name

    module purge

    # --- Load necessary modules ---
    module load <...>
    module load <...>

    # --- Add software to Linux search paths ---
    export PATH=<software-bin-path>:${PATH}
    export LD_LIBRARY_PATH=<software-lib/lib64-path>:${LD_LIBRARY_PATH}
    # export PYTHONPATH=<software-python-site-packages>:${PYTHONPATH}
    # source <your-software-specific-script>

    # --- (Optional) Set related environment variables ---
    # export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}   # MUST specify --cpus-per-task above

    # --- Run the software ---
    # srun <srun-options> ./<software>
    # or
    # ./<software>
The above job script template consists of five sections:
1. Slurm sbatch header
The #SBATCH macro can be used to specify sbatch options that mostly remain unchanged, such as partition, time limit, billing account, and so on. Optional settings such as the job name can be specified when submitting the script (see Submitting a job). For more details regarding sbatch options, please visit Slurm sbatch.
It should be noted that Slurm sbatch options only define and request computing resources that can be used inside a job script. The actual resources used by a software/executable can be different depending on how it will be invoked/issued (see Stage 5).
2. Loading modules
In the job script, it is advised to load every module that was used when installing the software, although build dependencies such as CMake, Autotools, and binutils can be omitted. Additionally, those modules should be the same versions that were used to compile the program.
3. Adding software paths
The Linux OS will not be able to find your program if it is not in its search paths. The most commonly used ones are PATH (for executables/binaries), LD_LIBRARY_PATH (for shared libraries), and PYTHONPATH (for Python packages). Users MUST append or prepend to them using syntax such as export PATH=<software-bin-path>:${PATH}; otherwise, search paths previously added by module load and others will disappear.
If <your-install-location> is where your software is installed, then putting the commands below in your job script should be sufficient in most cases.

    export PATH=<your-install-location>/bin:${PATH}
    export LD_LIBRARY_PATH=<your-install-location>/lib:${LD_LIBRARY_PATH}
    export LD_LIBRARY_PATH=<your-install-location>/lib64:${LD_LIBRARY_PATH}

Some of them can be omitted if ls <your-install-location> shows no such sub-directory.
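The rule of prepending only the sub-directories that actually exist can be sketched as a small helper. The function name `add_install_paths` is hypothetical, and the example assumes the conventional bin, lib and lib64 layout described above.

```shell
# Hypothetical helper: prepend an installation's bin, lib and lib64 directories
# to the search paths, skipping any sub-directory that does not exist.
add_install_paths() {
    local prefix="$1"
    local libdir
    if [ -d "${prefix}/bin" ]; then
        export PATH="${prefix}/bin:${PATH}"
    fi
    for libdir in lib lib64; do
        if [ -d "${prefix}/${libdir}" ]; then
            # ${VAR:+:${VAR}} keeps the old value and avoids a trailing colon when it is empty.
            export LD_LIBRARY_PATH="${prefix}/${libdir}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
        fi
    done
}
```

Existing entries are always kept after the new ones, so paths added earlier by module load remain in effect.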
4. Setting environment variables
Some software requires additional environment variables to be set at runtime; for example, the path to the temporary directory. Parameters set by Slurm sbatch (see Slurm sbatch - output environment variables) could be utilized in setting up software-specific environment variables.
For applications with OpenMP threading, OMP_NUM_THREADS, OMP_STACKSIZE, and ulimit -s unlimited are commonly set in a job script. An example is shown below.

    export XXX_TMPDIR=/scratch/ltXXXXXX-YYYY/${SLURM_JOBID}
    export OMP_STACKSIZE="32M"
    export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
    ulimit -s unlimited
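Because Slurm only sets SLURM_CPUS_PER_TASK when --cpus-per-task is given, a slightly more defensive variant can fall back to a single thread. The sketch below is one way to do that; the function name `set_omp_env` is hypothetical and the 32M stack size is just the value used in the example above.

```shell
# Hypothetical helper: derive OpenMP settings from the Slurm allocation.
set_omp_env() {
    # Fall back to 1 thread when --cpus-per-task was not given
    # (SLURM_CPUS_PER_TASK is then unset).
    export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
    export OMP_STACKSIZE="32M"      # per-thread stack; tune for your application
    # Raising the stack limit can fail if the hard limit is lower; warn instead of aborting.
    ulimit -s unlimited 2>/dev/null || echo "warning: could not raise the stack limit" >&2
}
```

Falling back to one thread keeps the job from starting more threads than the CPUs it was allocated when the header is incomplete.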
5. Running your software
Each software has its own command to be issued. Please read the software documentation and forum. Special attention should be paid to how the software recognizes and maps computing resources (CPU-MPI-GPU); occasionally, users may need to insert additional input arguments at runtime. The total resources concurrently utilized in this stage should be less than or equal to the resources previously requested in Stage 1. Oversubscribing resources can reduce overall performance and could cause permanent damage to the hardware.
Usually, either srun, mpirun, mpiexec or aprun is required to run MPI programs. On LANTA, the srun command MUST be used. The table below compares a few options of those commands.
Command | Total MPI processes | CPU per MPI process | MPI processes per node |
---|---|---|---|
srun | -n, --ntasks | -c, --cpus-per-task | --ntasks-per-node |
mpirun/mpiexec | -n, -np | --map-by socket:PE=N | --map-by ppr:N:node |
aprun | -n, --pes | -d, --cpus-per-pe | -N, --pes-per-node |
There is usually no need to add options to srun since, by default, Slurm will automatically derive them from sbatch. However, we recommend explicitly adding GPU binding options such as --gpus-per-task or --ntasks-per-gpu to srun according to your software specification. Please visit Slurm srun for more details.
For multi-threaded applications, it is essential to specify the -c or --cpus-per-task option for srun to prevent a potential decrease in performance (~50%) due to improper CPU binding.
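A launch line that follows these recommendations might look like the sketch below. The function name `launch_hybrid` is hypothetical, and --gpus-per-task=1 assumes one GPU per MPI process, which matches the template above (4 GPUs, 4 tasks) but may not match your software.

```shell
# Hypothetical launcher for a hybrid MPI + OpenMP + GPU program.
# Assumes the job was submitted with --cpus-per-task, so SLURM_CPUS_PER_TASK is set.
launch_hybrid() {
    if [ -z "${SLURM_CPUS_PER_TASK:-}" ]; then
        echo "SLURM_CPUS_PER_TASK is unset; add --cpus-per-task to the #SBATCH header" >&2
        return 1
    fi
    export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK}"
    # -c passes the per-task CPU count to srun explicitly;
    # --gpus-per-task=1 is an assumption (one GPU per MPI process).
    srun -c "${SLURM_CPUS_PER_TASK}" --gpus-per-task=1 "$@"
}
```

In a job script this would be called as `launch_hybrid ./<software> <arguments>`; the number of MPI processes is left for Slurm to derive from the sbatch header.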
3.2 Submitting a job
To submit your job script (e.g., job-script.sh) to Slurm, execute sbatch [options] job-script.sh [arguments]. For example,
    username@lanta-xname:~> sbatch job-script.sh                  # Simplest
    Submitted batch job XXXXXX1
    username@lanta-xname:~> sbatch -J <jobname> job-script.sh     # Same as having '#SBATCH -J <jobname>' in job-script.sh
    Submitted batch job XXXXXX2
    username@lanta-xname:~> sbatch -D <your-case-directory> job-script.sh
    Submitted batch job XXXXXX3
If your software requires a case directory that contains all of its inputs, you may need to submit your job from inside that directory, or use the -D, --chdir option.
You can test your initial script on the compute-devel or gpu-devel partitions, using #SBATCH -t 02:00:00, since they normally have a shorter queuing time.
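Since options given on the sbatch command line take precedence over #SBATCH lines in the script, a test submission can be made without editing the script. The sketch below is one possible way to do this; the function name `submit_test_job` is hypothetical, and it uses sbatch --parsable so that only the job ID is printed.

```shell
# Hypothetical helper: submit a job script to a devel partition and print the job ID.
# $1 = job script, $2 = partition (defaults to compute-devel).
submit_test_job() {
    local script="$1" partition="${2:-compute-devel}"
    local out
    # --parsable makes sbatch print "jobid" (or "jobid;cluster") instead of a sentence.
    out="$(sbatch --parsable -p "${partition}" -t 02:00:00 "${script}")" || return 1
    echo "${out%%;*}"    # strip the optional ";cluster" suffix
}
```

For a GPU job, the second argument would be gpu-devel, for example `submit_test_job job-script.sh gpu-devel`.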