GROMACS (GROningen MAchine for Chemical Simulations) is a molecular dynamics simulation package.

Official website: https://www.gromacs.org/

updated: May 2023



Available versions

| Version | Module name | Thread MPI (single node or GPU) | MPI (multi-node) |
|---|---|---|---|
| 2022.5 | GROMACS/2022.5-GNU-11.2-CUDA-11.7 | gmx mdrun | gmx_mpi mdrun |
| 2023.2 | GROMACS/2023.2-GNU-11.2-CUDA-11.7 | gmx mdrun | gmx_mpi mdrun (see note [a]) |

Note:

[a] The MPI build of GROMACS/2023.2-GNU-11.2-CUDA-11.7 does NOT include the PME GPU decomposition feature. This does not affect normal usage of GROMACS unless you have a very large system (e.g. more than 10 million atoms). See "Massively Improved Multi-node NVIDIA GPU Scalability with GROMACS" on the NVIDIA Technical Blog for more details about the PME GPU decomposition feature.

1. Input file

The input file for the GROMACS mdrun command is a TPR file (.tpr). For example TPR files, see https://www.mpinat.mpg.de/grubmueller/bench, where GROMACS benchmark sets are provided.
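A missing TPR file is usually noticed only after the job has waited in the queue, so it can be worth checking for it before submission. The following is a minimal sketch, assuming the run uses -deffnm input so that mdrun looks for input.tpr; the helper name check_tpr is hypothetical and not part of GROMACS.

```shell
#!/bin/bash
# check_tpr is a hypothetical helper, not a GROMACS command.
# It succeeds if <deffnm>.tpr exists and is non-empty, and fails otherwise.
check_tpr() {
    local deffnm="${1:-input}"
    if [ -s "${deffnm}.tpr" ]; then
        echo "found ${deffnm}.tpr"
    else
        echo "missing ${deffnm}.tpr" >&2
        return 1
    fi
}
```

A typical use would be `check_tpr input && sbatch submit.sh`, so that the job is submitted only when the input is in place.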

2. Job submission script

Create a script with the vi submit.sh command and specify the following details, depending on the computational resources you want to use.

2.1 Using compute node (1 node)

#!/bin/bash
#SBATCH -p compute          # partition
#SBATCH -N 1 -c 128         # number of nodes and cores per task
#SBATCH -t 5-00:00:00       # job time limit <days-hr:min:sec>
#SBATCH -A lt999999         # project account
#SBATCH -J GROMACS          # job name

##Module Load##
module restore
module load GROMACS/2022.5-GNU-11.2-CUDA-11.7

gmx mdrun -deffnm input 

The script above uses the compute partition (-p compute) and 1 node (-N 1), with 128 cores per task (-c 128) for 1 task (the default). The wall-time limit is set to 5 days (-t 5-00:00:00), which is the maximum. The account is set to lt999999 (-A lt999999); change it to your own project account. The job name is set to GROMACS (-J GROMACS).

To change the computing resources, adjust the number of cores with the -c option: full node (-c 128), half node (-c 64), or quarter node (-c 32).

2.2 Using compute nodes (>1 node)

#!/bin/bash
#SBATCH -p compute                         # partition
#SBATCH -N 4 --ntasks-per-node=64 -c 2     # number of nodes, tasks per node, and cores per task
#SBATCH -t 5-00:00:00                      # job time limit <days-hr:min:sec>
#SBATCH -A lt999999                        # project account
#SBATCH -J GROMACS                         # job name

##Module Load##
module restore
module load GROMACS/2022.5-GNU-11.2-CUDA-11.7

srun -c $SLURM_CPUS_PER_TASK gmx_mpi mdrun -deffnm input -ntomp 2

The script above uses the compute partition (-p compute) and 4 nodes (-N 4), with 2 cores per task (-c 2) and 64 tasks per node (--ntasks-per-node=64). This results in 2 x 64 = 128 cores per node and 4 x 128 = 512 cores in total. The wall-time limit is set to 5 days (-t 5-00:00:00), which is the maximum. The account is set to lt999999 (-A lt999999); change it to your own project account. The job name is set to GROMACS (-J GROMACS).

To change the computing resources, adjust the number of nodes with the -N option, e.g. 2 nodes (-N 2), 3 nodes (-N 3), or 4 nodes (-N 4), and keep the other options the same as in the template above.
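The core count of a multi-node request can be double-checked with a few lines of shell arithmetic. This is only a sketch; total_cores is a hypothetical helper name, and the numbers are those of the 2.2 template.

```shell
#!/bin/bash
# total_cores is a hypothetical helper for sanity-checking a resource request.
# Arguments: number of nodes (-N), tasks per node (--ntasks-per-node),
# and cores per task (-c).
total_cores() {
    local nodes=$1 tasks_per_node=$2 cpus_per_task=$3
    echo $(( nodes * tasks_per_node * cpus_per_task ))
}

total_cores 4 64 2   # prints 512, matching 4 x 128 in the template
```

Changing only the first argument shows how the total scales with -N, e.g. `total_cores 2 64 2` for a two-node run.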

Technical note for advanced users

One can tune GROMACS performance by adjusting the number of MPI ranks and the number of OpenMP threads per rank (-ntomp). With the thread-MPI build (gmx mdrun), the number of ranks is set with -ntmpi; with the MPI build (gmx_mpi mdrun), it is the total number of Slurm tasks (-n, or --ntasks-per-node multiplied by -N). In both cases, -ntomp should match the Slurm cores per task (-c, --cpus-per-task).
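To keep -ntomp in step with the Slurm request, the launch line can be derived from SLURM_CPUS_PER_TASK instead of hard-coding the thread count. The sketch below only prints the command; build_mdrun_cmd is a hypothetical helper, and the fallback value of 1 is an assumption for the case where the variable is not set (for example, outside a job).

```shell
#!/bin/bash
# build_mdrun_cmd is a hypothetical helper: it prints the launch line
# rather than running it, so it can be inspected before use.
build_mdrun_cmd() {
    local deffnm="${1:-input}"
    # Slurm sets SLURM_CPUS_PER_TASK when -c is given; the fallback of 1
    # is an assumption for running this outside a job.
    local ntomp="${SLURM_CPUS_PER_TASK:-1}"
    echo "srun -c ${ntomp} gmx_mpi mdrun -deffnm ${deffnm} -ntomp ${ntomp}"
}

build_mdrun_cmd input
```

With -c 2 in the job script, this prints the same command as the multi-node template; changing -c then updates both srun -c and -ntomp together.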

2.3 Using GPU node (1 card)

#!/bin/bash
#SBATCH -p gpu                          # partition
#SBATCH -N 1 --ntasks-per-node=1 -c 16  # number of nodes, tasks per node, and cores per task
#SBATCH --gpus-per-task=1               # number of GPUs per task
#SBATCH -t 5-00:00:00                   # job time limit <days-hr:min:sec>
#SBATCH -A lt999999                     # project account
#SBATCH -J GROMACS                      # job name

##Module Load##
module restore
module load GROMACS/2022.5-GNU-11.2-CUDA-11.7

gmx mdrun -deffnm input -update gpu

The script above uses the gpu partition (-p gpu) and 1 node (-N 1), with 16 cores per task (-c 16), 1 GPU card per task (--gpus-per-task=1), and 1 task per node (--ntasks-per-node=1). This results in 1 x 16 = 16 cores with 1 x 1 = 1 GPU card. The wall-time limit is set to 5 days (-t 5-00:00:00), which is the maximum. The account is set to lt999999 (-A lt999999); change it to your own project account. The job name is set to GROMACS (-J GROMACS).

2.4 Using GPU node (>1 card)

#!/bin/bash
#SBATCH -p gpu                          # partition
#SBATCH -N 1 --ntasks-per-node=4 -c 16  # number of nodes, tasks per node, and cores per task
#SBATCH --gpus-per-task=1               # number of GPUs per task
#SBATCH -t 5-00:00:00                   # job time limit <days-hr:min:sec>
#SBATCH -A lt999999                     # project account
#SBATCH -J GROMACS                      # job name

##Module Load##
module restore
module load GROMACS/2022.5-GNU-11.2-CUDA-11.7

export GMX_GPU_DD_COMMS=true
export GMX_GPU_PME_PP_COMMS=true

gmx mdrun -deffnm input -update gpu -nb gpu -bonded gpu -pme gpu -ntmpi 8 -ntomp 8 -npme 1

The script above uses the gpu partition (-p gpu) and 1 node (-N 1), with 16 cores per task (-c 16), 1 GPU card per task (--gpus-per-task=1), and 4 tasks per node (--ntasks-per-node=4). This results in 4 x 16 = 64 cores with 4 x 1 = 4 GPU cards. The wall-time limit is set to 5 days (-t 5-00:00:00), which is the maximum. The account is set to lt999999 (-A lt999999); change it to your own project account. The job name is set to GROMACS (-J GROMACS).

Note: The two environment variables GMX_GPU_DD_COMMS and GMX_GPU_PME_PP_COMMS are set to enable GPU direct communication. See "Massively Improved Multi-node NVIDIA GPU Scalability with GROMACS" on the NVIDIA Technical Blog for more detail.

To change the computing resources, set the number of tasks per node (--ntasks-per-node) to the number of GPU cards you want to use, and change -ntmpi and -ntomp so that their product matches the total number of CPU cores. The total number of CPU cores equals --ntasks-per-node multiplied by -c, e.g. 4 x 16 = 64 in this case; therefore -ntmpi is set to 8 and -ntomp is set to 8 (8 x 8 = 64).
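The rule that -ntmpi times -ntomp should equal the total number of CPU cores can be checked with a short calculation before editing the script. The following is a sketch only; ntomp_for is a hypothetical helper, and it simply divides the core count by the chosen number of thread-MPI ranks without judging whether that split performs well.

```shell
#!/bin/bash
# ntomp_for is a hypothetical helper.
# Arguments: number of GPU cards (--ntasks-per-node), cores per task (-c),
# and the chosen number of thread-MPI ranks (-ntmpi).
# Prints the -ntomp value that makes ntmpi x ntomp equal the total cores,
# or fails if the cores cannot be divided evenly.
ntomp_for() {
    local gpus=$1 cpus_per_task=$2 ntmpi=$3
    local total=$(( gpus * cpus_per_task ))
    if [ "$ntmpi" -le 0 ] || [ $(( total % ntmpi )) -ne 0 ]; then
        echo "ntmpi=${ntmpi} does not evenly divide ${total} cores" >&2
        return 1
    fi
    echo $(( total / ntmpi ))
}

ntomp_for 4 16 8   # prints 8, i.e. -ntmpi 8 -ntomp 8 for 64 cores
```

For a two-card run with the same -c 16, `ntomp_for 2 16 4` gives the thread count for four ranks on 32 cores.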

3. Job submission

Use the sbatch submit.sh command to submit the job to the queuing system.
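Submission can be wrapped in a small guard that fails early when the script is missing or when sbatch is not available on the current machine. This is a sketch under those assumptions; submit_job is a hypothetical helper, and only the standard Slurm commands sbatch and squeue are used.

```shell
#!/bin/bash
# submit_job is a hypothetical wrapper around sbatch.
# It refuses to submit if the script is missing or sbatch is unavailable.
submit_job() {
    local script="${1:-submit.sh}"
    if [ ! -f "$script" ]; then
        echo "no such script: ${script}" >&2
        return 1
    fi
    if ! command -v sbatch >/dev/null 2>&1; then
        echo "sbatch not found; run this on a login node" >&2
        return 1
    fi
    sbatch "$script"
}

# After submission, the job state can be checked with:
#   squeue -u "$USER"
```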

