GROMACS (GROningen MAchine for Chemical Simulations) is a molecular dynamics simulation package.
...
updated: May 2023
...
...
Available version
Version | Module name | Thread MPI (single node or GPU) | MPI (multi-node) |
---|---|---|---|
2022.5 | GROMACS/2022.5-GNU-11.2-CUDA-11.7 | gmx mdrun | gmx_mpi mdrun |
2023.2 | GROMACS/2023.2-cpeGNU-23.09-CUDA-12.0 | gmx mdrun | gmx_mpi mdrun (see note [a]) |
Note:
[a] The MPI build of GROMACS/2023.2-cpeGNU-23.09-CUDA-12.0 does NOT have the PME GPU decomposition feature. This does not affect normal usage of GROMACS unless you have a very large system (i.e. >10 M atoms). Please see Massively Improved Multi-node NVIDIA GPU Scalability with GROMACS | NVIDIA Technical Blog for more details about the PME GPU decomposition feature.
1. Input file
The input file for the GROMACS mdrun command is a TPR file (.tpr). For example TPR files, see https://www.mpinat.mpg.de/grubmueller/bench, where GROMACS benchmark sets are provided.
2. Job submission script
Create a script with the vi submit.sh command and specify the following details depending on the computational resources you want to use.
2.1 using compute node (1 node)
Code Block |
---|
#!/bin/bash
#SBATCH -p compute              #specify partition
#SBATCH -N 1 -c 128             #specify number of nodes and cores per task
#SBATCH -t 5-00:00:00           #job time limit <day-hr:min:sec>
#SBATCH -A lt999999             #project name
#SBATCH -J GROMACS              #job name
##Module Load##
module restore
module load GROMACS/2022.5-GNU-11.2-CUDA-11.7
gmx mdrun -deffnm input
The script above uses the compute partition (-p compute) and 1 node (-N 1) with 128 cores per task (-c 128) for 1 task (the default). The wall-time limit is set to 5 days (-t 5-00:00:00), which is the maximum. The account is set to lt999999 (-A lt999999); change it to your own project account. The job name is set to GROMACS (-J GROMACS).
Info |
---|
To specify computing resource, change the number of cores at the |
2.2 using compute node (>1 node)
Code Block |
---|
#!/bin/bash
#SBATCH -p compute                        #specify partition
#SBATCH -N 4 --ntasks-per-node=64 -c 2    #specify number of nodes, tasks per node, and cores per task
#SBATCH -t 5-00:00:00                     #job time limit <day-hr:min:sec>
#SBATCH -A lt999999                       #project name
#SBATCH -J GROMACS                        #job name
##Module Load##
module restore
module load GROMACS/2022.5-GNU-11.2-CUDA-11.7
srun -c $SLURM_CPUS_PER_TASK gmx_mpi mdrun -deffnm input -ntomp $SLURM_CPUS_PER_TASK
The script above uses the compute partition (-p compute) and 4 nodes (-N 4) with 2 cores per task (-c 2) for 64 tasks per node (--ntasks-per-node=64). This results in 2 x 64 = 128 cores per node and 4 x 128 = 512 cores in total. The wall-time limit is set to 5 days (-t 5-00:00:00), which is the maximum. The account is set to lt999999 (-A lt999999); change it to your own project account. The job name is set to GROMACS (-J GROMACS).
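The core arithmetic above can be checked before submitting. The following is a minimal sketch in plain shell; total_cores is a hypothetical helper name (not part of Slurm or GROMACS) that only multiplies the values passed to -N, --ntasks-per-node, and -c.

```shell
#!/bin/bash
# total_cores NODES TASKS_PER_NODE CPUS_PER_TASK
# Hypothetical helper: prints cores per node and total cores for a Slurm request.
total_cores() {
    local nodes="$1"
    local tasks_per_node="$2"
    local cpus_per_task="$3"
    local per_node=$(( tasks_per_node * cpus_per_task ))
    echo "cores_per_node=${per_node} total_cores=$(( nodes * per_node ))"
}

# The request from the multi-node script: -N 4 --ntasks-per-node=64 -c 2
total_cores 4 64 2   # prints "cores_per_node=128 total_cores=512"
```

If the number of cores per node printed here exceeds what a compute node provides, the request would presumably be rejected or left pending, so it may be worth checking when changing the number of tasks or cores per task.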
...
Expand | ||
---|---|---|
| ||
One can tune GROMACS performance by adjusting the number of MPI ranks ( |
2.3 using GPU node (1 card)
Code Block |
---|
#!/bin/bash
#SBATCH -p gpu                            #specify partition
#SBATCH -N 1 --ntasks-per-node=1 -c 16    #specify number of nodes, tasks per node, and cores per task
#SBATCH --gpus-per-task=1                 #specify number of GPUs per task
#SBATCH -t 5-00:00:00                     #job time limit <day-hr:min:sec>
#SBATCH -A lt999999                       #project name
#SBATCH -J GROMACS                        #job name
##Module Load##
module restore
module load GROMACS/2022.5-GNU-11.2-CUDA-11.7
gmx mdrun -deffnm input -update gpu
The script above uses the gpu partition (-p gpu) and 1 node (-N 1) with 16 cores per task (-c 16) and 1 GPU card per task (--gpus-per-task=1) for 1 task per node (--ntasks-per-node=1). This results in 1 x 16 = 16 cores with 1 x 1 = 1 GPU card. The wall-time limit is set to 5 days (-t 5-00:00:00), which is the maximum. The account is set to lt999999 (-A lt999999); change it to your own project account. The job name is set to GROMACS (-J GROMACS).
2.4 using GPU node (>1 cards)
Code Block |
---|
#!/bin/bash
#SBATCH -p gpu                            #specify partition
#SBATCH -N 1 --ntasks-per-node=4 -c 16    #specify number of nodes, tasks per node, and cores per task
#SBATCH --gpus-per-task=1                 #specify number of GPUs per task
#SBATCH -t 5-00:00:00                     #job time limit <day-hr:min:sec>
#SBATCH -A lt999999                       #project name
#SBATCH -J GROMACS                        #job name
##Module Load##
module restore
module load GROMACS/2022.5-GNU-11.2-CUDA-11.7
#export GMX_GPU_DD_COMMS=true
#export GMX_GPU_PME_PP_COMMS=true
export GMX_ENABLE_DIRECT_GPU_COMM=true
gmx mdrun -deffnm input -update gpu -nb gpu -bonded gpu -pme gpu -ntmpi 8 -ntomp 8 -npme 1
The script above uses the gpu partition (-p gpu) and 1 node (-N 1) with 16 cores per task (-c 16) and 1 GPU card per task (--gpus-per-task=1) for 4 tasks per node (--ntasks-per-node=4). This results in 4 x 16 = 64 cores with 4 x 1 = 4 GPU cards. The wall-time limit is set to 5 days (-t 5-00:00:00), which is the maximum. The account is set to lt999999 (-A lt999999); change it to your own project account. The job name is set to GROMACS (-J GROMACS).
Note: The environment variable GMX_ENABLE_DIRECT_GPU_COMM is set to enable GPU direct communication; see Massively Improved Multi-node NVIDIA GPU Scalability with GROMACS | NVIDIA Technical Blog for more detail.
In GROMACS 2022 and 2023, the GMX_GPU_DD_COMMS and GMX_GPU_PME_PP_COMMS variables are removed; please use GMX_ENABLE_DIRECT_GPU_COMM instead. See Environment Variables - GROMACS 2022 documentation and Environment Variables - GROMACS 2023 documentation for details.
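In the multi-GPU script of section 2.4, the thread-MPI ranks (-ntmpi 8) times the OpenMP threads per rank (-ntomp 8) give 64 threads, which equals the 4 x 16 = 64 allocated cores. The sketch below checks that relation when the values are changed. threads_match is a hypothetical helper name, and matching threads to allocated cores is assumed here as a rule of thumb, not a requirement stated by GROMACS or Slurm.

```shell
#!/bin/bash
# threads_match TASKS CPUS_PER_TASK NTMPI NTOMP
# Hypothetical helper: compares the threads requested from mdrun (NTMPI x NTOMP)
# with the cores allocated by Slurm (TASKS x CPUS_PER_TASK).
threads_match() {
    local allocated=$(( $1 * $2 ))
    local requested=$(( $3 * $4 ))
    if [ "$requested" -eq "$allocated" ]; then
        echo "ok: ${requested} threads on ${allocated} cores"
    else
        echo "mismatch: ${requested} threads on ${allocated} cores"
        return 1
    fi
}

# Values from section 2.4: --ntasks-per-node=4 -c 16, -ntmpi 8 -ntomp 8
threads_match 4 16 8 8   # prints "ok: 64 threads on 64 cores"
```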
Info |
---|
To specify computing resource, change the number of tasks per node ( |
3. Job submission
Use the sbatch submit.sh command to submit the job to the queuing system.
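Because the scripts above pass -deffnm input, mdrun looks for input.tpr in the working directory. A small pre-submission check can catch a missing file before the job waits in the queue. This is a minimal sketch; check_tpr is a hypothetical helper name, not a GROMACS or Slurm command.

```shell
#!/bin/bash
# check_tpr DEFFNM
# Hypothetical helper: succeeds only if DEFFNM.tpr exists, i.e. the run input
# that "gmx mdrun -deffnm DEFFNM" would read.
check_tpr() {
    local deffnm="$1"
    if [ -f "${deffnm}.tpr" ]; then
        echo "found ${deffnm}.tpr"
    else
        echo "missing ${deffnm}.tpr" >&2
        return 1
    fi
}

# Possible usage on the login node (left commented out so the sketch has no side effects):
# check_tpr input && sbatch submit.sh
```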
...