Job is pending with "ReqNodeNotAvail, Reserved for maintenance"

Problem

LANTA is NOT in maintenance, but one or more jobs are pending with the reason “ReqNodeNotAvail, Reserved for maintenance”.

Solution

  1. Cancel the job(s) [scancel <jobID>]

  2. Reduce walltime limit (-t) in your submission script and resubmit the job(s).

The time at which you submit the job plus its walltime limit must not exceed the start time of the maintenance. For example,

maintenance date: 12 April 2024 08:00

submitted date: 8 April 2024 17:00

walltime limit: must be less than 3-15:00:00 (3 days 15 hours); otherwise the job stays in the pending state.
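The arithmetic in this example can be checked with a small shell helper. This is only a sketch: `max_walltime` is a hypothetical function name, not a Slurm command, and it assumes GNU `date` is available and that both times are in the same time zone.

```shell
# Hypothetical helper (not part of Slurm): print the time between a planned
# start and the maintenance start as D-HH:MM:SS. Requires GNU date.
max_walltime() (
    # Convert both timestamps to seconds since the epoch.
    start=$(date -u -d "$1" +%s) || exit 1
    maint=$(date -u -d "$2" +%s) || exit 1
    diff=$(( maint - start ))
    if [ "$diff" -le 0 ]; then
        echo "no time left before the maintenance starts" >&2
        exit 1
    fi
    # Split the difference into days, hours, minutes and seconds.
    printf '%d-%02d:%02d:%02d\n' \
        $(( diff / 86400 )) $(( diff % 86400 / 3600 )) \
        $(( diff % 3600 / 60 )) $(( diff % 60 ))
)

# Dates taken from the example.
max_walltime "2024-04-08 17:00" "2024-04-12 08:00"   # prints 3-15:00:00
```

The printed value is the upper bound. The walltime limit passed to -t has to be shorter than it, and since a job rarely starts the moment it is submitted, leaving some margin is advisable.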

Jobs pending with “ReqNodeNotAvail, Reserved for maintenance” will run automatically once the node(s) become available again. No additional action is required if you do not need the job(s) to run before the maintenance period.
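If the job does need to run before the maintenance, the two steps of the solution can be combined as sketched below. The wrapper name `resubmit_shorter`, its argument order, and the script name `job.sh` are assumptions made for illustration. Instead of editing the script, the sketch passes the shorter walltime on the `sbatch` command line.

```shell
# Hypothetical wrapper around the two steps; the function name and argument
# order are illustrative only.
resubmit_shorter() {
    # $1 = ID of the pending job, $2 = submission script, $3 = new walltime
    scancel "$1" || return 1
    # --time is the long form of -t; given on the command line it takes
    # precedence over the "#SBATCH -t" line inside the script.
    sbatch --time="$3" "$2"
}

# Example call (placeholders, not real values):
#   resubmit_shorter <jobID> job.sh 3-00:00:00
```

Editing the -t line in the submission script, as described in step 2, has the same effect and keeps the shorter limit for later submissions.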

Explanation

To ensure a smooth maintenance process, no jobs may be running when the maintenance begins. Slurm therefore does not start any job whose walltime limit would extend into the maintenance period; such a job stays pending until the maintenance is over.