Submitting Jobs

Slurm

Overview and Guides

Slurm is the job scheduling system on the Princeton HPC machines and most clusters we use. A very useful overview of Slurm can be found in the following guide by Princeton Research Computing:

Submission Scripts

Save the Slurm scripts below as submit.job and submit them with sbatch submit.job. You can check on your queued jobs using squeue -u <NetID>. To cancel a job, run scancel <JobID>. To cancel all of your jobs, run scancel -u <NetID>. If you prefer to run an interactive session, you can use salloc as described in the Princeton Research Computing KnowledgeBase article.
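For reference, a typical command-line workflow looks like the following (replace <NetID> and <JobID> with your own values; the salloc resource request is just an example, and on Tiger3 you would also include --account=rosengroup):
sbatch submit.job # submit the job script to the scheduler
squeue -u <NetID> # list your queued and running jobs
scancel <JobID> # cancel a specific job
scancel -u <NetID> # cancel all of your jobs
salloc --nodes=1 --ntasks=1 --mem=4G --time=00:30:00 # start an interactive session (example resources)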

Tiger3

Python

The following is a typical submit.job file for a serial Python calculation.
#!/bin/bash
#SBATCH --job-name=python # create a short name for your job
#SBATCH --nodes=1 # node count
#SBATCH --ntasks-per-node=1 # total number of tasks per node
#SBATCH --cpus-per-task=1 # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem=4G # memory (up to 1 TB per node)
#SBATCH --time=00:10:00 # total run time limit (HH:MM:SS)
#SBATCH --account=rosengroup

source ~/.bashrc
module purge
module load anaconda3/2024.10
conda activate cms

python job.py > job.out

VASP

To run VASP, we modify the Slurm submission script so that we load the necessary modules and run the VASP executable via the srun command. Looking for an example to run? Check out the VASP tutorials.
#!/bin/bash
#SBATCH --job-name=vasp # create a short name for your job
#SBATCH --nodes=1 # node count
#SBATCH --ntasks-per-node=112 # total number of tasks per node
#SBATCH --cpus-per-task=1 # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem=512G # memory (up to 1 TB per node)
#SBATCH --time=00:10:00 # total run time limit (HH:MM:SS)
#SBATCH --account=rosengroup

source ~/.bashrc
module purge
module load vasp/6.5.0

srun vasp_std > vasp.out # or vasp_gam for 1x1x1 kpoints

ASE w/ VASP

Running a VASP calculation via ASE works essentially the same way as running VASP directly, except that now you call a Python script and define the VASP parallelization command by setting the ASE_VASP_COMMAND environment variable, as described in the ASE documentation. Looking for an example ASE calculation to run? Refer to 🦾Using ASE to run VASP.
#!/bin/bash
#SBATCH --job-name=vasp # create a short name for your job
#SBATCH --nodes=1 # node count
#SBATCH --ntasks-per-node=112 # total number of tasks per node
#SBATCH --cpus-per-task=1 # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem=512G # memory (up to 1 TB per node)
#SBATCH --time=00:10:00 # total run time limit (HH:MM:SS)
#SBATCH --account=rosengroup

source ~/.bashrc

module purge
module load anaconda3/2024.10
module load vasp/6.5.0
conda activate cms

export ASE_VASP_COMMAND="srun vasp_std" # or "srun vasp_gam" for 1x1x1 kpts

python job.py > vasp.out

Quacc w/ VASP

The following is a submission script to run VASP with quacc. Looking for an example quacc calculation to run? Refer to 🤖Automation with Quacc.
#!/bin/bash
#SBATCH --job-name=vasp # create a short name for your job
#SBATCH --nodes=1 # node count
#SBATCH --ntasks-per-node=112 # total number of tasks per node
#SBATCH --cpus-per-task=1 # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem=512G # memory (up to 1 TB per node)
#SBATCH --time=00:10:00 # total run time limit (HH:MM:SS)
#SBATCH --account=rosengroup

source ~/.bashrc
module purge
module load anaconda3/2024.10
module load vasp/6.5.0
conda activate cms

export QUACC_VASP_PARALLEL_CMD="srun -N 1 --ntasks-per-node 112" # should typically match Slurm directives

python job.py

Della

Submitting jobs on Della works essentially the same way as on Tiger, except that you should not use the --account=rosengroup flag since we do not currently have a special account on Della. The CPU nodes on Della are also different, so you may need to modify the number of cores you request. There are also GPU nodes, which require adding --gres=gpu:1 to your submission script to request one GPU.

Python

CPU Tasks

Submitting a CPU-based serial Python job works exactly the same way as on Tiger; simply remove the --account flag. Note that the different CPU architectures on Della have different numbers of cores per node. Refer to the "Hardware Configuration" section of the Della documentation for details. You can use the --constraint flag to make sure you land on a specific type of hardware if desired (see the note after the example script below).
#!/bin/bash
#SBATCH --job-name=python # create a short name for your job
#SBATCH --nodes=1 # node count
#SBATCH --ntasks-per-node=1 # total number of tasks per node
#SBATCH --cpus-per-task=1 # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem=8G # memory per node
#SBATCH --time=00:10:00 # total run time limit (HH:MM:SS)

source ~/.bashrc
module purge
module load anaconda3/2024.10
conda activate cms

python job.py > job.out
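If you want to land on a particular CPU architecture, you can add a --constraint directive to the script above. The constraint name below is only illustrative; check the "Hardware Configuration" table in the Della documentation for the currently valid values.
#SBATCH --constraint=cascade # example only: request a specific CPU architecture (verify the name against the Della docs)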

GPU Tasks

Submitting a GPU-based Python job (e.g., for ML) requires adding the --gres flag. Note that the different GPU nodes have different CPU architectures and therefore different appropriate values for --ntasks-per-node. See the "Hardware Configuration" section of the Della documentation for details.
#!/bin/bash
#SBATCH --job-name=python # create a short name for your job
#SBATCH --gres=gpu:1 # number of GPUs per node
#SBATCH --nodes=1 # node count
#SBATCH --ntasks-per-node=20 # total number of tasks per node
#SBATCH --cpus-per-task=1 # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem=8G # memory
#SBATCH --time=00:10:00 # total run time limit (HH:MM:SS)

source ~/.bashrc
module purge
module load anaconda3/2024.10
conda activate cms

python job.py > job.out
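To confirm that the job actually has access to a GPU, you can add a quick check to the script before launching your calculation. This is just a sanity-check sketch; the second line assumes PyTorch is installed in the cms environment.
nvidia-smi # prints the GPU(s) allocated to the job; fails if none were assigned
python -c 'import torch; print(torch.cuda.is_available())' # True if PyTorch can see the GPU (assumes torch is installed)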

VASP

The GPU version of VASP can be run as follows:
#!/bin/bash
#SBATCH --job-name=vasp # create a short name for your job
#SBATCH --gres=gpu:1 # number of GPUs per node
#SBATCH --nodes=1 # node count
#SBATCH --ntasks-per-node=1 # total number of tasks per node
#SBATCH --cpus-per-task=1 # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem=32G # memory
#SBATCH --time=00:10:00 # total run time limit (HH:MM:SS)

source ~/.bashrc
module purge
module load vasp/6.5.0_gpu

srun vasp_std > vasp.out # or vasp_gam for 1x1x1 kpoints

Neuronic

Submitting jobs on Neuronic works much the same way as on the other Slurm clusters on campus. The main differences are that some of the modules may be different or absent and that you will want to specify the number of GPUs via the --gres flag.
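Since module names and versions vary between clusters, it is worth checking what is actually installed before writing your script, for example:
module avail anaconda3 # list the anaconda3 modules available on this cluster
module avail vasp # check whether a VASP module is installed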

Python

CPU Tasks

The following is a typical submission script for a CPU-based Python job.
#!/bin/bash
#SBATCH --job-name=python # create a short name for your job
#SBATCH --nodes=1 # node count
#SBATCH --ntasks-per-node=1 # total number of tasks per node
#SBATCH --cpus-per-task=1 # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem=4G # memory
#SBATCH --time=00:10:00 # total run time limit (HH:MM:SS)

source ~/.bashrc
module purge
module load anaconda3/2024.02
conda activate cms

python job.py > job.out

GPU Tasks

The following is a typical submission script for a GPU-based Python job (e.g. for ML).
#!/bin/bash
#SBATCH --job-name=python # create a short name for your job
#SBATCH --gres=gpu:1 # number of GPUs per node
#SBATCH --nodes=1 # node count
#SBATCH --ntasks-per-node=1 # total number of tasks per node
#SBATCH --cpus-per-task=1 # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem=4G # memory
#SBATCH --time=00:10:00 # total run time limit (HH:MM:SS)

source ~/.bashrc
module purge
module load anaconda3/2024.02
conda activate cms

python job.py > job.out