Slurm is the job scheduling system on the Princeton HPC machines and most clusters we use. A very useful introduction to Slurm can be found in the following guide by Princeton Research Computing:
Write the Slurm files below as submit.job and submit them with sbatch submit.job. You can check your queued jobs with squeue -u <NetID>. To cancel a job, run scancel <JobID>; to cancel all of your jobs, run scancel -u <NetID>. If you prefer to run an interactive session, you can use salloc as described in the Princeton Research Computing KnowledgeBase article. The following is a typical submit.job file for a serial Python calculation.
#!/bin/bash
#SBATCH --job-name=python        # create a short name for your job
#SBATCH --nodes=1                # node count
#SBATCH --ntasks=1               # total number of tasks
#SBATCH --cpus-per-task=1        # cpu-cores per task
#SBATCH --mem-per-cpu=4G         # memory per cpu-core
#SBATCH --time=01:00:00          # total run time limit (HH:MM:SS)
#SBATCH --account=rosengroup     # group account to charge
source ~/.bashrc
module purge
module load anaconda3/2024.10
conda activate cms
python job.py > job.out
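Putting the commands from the first paragraph together, a typical submit/monitor/cancel cycle looks like the following (the salloc resource values are only illustrative; adjust them to your job):
sbatch submit.job                             # submit the batch job
squeue -u <NetID>                             # check your queued and running jobs
scancel <JobID>                               # cancel a single job
scancel -u <NetID>                            # cancel all of your jobs
salloc --nodes=1 --ntasks=1 --time=01:00:00   # or request an interactive session instead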
To run VASP, we modify the Slurm submission script so that we load the necessary modules and run the VASP executable via the srun
command. Looking for an example to run? Check out the VASP tutorials.
#!/bin/bash
source ~/.bashrc
module purge
module load vasp/6.5.0
srun vasp_std > vasp.out
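Note that srun launches VASP across the MPI resources requested in the #SBATCH header of the script, so the parallelization is controlled by directives along these lines (the counts here are only illustrative; match them to the node type you are using):
#SBATCH --nodes=1                # number of nodes
#SBATCH --ntasks-per-node=112    # MPI tasks per node, e.g. one per core on a 112-core node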
Running a VASP calculation via ASE works essentially the same way as running VASP directly except now you call a Python script and define the VASP parallelization flags by setting the ASE_VASP_COMMAND
environment variable as defined in the ASE documentation. Looking for an example ASE calculation to run? Refer to 🦾Using ASE to run VASP.
#!/bin/bash
source ~/.bashrc
module purge
module load anaconda3/2024.10
module load vasp/6.5.0
conda activate cms
export ASE_VASP_COMMAND="srun vasp_std"
python job.py > vasp.out
If you are instead driving VASP through quacc, the srun parallelization command is set with the QUACC_VASP_PARALLEL_CMD environment variable rather than ASE_VASP_COMMAND:
#!/bin/bash
source ~/.bashrc
module purge
module load anaconda3/2024.10
module load vasp/6.5.0
conda activate cms
export QUACC_VASP_PARALLEL_CMD="srun -N 1 --ntasks-per-node 112"
python job.py
Submitting jobs on Della works essentially the same way as on Tiger, except that you should not use the --account=rosengroup flag since we do not currently have a special account on Della. The CPU nodes on Della are also different, so you may need to modify the number of cores you request if running on a CPU. There are also GPU nodes, which require adding --gres=gpu:1 to your submission script to request one GPU.
Submitting a CPU-based serial Python job works exactly the same way as on Tiger; simply remove the --account flag. Note that different CPU architectures on Della have different numbers of cores per node. Refer to the "Hardware Configuration" section of the Della documentation for details. You can use the --constraint flag to make sure you land on a specific type of hardware if desired; an example directive is sketched after the script below.
#!/bin/bash
source ~/.bashrc
module purge
module load anaconda3/2024.10
conda activate cms
python job.py > job.out
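For example, to land on a particular CPU architecture with the --constraint flag, add a directive along these lines (the feature name cascade is only a placeholder; use a name from the Della "Hardware Configuration" table):
#SBATCH --constraint=cascade     # placeholder feature name; choose from the Della hardware table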
Submitting a GPU-based Python job (e.g. for ML) requires adding the --gres flag. Note that different GPUs have different CPU architectures and therefore different values to use for --ntasks-per-node. See the "Hardware Configuration" section of the Della documentation for details; the GPU-related directives are sketched after the script below.
#!/bin/bash
source ~/.bashrc
module purge
module load anaconda3/2024.10
conda activate cms
python job.py > job.out
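As noted above, the GPU itself is requested with the --gres flag; a minimal sketch of the relevant #SBATCH directives for a single GPU might look like this (the task count is illustrative and depends on the node type):
#SBATCH --gres=gpu:1             # request one GPU
#SBATCH --ntasks-per-node=1      # illustrative; see the Della hardware table for your GPU node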
The GPU version of VASP can be run as follows:
#!/bin/bash
source ~/.bashrc
module purge
module load vasp/6.5.0_gpu
srun vasp_std > vasp.out
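As with the GPU Python job, the script needs a GPU request in its #SBATCH header; since the GPU build of VASP runs one MPI rank per GPU, a hedged sketch of the relevant directives is:
#SBATCH --gres=gpu:1             # request one GPU
#SBATCH --ntasks-per-node=1      # the GPU build of VASP uses one MPI rank per GPU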
Submitting jobs on Neuronic works much the same way as on the other Slurm clusters on campus. The main differences are that some of the modules may be different or absent, and that you will want to specify the number of GPUs via the --gres flag.
The following is a typical submission script for a CPU-based Python job.
#!/bin/bash
source ~/.bashrc
module purge
module load anaconda3/2024.02
conda activate cms
python job.py > job.out
The following is a typical submission script for a GPU-based Python job (e.g. for ML).
#!/bin/bash
source ~/.bashrc
module purge
module load anaconda3/2024.02
conda activate cms
python job.py > job.out
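As mentioned above, the number of GPUs on Neuronic is requested with the --gres flag; for example, a single-GPU job would add:
#SBATCH --gres=gpu:1             # request one GPU (increase the count for multi-GPU jobs)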