Running Atomate2 on DeltaAI

Setup

Installation

Follow the setup instructions in 🤖NCSA DeltaAI, specifically with regard to logging in and setting up your ~/.bashrc.
You will have to install your own conda distribution locally. Download and install miniconda or similar. Once installed, create an environment named cms and install the newest versions of the Atomate2 stack:
conda create -n cms python
conda activate cms
pip install uv
uv pip install atomate2 jobflow-remote

Setting Up Atomate2

Run
pmg config --add PMG_DEFAULT_FUNCTIONAL PBE_64
Create a ~/.atomate2.yaml file with the following contents:
VASP_CMD: srun vasp_std
VASP_GAMMA_CMD: srun vasp_gam
CUSTODIAN_SCRATCH_DIR: /tmp
Now test an Atomate2 job (job.py) with the following Slurm submission script:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --gpus-per-node=1
#SBATCH --mem=90g
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --partition=ghx4
#SBATCH --time=00:30:00
#SBATCH --job-name=test
#SBATCH --account=bems-dtai-gh

source ~/.bashrc
conda activate cms
module load vasp/6.5.1_gpu

export MSGSIZE=16777216
export ITERS=400

python job.py > job.out
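The job.py script itself is not shown above; a minimal sketch, assuming a POSCAR file in the submission directory and default StaticMaker settings, could look like this:
# minimal job.py sketch: one static VASP calculation executed in place
from jobflow import run_locally
from pymatgen.core import Structure
from atomate2.vasp.jobs.core import StaticMaker

structure = Structure.from_file("POSCAR")   # any structure file readable by pymatgen
flow = StaticMaker().make(structure)        # a single static VASP job
run_locally(flow, create_folders=True)      # run immediately inside the Slurm allocation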
Note that because of CUSTODIAN_SCRATCH_DIR: /tmp in ~/.atomate2.yaml, the job will run in the /tmp directory, which is only visible on the compute node. If you want to monitor the job, run squeue -u UserName, note the node ID (ghXXX) in the NODELIST column, and then ssh ghXXX. This lets you follow the scratch_link symbolic link while the job runs.
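For example, a sketch of the monitoring steps (the node name and submission directory are placeholders, and OSZICAR is just one output file you might follow):
squeue -u $USER                # find the node, e.g. ghXXX, under NODELIST
ssh ghXXX                      # log in to the compute node running the job
cd /path/to/submission/dir     # directory the Slurm job was submitted from
ls -l scratch_link             # symlink created by Custodian, pointing into /tmp
tail -f scratch_link/OSZICAR   # follow VASP progress in the scratch directory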
In your Atomate2 Maker classes, you can pass run_vasp_kwargs = {"custodian_kwargs": {"gzipped_output": True}} to speed up gzipping by having it done in the CUSTODIAN_SCRATCH_DIR instead of the job submission directory.
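For instance, a sketch with StaticMaker (other VASP Makers should accept the same keyword argument):
from atomate2.vasp.jobs.core import StaticMaker

# gzip the outputs inside the Custodian scratch directory rather than the submission directory
static_maker = StaticMaker(
    run_vasp_kwargs={"custodian_kwargs": {"gzipped_output": True}}
)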

Jobflow-Remote Setup

Follow the instructions at 🤖JFR Setup on NCSA DeltaAI.

Trying Out Atomate2: Now With Jobflow-Remote

Now we will run the Atomate2 script again but with Jobflow-Remote.
First, make sure the Jobflow-Remote runner daemon is running in the background by executing jf runner start.
In your Atomate2 code, replace the run_locally call with submit_flow. It will look like your normal script with run_locally swapped out for submit_flow plus the name of the worker:
from jobflow_remote import submit_flow

...

submit_flow(flow, worker="basic_vasp")
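Put together, the earlier test job.py adapted for Jobflow-Remote could look like the following sketch (the worker name basic_vasp is assumed to match a worker defined in your Jobflow-Remote project configuration):
from jobflow_remote import submit_flow
from pymatgen.core import Structure
from atomate2.vasp.jobs.core import StaticMaker

structure = Structure.from_file("POSCAR")  # structure to calculate
flow = StaticMaker().make(structure)       # same flow as before
submit_flow(flow, worker="basic_vasp")     # queue in the Jobflow-Remote database instead of running locally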
Then submit the flow by running the Python script (e.g. python job.py) from the login node.
Let Jobflow-Remote do the rest and monitor things with jf job list and squeue as needed.

Running

To run calculations for real, you will want to increase the walltime set in the ~/.jfremote/cms.yaml file. The worker is currently configured to run in batch mode, so each Slurm job will keep pulling in new calculations as others finish until the walltime is reached. If you prefer to run only one calculation per Slurm job, remove the batch section from the YAML.
There are a few things that you should regularly check over the course of the campaign:
    The remaining GPU hours, which you can check with the accounts command
    The remaining storage space, which you can check with the quota command
    If using Jobflow-Remote with MongoDB Atlas, the storage space left on the MongoDB Atlas cluster and the number of operations per second, which you can view on the MongoDB Atlas website under Database > Clusters (R/W and Data Size)