Oftentimes you may find yourself running Python code with expensive function calls that you would like to parallelize. For instance, perhaps you have a large dataframe and are running an expensive operation on each row. Rather than iterating over the rows serially, it is better to distribute this work in parallel over multiple compute cores.
For this kind of task parallelization, we currently recommend 🐸Dask, although 🌀Parsl works well too. Dask is a popular Python library (popular enough that large language models can provide useful assistance with it), while Parsl is more customizable for use on compute clusters. In both cases, you can parallelize tasks over multiple cores, whether in a Slurm allocation or on your local machine.