librascal: Representations for atomic-scale learning.
maml: A Python package for materials machine learning.
Chemiscope: Graphical tool for the interactive exploration of materials and molecular databases.
Frameworks
General-Purpose
The following are useful frameworks for training machine learning models:
sklearn: The go-to standard for "conventional" (i.e. not graph-related) classification, regression, and clustering machine learning algorithms. This is the package you should start with when getting familiar with machine learning.
PyTorch: A Python library for training and running neural networks. This is the deep learning framework we use in the group.
DGL: A library for doing deep learning on graphs, which is framework-agnostic (i.e. it is interoperable with PyTorch, TensorFlow, or MXNet). This is an excellent resource if you are building new deep learning models.
fastai: A high-level wrapper around PyTorch that simplifies the deep learning workflow for several application areas.
PySR: Symbolic regression, for when having an equation is desirable.
e3nn: Python package for constructing Euclidean neural networks.
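For anyone starting with sklearn, as recommended above, a first regression model might look like the following minimal sketch. The data here is synthetic and purely illustrative (a hypothetical stand-in for a featurized dataset), and the choice of a random forest is an assumption made for the example, not a recommendation from this list.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic, hypothetical data: 200 samples with 3 features each.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
# Target depends on the first two features, plus a little noise.
y = 2.0 * X[:, 0] - X[:, 1] ** 2 + 0.05 * rng.normal(size=200)

# Hold out 25% of the samples for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit a random forest regressor on the training split.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Evaluate on the held-out split.
y_pred = model.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
print(f"Test MAE: {mae:.3f}")
```

The same fit/predict pattern applies to most sklearn estimators, so swapping in a different model usually only changes the line that constructs it.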
Interatomic Potentials
Toolkits
Autoplex: Automated pipeline to train machine learned interatomic potentials.
DeePMD-kit: Package to train deep learning interatomic potentials for MD.
Psiflow: An end-to-end framework for developing interatomic potentials, built around Parsl.
NequIP: A library for building E(3)-equivariant interatomic potentials.
FLARE: An open-source Python package for creating fast and accurate interatomic potentials.
Pre-Trained
Refer to Matbench Discovery for an up-to-date ranking of many pre-trained interatomic potentials.
matgl: Materials Graph Library, which provides the M3GNet and MEGNet models.
Allegro: A library for building equivariant interatomic potentials that is an extension of NequIP.
AIRS: Several deep learning model architectures for the chemistry space.
The field of machine learning for materials chemistry moves fast, and some of these packages may become outdated quickly. Feel free to remove entries and add new ones as you see fit. If a package is no longer actively maintained, it should likely be removed from this list.
Miscellaneous Tools
The following are miscellaneous machine learning packages that aren't necessarily domain-specific:
Marvin: Build natural language processing interfaces in practical applications.
Colmena: Library for steering campaigns of simulations on supercomputers.
meerkat: A Python package to more easily view/annotate unstructured data.
SegmentAnything: A model to perform segmentation analysis on images.