This page describes how to use the Cambridge Structural Database (CSD), a repository of experimentally determined crystal structures. Note that CSD crystal structures are proprietary and should not be uploaded to public repositories.
Accessing the Web Interface
You can download CIF files from the CCDC website. You can search for your structure by name or by the DOI of the associated paper. Once you've picked a structure, download the current entry as a deposited CIF file.
ConQuest and the Python API
See the group Dropbox for the customer number and activation key needed to download the CSD Portfolio. The portfolio includes software such as ConQuest for querying the CSD, as well as the CSD Python API for more advanced, programmatic querying and analysis. If the license is out of date, contact the appropriate Princeton University library staff member for details, and update Slite accordingly.
Structure Cleanup
Many of the structures in the CSD have disorder (i.e., fractional occupancies), are missing hydrogen atoms, or have other subtleties that need to be considered before running DFT calculations on them. Please keep this in mind and tread carefully.
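As a quick first pass before any cleanup, it can help to scan a downloaded CIF for partially occupied sites and for the presence of hydrogen. The following is a minimal sketch using only the Python standard library. The function names (read_atom_site_loop, summarize_atom_sites) and the example CIF text are hypothetical and made up for illustration. It assumes a simple CIF in which each atom-site row sits on one line with no whitespace inside values, and in which missing occupancies mean full occupancy. A proper CIF parser is the safer choice for anything beyond a sanity check.

```python
import re


def read_atom_site_loop(cif_text):
    """Return the rows of the first atom-site loop as {tag: value} dicts.

    Assumes one row per line and no whitespace inside individual values.
    """
    lines = [ln.strip() for ln in cif_text.splitlines()]
    i = 0
    while i < len(lines):
        if lines[i].lower() != "loop_":
            i += 1
            continue
        # Collect the column tags that follow "loop_".
        i += 1
        tags = []
        while i < len(lines) and lines[i].startswith("_"):
            tags.append(lines[i].split()[0].lower())
            i += 1
        if "_atom_site_label" not in tags:
            continue  # some other loop (symmetry, anisotropic ADPs, ...)
        rows = []
        while i < len(lines):
            ln = lines[i]
            if (not ln or ln.startswith(("_", "#"))
                    or ln.lower().startswith(("loop_", "data_"))):
                break
            values = ln.split()
            if len(values) == len(tags):
                rows.append(dict(zip(tags, values)))
            i += 1
        return rows
    return []


def summarize_atom_sites(cif_text):
    """Flag partially occupied sites and report whether any H/D is present."""
    rows = read_atom_site_loop(cif_text)
    disordered, elements = [], set()
    for row in rows:
        label = row["_atom_site_label"]
        # Prefer the explicit type symbol; otherwise guess from the label.
        symbol = row.get("_atom_site_type_symbol", label)
        match = re.match(r"[A-Za-z]+", symbol)
        if match:
            elements.add(match.group(0).capitalize())
        # Strip a standard uncertainty such as "0.50(2)" before converting.
        raw = row.get("_atom_site_occupancy", "1").split("(")[0]
        try:
            occupancy = float(raw)
        except ValueError:
            occupancy = 1.0  # "." or "?": assumed fully occupied
        if occupancy < 1.0:
            disordered.append(label)
    return {
        "n_sites": len(rows),
        "disordered_sites": disordered,
        "has_hydrogen": bool(elements & {"H", "D"}),
    }


# Made-up example, not a real CSD entry.
EXAMPLE_CIF = """data_example
loop_
_atom_site_label
_atom_site_type_symbol
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
_atom_site_occupancy
Zn1 Zn 0.0 0.0 0.0 1
O1 O 0.1 0.2 0.3 0.50(2)
O1' O 0.15 0.2 0.3 0.5
C1 C 0.3 0.3 0.3 1.0
"""

print(summarize_atom_sites(EXAMPLE_CIF))
```

A non-empty list of disordered sites, or a structure with organic linkers but no hydrogen, is a sign that the structure needs manual attention before it goes into a DFT calculation.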
Adding H Atoms
Many crystal structures are missing H atoms because hydrogen is difficult to resolve crystallographically. To add hydrogen atoms automatically, refer to the corresponding section in 🧰Structure Visualization and Editing.
Solvent Removal
There are several tools to remove solvent from crystal structures. First, note that there are two types of solvent: free solvent and bound solvent. Free solvent consists of free-floating solvent molecules in the structure, whereas bound solvent is chemically bound to a metal center in the structure. Generally speaking, it is wise to remove the free solvent before running a DFT calculation, since it is typically fairly trivial to "activate" the material and remove free solvent experimentally. Whether you should remove the bound solvent is more open to debate, since it cannot always be removed experimentally and its removal could, in some cases, impact the stability of the structure.
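To make the free versus bound distinction concrete, here is a toolkit-independent sketch that works on a bond graph (a list of element symbols plus a list of bonded index pairs). It does not use the CSD Python API and is not the algorithm used by any particular solvent-removal tool. All names (METALS, connected_components, find_free_solvent, find_bound_solvent) are hypothetical, and the metal list is deliberately incomplete. Free solvent is taken to be any connected fragment containing no metal. Bound solvent candidates are taken to be fragments attached to exactly one metal atom, which is only a heuristic: it will also flag terminal anions, and it ignores periodic boundary conditions.

```python
from collections import defaultdict

# Illustrative, incomplete set of metal symbols; extend for your chemistry.
METALS = {"Li", "Na", "K", "Mg", "Ca", "Al", "Ti", "V", "Cr", "Mn", "Fe",
          "Co", "Ni", "Cu", "Zn", "Zr", "Mo", "Ru", "Pd", "Ag", "Cd", "Hf"}


def connected_components(n_atoms, bonds):
    """Group atom indices 0..n_atoms-1 into bonded fragments."""
    neighbors = defaultdict(set)
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, components = set(), []
    for start in range(n_atoms):
        if start in seen:
            continue
        seen.add(start)
        stack, fragment = [start], []
        while stack:
            atom = stack.pop()
            fragment.append(atom)
            for nb in neighbors[atom]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        components.append(sorted(fragment))
    return components


def find_free_solvent(elements, bonds):
    """Return atom indices in fragments that contain no metal atom."""
    free = []
    for fragment in connected_components(len(elements), bonds):
        if not any(elements[i] in METALS for i in fragment):
            free.extend(fragment)
    return sorted(free)


def find_bound_solvent(elements, bonds):
    """Return fragments attached to exactly one metal atom (heuristic)."""
    metal_idx = {i for i, el in enumerate(elements) if el in METALS}
    # Bond graph with every bond to a metal cut.
    ligand_bonds = [(a, b) for a, b in bonds
                    if a not in metal_idx and b not in metal_idx]
    # Record which metals each non-metal atom is coordinated to.
    touching = defaultdict(set)
    for a, b in bonds:
        if a in metal_idx and b not in metal_idx:
            touching[b].add(a)
        elif b in metal_idx and a not in metal_idx:
            touching[a].add(b)
    bound = []
    for fragment in connected_components(len(elements), ligand_bonds):
        if fragment[0] in metal_idx:
            continue  # metals are isolated atoms once their bonds are cut
        metals = set().union(*(touching[i] for i in fragment))
        if len(metals) == 1:
            bound.append(fragment)
    return bound


# Toy system: two Zn bridged by an O-C-O linker, one coordinated water on
# the first Zn, and one free water.
elements = ["Zn", "Zn", "O", "C", "O", "O", "H", "H", "O", "H", "H"]
bonds = [(0, 2), (2, 3), (3, 4), (4, 1),   # Zn-O-C-O-Zn bridge
         (0, 5), (5, 6), (5, 7),           # water bound to Zn 0
         (8, 9), (8, 10)]                  # free water

print("free solvent atoms:", find_free_solvent(elements, bonds))
print("bound solvent candidates:", find_bound_solvent(elements, bonds))
```

In practice the bond graph would come from a structure toolkit rather than being typed by hand, and any flagged bound solvent should be inspected before deletion, for the stability reasons noted above.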
Solvent removal is generally done using the CSD Python API. One way to do this is with SAMOSA. Another way is with MOFStructure.
If the above options do not work for you, you can find some simple scripts below using the CSD Python API:
To check the validity of a MOF crystal structure, you may consider the following tools. None are perfect, and you will have to do some manual inspection to make sure the filters are doing what you want. MOFChecker is based on simple heuristics, which you can toggle. SETC-GAT and MOFClassifier are ML-based models.
Note that while some structures in the CSD have labeled oxidation states, many do not. There are now several machine learning tools to predict oxidation states, if needed. For MOFs, for instance, there are oximachinerunner and MOSAEC.