Overview
The primary type of database we use to store results in our group is MongoDB. When running high-throughput calculations, using a database like MongoDB makes storing, querying, and analyzing data much easier and more reproducible.
Resources
In addition to Studio3T, which is a GUI, we use maggma as a Pythonic way to store and query results in our databases as needed.
Data Transformations
If you want to carry out complex data transformations on your database, there are two routes:
Use maggma to fetch the data, transform it into a pandas DataFrame, and then carry out your desired transformations.
Use a dataflow program, such as Mage, to construct and orchestrate the data pipeline.
For relatively simple tasks, maggma and a Jupyter Notebook will typically get the job done just fine. For more complex tasks, you may wish to consider a data pipeline program like Mage.
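
As a rough illustration of the maggma route, the sketch below queries a store and flattens the results into a pandas DataFrame. It is a minimal example under stated assumptions, not our group's canonical workflow: the database name, collection name, host, and the `formula`, `energy`, and `nsites` fields are all hypothetical placeholders, and the helper functions are invented for illustration. Only `MongoStore`, `connect()`, `query()`, and `close()` come from maggma itself.

```python
import pandas as pd


def open_store():
    """Build a maggma MongoStore. Every connection value here is a placeholder."""
    # Imported lazily so the helpers below can be used with any object
    # that exposes a maggma-style query() method.
    from maggma.stores import MongoStore

    return MongoStore(
        database="my_database",      # hypothetical database name
        collection_name="results",   # hypothetical collection name
        host="localhost",            # hypothetical host
        port=27017,
    )


def results_to_dataframe(store, criteria, properties):
    """Query a store and flatten the matching documents into a DataFrame."""
    docs = list(store.query(criteria=criteria, properties=properties))
    # json_normalize expands nested dictionaries into dotted column names.
    df = pd.json_normalize(docs)
    # Drop MongoDB's internal identifier if the query returned it.
    return df.drop(columns=["_id"], errors="ignore")


def add_energy_per_atom(df):
    """Example transformation on the hypothetical 'energy' and 'nsites' columns."""
    return df.assign(energy_per_atom=df["energy"] / df["nsites"])


def example_usage():
    """Not called automatically: requires a reachable MongoDB instance."""
    store = open_store()
    store.connect()
    try:
        df = results_to_dataframe(
            store,
            criteria={"formula": "Cu"},
            properties=["formula", "energy", "nsites"],
        )
    finally:
        store.close()
    return add_energy_per_atom(df)
```

In a Jupyter Notebook, calling `example_usage()` after editing the placeholders would return the transformed DataFrame for further analysis. Keeping the query and the transformation in separate functions makes it easier to move the same logic into a pipeline tool later if the task grows.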