The primary database we use to store results in our group is MongoDB. When running high-throughput calculations, a database like MongoDB makes storing, querying, and analyzing data much easier and more reproducible.
In addition to Studio3T, which is a GUI, we use maggma as a Pythonic way to store and query results in our databases as needed. If you want to carry out complex data transformations on your database, there are two routes:
1. Use maggma to fetch the data, transform it into a pandas DataFrame, and then carry out your desired transformations.
2. Use a dataflow program, such as Mage, to construct and orchestrate the data pipeline.
For relatively simple tasks, maggma and a Jupyter Notebook will typically get the job done. For more complex tasks, you may wish to consider a data pipeline program like Mage, but that is rarely necessary.
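As a rough illustration of the maggma-plus-pandas route, the sketch below queries a collection, flattens the returned documents into a DataFrame, and runs a simple aggregation. The database name (`my_db`), collection name (`results`), connection details, and the field names `formula` and `output.energy` are all hypothetical placeholders, not our group's actual settings or schema; adapt them to your own data.

```python
import pandas as pd


def fetch_docs(criteria, properties, database="my_db", collection="results"):
    """Query a MongoDB collection through maggma and return a list of dicts.

    The database/collection names and host/port are placeholders (assumptions).
    """
    # Imported lazily so the pandas helpers below work without maggma installed.
    from maggma.stores import MongoStore

    store = MongoStore(
        database=database,
        collection_name=collection,
        host="localhost",
        port=27017,
    )
    store.connect()
    try:
        # query() yields plain dictionaries, one per matching document.
        return list(store.query(criteria=criteria, properties=properties))
    finally:
        store.close()


def docs_to_frame(docs):
    """Flatten nested documents into a DataFrame and drop MongoDB's _id field."""
    df = pd.json_normalize(list(docs))
    return df.drop(columns="_id", errors="ignore")


def mean_energy_by_formula(df):
    """Example transformation: average the (hypothetical) energy per formula."""
    return df.groupby("formula")["output.energy"].mean()
```

In a notebook, this could be used along the lines of `docs = fetch_docs({"formula": "CO2"}, ["formula", "output.energy"])`, followed by `df = docs_to_frame(docs)` and `mean_energy_by_formula(df)`. Requesting only the properties you need keeps the query fast and the DataFrame small.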