The primary database we use to store results in our group is MongoDB. When running high-throughput calculations, a database like MongoDB makes storing, querying, and analyzing data much easier and more reproducible. Ask Andrew to make you a MongoDB database, then store the credentials he gives you. The instructions below are for Andrew.
- Log into the admin DB in Studio 3T as superman.
- Select Add Database and use the NetID of the desired user as the DB name.
- Click on the newly created DB and select Users.
- Add a user with the same NetID and a custom password, granting them the DB Owner role.
- Connect via the newly created user and DB.
- Add a New Collection so the DB is persistent.
To query the MongoDB collection from Python, you can use maggma. For instance, from the login node, compute node, or visualization node:

from maggma.stores import MongoStore

store = MongoStore(
    host="localhost",
    database="MyDBName",
    username="MyDBUserName",
    password="MyDBPassword",
    collection_name="MyDBCollectionName",
)
with store:
    print(store.count())
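If you would rather not hard-code the credentials in a script, one option is to read them from environment variables. The following is only a sketch: the helper name `mongo_store_kwargs_from_env` and the `MONGO_*` variable names are assumptions made for illustration, not a group convention.

```python
import os


def mongo_store_kwargs_from_env(prefix="MONGO_"):
    """Build keyword arguments for MongoStore from environment variables.

    The variable names (MONGO_HOST, MONGO_DATABASE, ...) are hypothetical;
    use whatever naming suits your setup.
    """
    required = ["DATABASE", "USERNAME", "PASSWORD", "COLLECTION_NAME"]
    missing = [prefix + key for key in required if prefix + key not in os.environ]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")

    return {
        # Fall back to localhost when no host is set, as in the example above
        "host": os.environ.get(prefix + "HOST", "localhost"),
        "database": os.environ[prefix + "DATABASE"],
        "username": os.environ[prefix + "USERNAME"],
        "password": os.environ[prefix + "PASSWORD"],
        "collection_name": os.environ[prefix + "COLLECTION_NAME"],
    }
```

With maggma installed, the resulting dictionary can be unpacked into the constructor, e.g. `store = MongoStore(**mongo_store_kwargs_from_env())`.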
To query the collection, use store.query({"key_name": "value"}), where store is the MongoStore object defined above. This returns a generator that yields every entry in the collection matching your query. Run the query while the store is connected (for example, inside a with store: block), then iterate over the generator with a for loop, or wrap it in list(), to extract specific values from each entry.
with store:
    results = list(store.query({"name": "relax_jobpbe"}))

all_energies = []
for result in results:
    nrg = result["output"]["output"]["energy"]
    all_energies.append(nrg)
print(all_energies)
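The loop above raises a KeyError if any matching entry lacks the nested energy field (for example, a failed or incomplete job). A slightly more defensive variant might look like the sketch below. The helper name `collect_energies` and the sample documents are made up for illustration; the documents only mimic the nesting used above and are not real results.

```python
def collect_energies(docs, path=("output", "output", "energy")):
    """Return the value found at `path` for every document that has it."""
    energies = []
    for doc in docs:
        value = doc
        for key in path:
            # Stop descending as soon as a level is missing
            if not isinstance(value, dict) or key not in value:
                value = None
                break
            value = value[key]
        if value is not None:
            energies.append(value)
    return energies


# Made-up documents standing in for what store.query() would yield
example_docs = [
    {"name": "relax_jobpbe", "output": {"output": {"energy": -10.5}}},
    {"name": "relax_jobpbe", "output": {}},  # no energy stored
    {"name": "relax_jobpbe", "output": {"output": {"energy": -11.25}}},
]
print(collect_energies(example_docs))  # prints [-10.5, -11.25]
```

Because the helper accepts any iterable of dictionaries, it can also be given the query generator directly inside a with store: block, e.g. `collect_energies(store.query({"name": "relax_jobpbe"}))`.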
Princeton Research Computing has enabled automatic backups of the MongoDB instance on the tiger-arrk login node. However, you may choose to occasionally make backups yourself. This can be done using most available GUIs.