Data Backups

TigerData
You can view the TigerData quota and usage at https://tigerdata-app.princeton.edu.
As described in Filesystems, we use the TigerData service to back up data on the Princeton Research Computing clusters. A mount point for TigerData exists on the major clusters we use. While you can copy data manually to the TigerData mount point, some best practices are outlined below.
Tiger
It is ultimately your responsibility to ensure that data is being backed up as expected. Regularly, and especially after machine downtime, you should check that your data continues to be backed up as you expect.
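One way to spot-check that backups are keeping up is to compare the newest file modification time under your source directory with the newest one under your backup directory. The sketch below is a minimal illustration, not part of the group's tooling. The function names are hypothetical, and it assumes that Syncovery preserves file timestamps on the destination and that your most recently modified files are not excluded by a mask. Treat a large lag as a prompt to look at the Syncovery GUI, not as proof of a failure.

```python
from pathlib import Path


def newest_mtime(root):
    """Return the latest modification time (seconds since the epoch) of any
    file under root, or None if root contains no files."""
    newest = None
    for path in Path(root).rglob("*"):
        if path.is_file():
            mtime = path.stat().st_mtime
            if newest is None or mtime > newest:
                newest = mtime
    return newest


def backup_lag_hours(source_root, backup_root):
    """Estimate how many hours the backup trails the source.

    Returns None if the source has no files, infinity if the backup has no
    files, and otherwise the (non-negative) difference in hours between the
    newest source file and the newest backup file.
    """
    source_newest = newest_mtime(source_root)
    backup_newest = newest_mtime(backup_root)
    if source_newest is None:
        return None
    if backup_newest is None:
        return float("inf")
    return max(0.0, (source_newest - backup_newest) / 3600.0)
```

For example, calling backup_lag_hours with /scratch/gpfs/ROSENGROUP/<NetID> as the source and /tigerdata/ROSENGROUP/work/tiger/<NetID> as the backup would report roughly how far the backup trails the source under the assumptions above. Walking a large directory tree can be slow, so point it at a subdirectory if needed.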
Setting up a Syncovery Profile
You can duplicate a pre-existing Syncovery profile by right-clicking it, changing the name, and then clicking OK. This is the recommended approach. Make sure to still update any filters!
On Tiger, we have a local Syncovery server running at /home/ROSENGROUP/software/syncovery. This service can be used to transfer data between /scratch/gpfs and /tigerdata on a scheduled basis. Everyone in the group should be making their own backup profile(s) using this Syncovery service. To do so:
1. Open a Virtual Desktop session via the web GUI on our shared login node.
2. In the virtual desktop, open Firefox.
3. Go to http://localhost:8999 in the Firefox browser.
4. Duplicate a pre-existing Syncovery profile to have a good starting point by right-clicking it, changing the name, and then clicking OK.
5. Edit your new Syncovery profile as follows:
   - Put the data source location (e.g. /scratch/gpfs/ROSENGROUP/<NetID>) as the left-hand path and the data destination location (e.g. /tigerdata/ROSENGROUP/work/tiger/<NetID>) as the right-hand path.
   - Under the "Schedule" tab, select "Schedule This Profile" and have the profile run on a regular basis (e.g. every day at 00:00:00). To reduce the burden on the tiger-arrk login node, schedule your backup at night (e.g. between 1 am and 6 am), when few people are using it, and try to stagger your start time from other backup profiles if possible.
   - The "Masks/Filters" section is worth paying attention to. This is where you can exclude certain file types or folders from being backed up. For instance, under exclusion masks you may (or may not) include files like WAVE*, CHG*, AECC*, ELF*, or other large data files if you don't need them backed up. You can also include folder names (e.g. software, trash, or nobackup), in which case no data in those folders is backed up. Please do not back up software, only important research files.
   - IMPORTANT: Under "Compress/Encrypt", be sure to select "Compress Each File Individually" in the "Zipping" tab to save space.
6. Save your profile with "OK".
7. Test your profile by clicking on it and then selecting "Run With Preview" from the top toolbar. This shows you what will be transferred and requests your approval before doing the sync, which makes it useful for troubleshooting. Run it once to make sure it works.
You can make as many profiles as you'd like, with different rules for each if you so choose.
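To get a feel for what the exclusion masks from the Masks/Filters step would skip, the patterns can be tried out with ordinary shell-style wildcard matching. The sketch below uses Python's fnmatch module as a stand-in. It is an assumption that Syncovery's own mask matching behaves the same way (case sensitivity and folder handling may differ), and the function name is hypothetical, so "Run With Preview" remains the authoritative check.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Example masks and folder names taken from the text above; adjust as needed.
EXCLUDE_FILE_MASKS = ("WAVE*", "CHG*", "AECC*", "ELF*")
EXCLUDE_FOLDERS = frozenset({"software", "trash", "nobackup"})


def is_excluded(relative_path,
                file_masks=EXCLUDE_FILE_MASKS,
                folders=EXCLUDE_FOLDERS):
    """Return True if the file would be skipped under these example rules.

    A file is skipped if any parent folder name is in `folders`, or if its
    file name matches one of the wildcard patterns in `file_masks`.
    """
    path = PurePosixPath(relative_path)
    # Check every folder component (everything except the file name).
    if any(part in folders for part in path.parts[:-1]):
        return True
    # Check the file name against each wildcard mask (case-sensitive).
    return any(fnmatchcase(path.name, mask) for mask in file_masks)
```

Under these rules, run1/WAVECAR and software/vasp/INCAR would be skipped, while run1/OUTCAR would be kept.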
Della
You can duplicate a pre-existing Syncovery profile by right-clicking it, changing the name, and then clicking OK. This is the recommended approach.
For Della, you can follow largely the same instructions as above, except that the left-hand and right-hand paths will be different. Replace the path setup in Step 5 above with the following:
For the left-hand side, select the "Internet" tab.
In the Internet Protocol Settings window that opens, set the Protocol as SSH.
Under the Settings tab, set the URL as sftp://della.princeton.edu and the User ID as asrosen (don't use your ID here). Then select the Folder that you want to sync (e.g. your /scratch/gpfs/ROSENGROUP/<NetID> directory).
Under the Security tab, set the Private Key for User Authentication as Tiger-ARRK and check the "No password needed" box.
Do the same process for the right-hand side, setting the data destination location to the appropriate location in /tigerdata (e.g. /tigerdata/ROSENGROUP/work/della/<NetID>).
Make sure to follow the rest of the steps as usual, such as defining the Masks/Filters and setting the Compress/Encrypt settings.
Admin of the Syncovery Server on Tiger
To view the logs for a given run (e.g. for debugging), go to /home/ROSENGROUP/software/syncovery/.Syncovery/Logs
For administrative purposes, see below:
To start up the service (e.g. after a cluster downtime period), run the following command. To be sure it's working, check the GUI.
SyncoveryCL start /INI="/home/ROSENGROUP/software/syncovery/Syncovery.cfg"
To stop the service, run the following command:
SyncoveryCL stop /INI="/home/ROSENGROUP/software/syncovery/Syncovery.cfg"
If you ever need to entirely re-configure the server (e.g. after Syncovery is updated or when starting fresh):
SyncoveryCL SET /WEBSERVER=localhost /INI="/home/ROSENGROUP/software/syncovery/Syncovery.cfg"
Note that these administrative actions should only be done by a single individual: if one user starts the service, another user cannot stop it.
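When debugging a run, it can help to find the most recently written files in the log directory mentioned at the start of this section. The following is a minimal sketch with a hypothetical helper name. It assumes the log files sit directly inside that directory and makes no assumption about their names or format.

```python
from pathlib import Path

# Log directory as given in this section.
LOG_DIR = "/home/ROSENGROUP/software/syncovery/.Syncovery/Logs"


def latest_logs(log_dir=LOG_DIR, count=5):
    """Return up to `count` files in log_dir, most recently modified first.

    Subdirectories are ignored; only regular files are considered.
    """
    files = [p for p in Path(log_dir).iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:count]
```

Printing the paths returned by latest_logs() gives a short list of candidates to open when checking whether a scheduled run completed.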
Cloud Storage
In addition to the above file systems, you also have access to the following cloud storage solutions (see Group Accounts):
A 25 TB group account on SharePoint to back up any research data.
You can use Syncovery to back up data to SharePoint as outlined here.
SharePoint can be accessed from the local OneDrive client if you prefer to have desktop access.
A 1 TB group account on Dropbox.
You can also request your own personal 1 TB Dropbox account if desired.
When preparing manuscripts, make sure you are working from a cloud-based backup service, ideally SharePoint given its strong integration with Microsoft Word.
Syncovery
The group has access to Syncovery (registration code in our group Dropbox) to schedule automated data transfers. Syncovery can be installed locally on your computer, and there is also a dedicated Syncovery service running on Tiger for faster transfers on that machine, as described in the Tiger section. Syncovery can transfer data between virtually any resources: your laptop, the supercomputers, and essentially all cloud resources.
Physical Storage
Group members can have a physical, portable SSD purchased by the lab. It can be used as desired for data backups. It is also quite useful if you need to do a ton of file I/O locally. For instance, if you want to unpack a zip file of 1 million files, it's better to do that on a portable SSD than on your local machine.
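As an illustration of the unpacking scenario, the sketch below extracts a zip archive directly into a folder on the SSD instead of the local disk. The helper name is hypothetical, and the target path depends on where your operating system mounts the drive, so any path you pass in is your own assumption.

```python
import zipfile
from pathlib import Path


def unpack_to(archive_path, target_dir):
    """Extract archive_path into target_dir and return the number of entries.

    target_dir is created if it does not exist, e.g. a folder on the
    mounted portable SSD.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        return len(archive.namelist())
```

Only extract archives you trust, and check that the SSD has enough free space before unpacking a very large archive.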