HPC compatibility ¶
This software was designed to scale easily on High Performance Computing (HPC) facilities. On this page, we show how to access and work on two facilities hosted at Lawrence Berkeley National Laboratory: the Lawrencium and Cori clusters.
Supercomputer access ¶
Lawrencium cluster (LBNL) ¶
Using your credentials, you can access the Lawrencium cluster through its login node as follows:
ssh <username>@lrc-login.lbl.gov
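For frequent connections, an entry in your local ~/.ssh/config file can shorten the command. This is a minimal sketch, where the lrc host alias is our own choice:
Host lrc
    HostName lrc-login.lbl.gov
    User <username>
With this entry in place, the login reduces to:
ssh lrc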
The DAS data on Lawrencium can be found on the bear mounted drive under the /clusterfs/bear/ML_trainingDataset/ directory. In order to transfer data to or from Lawrencium, we can follow the instructions in this page and use the Data Transfer Node with Secure Copy (scp), for instance as follows:
scp -r <username>@lrc-xfer.lbl.gov:/clusterfs/bear/ML_trainingDataset/1min_ch4650_4850 .
The above command will copy 10 days' worth of 1-minute, 200-channel Distributed Acoustic Sensing data to your local directory. Data should generally be stored under the so-called SCRATCH folder; on Lawrencium, the full path to that folder is as follows:
/global/scratch/<username>
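For instance, assuming the same 1min_ch4650_4850 dataset, data can be pushed from your local machine to your scratch space through the same Data Transfer Node:
scp -r 1min_ch4650_4850 <username>@lrc-xfer.lbl.gov:/global/scratch/<username>/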
An interactive session can be requested with salloc, for instance:
salloc -N 3 -t 3:00:00 -C haswell -q interactive
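Note that the -C haswell constraint and the interactive QOS in this example follow the conventions of NERSC machines such as Cori; on Lawrencium, a partition must be specified instead. A sketch, where the partition and QOS names are assumptions to be checked against your own allocation:
salloc --partition=lr3 --qos=lr_normal -N 1 -t 1:00:00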
Cori cluster (NERSC) ¶
Software environment ¶
Load/unload modules ¶
Conda environment ¶
On some clusters, such as Lawrencium, there may not be a module available that already contains all the packages needed to run the MLDAS software. In that case, a custom Conda environment containing all the dependencies can be created once and loaded every time one wishes to use the software. The procedure is simple and consists first of loading Python:
module load python/3.6
Then, we can create and activate the custom environment as follows:
conda create -p /global/scratch/<username>/myenv python=3.6
source activate /global/scratch/<username>/myenv
Once the environment is activated, we can install all the dependencies needed by the MLDAS software. Note that the default PyTorch package will be the CUDA-enabled build, which is what we want:
conda install h5py matplotlib numpy pillow pyyaml scipy pytorch torchvision
pip install hdf5storage
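Once installed, one can quickly verify that the CUDA-enabled build was picked up; note that the check only returns True when run on a node with a GPU:
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"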
MATLAB module ¶
Some scripts are written in MATLAB, and in order to execute those scripts on the cluster, the software must be loaded. There are several versions available; we found that matlab/r2017b is stable and loads quickly:
module load matlab/r2017b
Below we show an example command line to execute a script from the terminal.
Warning: While the filename of a MATLAB script usually ends with .m, the extension should not be included when calling the script from the command line:
matlab -nodisplay -nosplash -nodesktop -r SCRIPT_convert_dsi2hdf5
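When running non-interactively, a common idiom is to append an explicit exit statement so that MATLAB terminates once the script completes, for instance:
matlab -nodisplay -nosplash -nodesktop -r "SCRIPT_convert_dsi2hdf5; exit"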
Manual Installation ¶
Octave on Lawrencium ¶
The following references may be useful when building Octave from source:
https://wiki.octave.org/wiki/index.php?title=Building
https://bugs.gentoo.org/730222
https://trac.macports.org/ticket/49824
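One option is to build Octave from source into your scratch space; the references above discuss the build process and known issues. A minimal autotools sketch, where the version number and install prefix are assumptions, and required dependencies may first need to be loaded as modules:
wget https://ftp.gnu.org/gnu/octave/octave-5.2.0.tar.gz
tar -xzf octave-5.2.0.tar.gz
cd octave-5.2.0
./configure --prefix=/global/scratch/<username>/octave
make -j 4
make install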
Account & Jobs ¶
Account manager ¶
If you do not remember which account and/or partitions your username is associated with, such information can easily be retrieved using the sacctmgr command as follows:
sacctmgr show assoc user=<username>
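The output can be narrowed down to the relevant columns with the format option, for example:
sacctmgr show assoc user=<username> format=account,partition,qos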
Details about a given partition, such as lr3, can then be displayed with sinfo:
sinfo --partition lr3
Monitoring usage ¶
On Cori, one can display the available storage space by typing myquota in the terminal.
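A comparable utility may not be available on Lawrencium; in that case, standard tools such as du can report how much space a given directory occupies:
du -sh /global/scratch/<username>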