Running your first MatFlow simulation
Installing MatFlow
To install MatFlow on the CSF, follow the instructions here: https://github.com/LightForm-group/UoM-CSF-matflow
Running a workflow
MatFlow workflows are written as .yaml or .yml text files. A workflow file lists the details of each task, and the tasks are performed in the order in which they appear in the file.
A good example workflow is tension_DAMASK_Al.yml, which simulates a uniaxial tensile test of aluminium using DAMASK. This example workflow is in the “workflows” folder of the UoM-CSF-matflow repository. For convenience, this repository is also synchronised to the group shared directory on the research data storage (RDS). You can either git clone the repository to your user space on the CSF or use the workflow directly from the shared drive at /mnt/eps01-rds/jf01-home01/shared/matflow/.
Computationally demanding workflows should not be run in your home directory (~/); run them in the scratch space instead. A link to the scratch space can be found in your home directory.
A MatFlow script can be run using the command matflow go script_name, where script_name is the path to the script you want to run. The workflow is run, and its results are stored, in the directory you are in when you issue this command (not the directory in which the script is stored).
Putting these steps together to run the workflow might look something like:
cd ~
git clone https://github.com/LightForm-group/UoM-CSF-matflow
cd scratch
mkdir matflow_simulations
cd matflow_simulations
matflow go ~/UoM-CSF-matflow/workflows/tension_DAMASK_Al.yml
Managing the queue
If the submission works, MatFlow should finish processing the job in a matter of seconds. This is because MatFlow only schedules the work to be done; it does not do the work directly. Jobs are added to the queue on the CSF and run sequentially. You can see the status of any queued or running jobs using the qstat command.
Once the job starts, its output goes into a directory labelled with the workflow name and the date. If you submit a job and later want to cancel it, you can use the command matflow kill /path/to/workflow/directory.
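As a rough sketch of how the queue could be watched from a script, the function below polls qstat until it prints nothing. It assumes that qstat prints no output when you have no queued or running jobs, and it waits for all of your jobs, not only those submitted by MatFlow. The name wait_for_queue and the polling interval are illustrative, not part of MatFlow.

```shell
# Hypothetical helper: block until qstat reports no jobs for the current user.
# Assumption: qstat prints nothing when the queue holds none of your jobs.
wait_for_queue() {
    interval="${1:-30}"   # seconds between checks (default 30)
    while [ -n "$(qstat 2>/dev/null)" ]; do
        sleep "$interval"
    done
}

# Example usage (commented out so that pasting the block only defines the function):
# matflow go ~/UoM-CSF-matflow/workflows/tension_DAMASK_Al.yml
# wait_for_queue 60
```

Polling once every 30 to 60 seconds is usually enough for a job that takes minutes to run.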
Looking at the output
The sample job tension_DAMASK_Al.yml should take 5 to 10 minutes to run. After it is complete, you can see the results in the generated output directory. The directory contains the following:
- A copy of the original submission script is placed in the output directory for reference to show which commands were run.
- One folder for each task in the workflow. Each folder contains the files generated as part of that task and the console output from running it.
- The output folder contains the log files for the job submission scripts.
- The main output is stored in the workflow.hdf5 file.
Understanding the output
The output of a MatFlow run is stored in the HDF5 format, a compressed hierarchical binary format. For a quick preview of the contents, the program hdfview can be used. However, for most purposes you will probably want to use a Python script to parse the results.
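If a graphical viewer is not convenient, the HDF5 command-line tools offer another way to get a quick preview. The sketch below wraps h5ls -r, which lists the groups and datasets in a file recursively. It assumes the HDF5 tools are installed and on your PATH, which may not be the case on the CSF; the function name list_hdf5 is illustrative.

```shell
# Hypothetical helper: list every group and dataset in a MatFlow result file.
# Assumption: the HDF5 command-line tools (which provide h5ls) are available.
list_hdf5() {
    file="${1:-workflow.hdf5}"
    if [ ! -f "$file" ]; then
        echo "No such file: $file" >&2
        return 1
    fi
    if ! command -v h5ls >/dev/null 2>&1; then
        echo "h5ls not found; are the HDF5 tools installed?" >&2
        return 2
    fi
    h5ls -r "$file"   # -r recurses through all groups in the file
}

# Example usage, from inside the generated output directory:
# list_hdf5 workflow.hdf5
```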
Using Dropbox synchronisation
MatFlow provides the option of synchronising the completed workflow to a Dropbox directory. To do this, you must first tell MatFlow about Dropbox by adding some lines to your MatFlow config. Instructions can be found here: https://github.com/LightForm-group/UoM-CSF-matflow/#setting-up-dropbox-archiving. The folder you give for the path: key must already exist in your Dropbox. A good idea is to create a dedicated folder in your Dropbox named something like matflow-outputs and use this.
After this you must initiate a connection between MatFlow and your Dropbox account. You can do this using the command matflow cloud-connect --provider Dropbox. MatFlow will prompt you to follow a link and authenticate the app on the Dropbox website. It is recommended that you attach MatFlow to your university Dropbox account as this is likely to have more storage space than your personal account.
After authentication, you can enable Dropbox archiving by adding the key archive: dropbox to your workflows. In the test workflow tension_DAMASK_Al.yml, the archive: dropbox key is commented out. You can activate it by removing the # symbol and the space before the archive key. If you run the simulation again, you should see that the workflow result is synchronised to your Dropbox after completion.
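The uncommenting step can also be done from the command line. The sketch below writes a copy of a workflow with the archive key activated, leaving the original file untouched. It assumes the commented line has the form "# archive: dropbox"; the function name enable_dropbox_archive and the output file name in the usage lines are illustrative.

```shell
# Hypothetical helper: copy a workflow file with "# archive: dropbox" uncommented.
# $1 is the source workflow; $2 is the destination file to write.
enable_dropbox_archive() {
    sed 's/^\([[:space:]]*\)#[[:space:]]*archive:[[:space:]]*dropbox/\1archive: dropbox/' "$1" > "$2"
}

# Example usage (commented out so that pasting the block only defines the function):
# enable_dropbox_archive ~/UoM-CSF-matflow/workflows/tension_DAMASK_Al.yml ./tension_DAMASK_Al_archive.yml
# matflow go ./tension_DAMASK_Al_archive.yml
```

Writing to a new file rather than editing in place keeps the repository copy of the workflow unchanged, so a later git pull does not conflict with local edits.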