Getting help with MatFlow - LightForm Wiki

Getting help with MatFlow workflows

First, make sure your packages are up to date by running this command on the CSF:

/mnt/eps01-rds/jf01-home01/shared/matflow/update_matflow.sh

If you are having problems with loading your workflow locally (e.g. in a Jupyter notebook), make sure your local packages are up to date (run this on your computer):

pip install -U matflow damask-parse formable matflow-damask matflow-formable matflow-defdap matflow-mtex matflow-neper matflow-demo-extension
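
To confirm which versions ended up installed locally, you can filter the output of `pip list` for the packages named in the command above. This is a minimal sketch; the function name `filter_matflow_packages` is an illustrative assumption and is not part of MatFlow.

```shell
# Keep only the rows of `pip list` output whose package name starts with
# one of the prefixes used in the install command above.
# The function name is hypothetical; it reads `pip list` output on stdin.
filter_matflow_packages() {
    # `|| true` keeps the exit status at 0 when no row matches.
    grep -i -E '^(matflow|damask-parse|formable)' || true
}

# Example usage:
#   pip list | filter_matflow_packages
```

If a package from the install command is missing from the filtered list, it was not installed into the Python environment that `pip` belongs to.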

If you are still having problems, post a new GitHub issue in the installation repository (UoM-CSF-Matflow), using the “workflow problem” issue template. If you are certain that the bug is in MatFlow, HPCFlow, or one of the MatFlow extension packages, please create the GitHub issue in the respective repository instead.

Suggestions for new extensions/tasks/methods

Please add a new issue to the installation repository (UoM-CSF-Matflow).

Importing large parameters

You may have issues when using import to re-use workflow parameters from an existing workflow if the parameters are larger than the available memory on the login node at submission time. To prevent this, you can first use qsub to submit a jobscript that runs matflow make:

#!/bin/bash --login

#$ -cwd
#$ -N mf_make
#$ -pe smp.pe 6   # specify whatever resources are required to access sufficient memory

export HDF5_USE_FILE_LOCKING=FALSE
export OMP_NUM_THREADS=1

matflow make workflow_file.yml

Once this has run, a new workflow directory should be generated. You can then submit the workflow via this directory with matflow go /path/to/workflow/directory.

FAQs

When I submit a workflow I get a message like “The following schemas are invalid…”; what does this mean?

This indicates that some of the task schemas cannot be used, given the extension packages that you currently have installed. This is not a problem, unless you want to use one of those tasks. If you do try to use one of those tasks in a workflow profile, you will receive a more obvious error from MatFlow.

Troubleshooting

Installation failed due to an error message: damask-parse current_version has requirement numpy>=1.17.5 but you'll have numpy other_version which is incompatible

1. Type pip list --user into the command line.
   - This should give you a list of modules installed on your user account.
2. Find the version of numpy in the list and check if it is the correct version.
3. If the version is incorrect, then type pip uninstall numpy.
   - This should prompt a Y/N answer; say yes.
4. Type pip install --user numpy==1.17.5 to reinstall the correct version of numpy.
5. Check the installation with matflow validate.
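
The version check above can also be scripted. The sketch below compares two version strings using GNU `sort -V`; the helper name `version_ge` is hypothetical, and the usage lines assume that `python` on your PATH is the interpreter MatFlow runs under.

```shell
# Succeed (exit status 0) if version $1 is greater than or equal to version $2.
# Assumes GNU sort, whose -V option orders version strings numerically.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example usage (assumes `python` is the interpreter MatFlow runs under):
#   installed=$(python -c 'import numpy; print(numpy.__version__)')
#   version_ge "$installed" 1.17.5 && echo "numpy satisfies >=1.17.5"
```

A plain string comparison would rank 1.9.0 above 1.17.5, which is why the sketch relies on version-aware sorting.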

My workflow didn’t run

1. Type matflow validate into the terminal.
   - This ensures that matflow is installed correctly.
   - If your workflow does not work, go to 2.
2. Run the following in the terminal: /mnt/eps01-rds/jf01-home01/shared/matflow/update_matflow.sh
   - This ensures that you are using the latest stable version of matflow.
   - If your workflow does not work, go to 3.
3. Check if there is an error with line numbers displayed in the CSF interface.
   - If yes, there is likely an error in the format of your YAML file; go to 4.
   - If no, go to 5.
4. Check the YAML file at https://yamlvalidator.com and make sure there are no indentation errors.
   - If your workflow still does not work, go to 5.

5. Go to ~/.matflow/ and check config.yml. It should look like this:

 task_schema_sources:
 - /mnt/eps01-rds/jf01-home01/shared/matflow/task_schemas.yml
 software_sources:
 - /mnt/eps01-rds/jf01-home01/shared/matflow/software.yml
 parallel_modes:
   MPI:
     command: mpirun -np <<num_cores>>
   OpenMP:
     env: export OMP_NUM_THREADS=<<num_cores>>

 default_preparation_run_options:
   l: short

 default_processing_run_options:
   l: short
      
 default_iterate_run_options:
   l: short

If it is correct, go to 6.

6. Look for stderr.log in the simulate_volume_element directory.
   - If you’ve found it, go to 7.
   - If there is no such directory, the workflow did not run at all; go to 8.
7. Read the error at the bottom of the log file. Comment out the relevant tasks, starting from the one at the bottom of the error message, to isolate the problem.
   - If stderr.log is all 0s, go to 8.
   - If you have tried commenting out all of the tasks, go to 8.
   - If your workflow worked after commenting out tasks, go to 10.
8. Use one of the example workflows.
   - If the example workflow is not working, repeat with a different example.
   - If you have tried all of the examples, go to 1 or contact a member of the team.
   - If the example workflow works, go to 9.
9. Compare your workflow against the working example workflow at https://text-compare.com and see what the differences are in the relevant task(s).
   - This is to isolate and fix the relevant task(s).
10. You have now identified the relevant task(s) that failed. Please refer to the corresponding troubleshooting section for that task.
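
The config.yml check and the stderr.log check above can be partly scripted. The sketch below is a rough aid rather than a MatFlow feature: the function names `check_config_keys` and `show_stderr_tails` are hypothetical, the first only confirms that the top-level keys shown in the expected config.yml are present (it does not check their values), and the second prints the last lines of every stderr.log found under a directory you pass in.

```shell
# Report which of the expected top-level keys are missing from config.yml.
# $1: path to the config file (defaults to ~/.matflow/config.yml)
check_config_keys() {
    cfg="${1:-$HOME/.matflow/config.yml}"
    if [ ! -f "$cfg" ]; then
        echo "config file not found: $cfg" >&2
        return 1
    fi
    status=0
    for key in task_schema_sources software_sources parallel_modes; do
        if ! grep -q "^[[:space:]]*$key:" "$cfg"; then
            echo "missing key: $key" >&2
            status=1
        fi
    done
    return "$status"
}

# Print the last lines of every stderr.log found under a directory.
# $1: directory to search (defaults to the current directory)
# $2: number of trailing lines to show per file (defaults to 20)
show_stderr_tails() {
    dir="${1:-.}"
    n="${2:-20}"
    logs=$(find "$dir" -type f -name stderr.log | sort)
    if [ -z "$logs" ]; then
        echo "No stderr.log found under $dir" >&2
        return 1
    fi
    printf '%s\n' "$logs" | while IFS= read -r log; do
        printf '==> %s <==\n' "$log"
        tail -n "$n" "$log"
    done
}

# Example usage:
#   check_config_keys
#   show_stderr_tails /path/to/workflow/directory
```

The error to act on is usually at the bottom of the log, so printing only the tail keeps the output short when a workflow has several task directories.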

There’s no visualisation of the results / There’s no .vtr in the simulate_volume_element_loading task directory (WIP: more errors and more thorough solutions are needed)

1. Navigate to the output directory of your simulation (the directory address should look like this: #/scratch/Your_Task_Name_and_Date/output).
2. Read t5_pro.o------- (or t4_pro.o------- / t6_pro.o-------, depending on how many tasks you have; it should be the final task).
3. Read the error message:
   - If it is something like Failed to execute the output map for output "volume_element_response". Exception was: Unable to allocate 72.0 MiB for an array with shape (1048576, 3, 3) and data type float64, go to 4.
   - If it is something else: no other errors have been encountered yet, so please report any additional errors.
4. Try lowering your modelling domain size (bearing in mind the z-dimension must be divisible by the number of cores), or lower the number of increments to be visualised (e.g. visualising every other increment rather than all of them). If the above is not an option, go to 5.
5. Open your YAML file and add the following lines into the simulate_volume_element_loading task:
```
 run_options:
   num_cores: 8
   processing:
     l: mem256
```
and try again

If it still fails in the same way then replace the above with:

 run_options:
   num_cores: 16
   processing:
     l: mem512

If that still doesn’t work, try:

 run_options:
   num_cores: 16
   processing:
     num_cores: 4
     l: mem512
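
The 72.0 MiB figure in the example error message follows from the array's shape and data type: 1048576 × 3 × 3 elements at 8 bytes each (float64). The sketch below reproduces that arithmetic so you can estimate the size of one such array for a different shape. The helper name `float64_array_mib` is hypothetical, and the result is rounded down to whole MiB.

```shell
# Size in whole MiB (rounded down) of a float64 array with the given shape.
# Each float64 element occupies 8 bytes; 1 MiB is 1048576 bytes.
float64_array_mib() {
    elems=1
    for dim in "$@"; do
        elems=$((elems * dim))
    done
    echo $((elems * 8 / 1048576))
}

float64_array_mib 1048576 3 3   # prints 72
```

A failed allocation of this size suggests that the job had already used most of its available memory, which is consistent with the remedies above: reduce the amount of data, or request resources with more memory.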

Other tasks’ troubleshooting WIP