Getting help with MatFlow
Getting help with MatFlow workflows
First of all, make sure your packages are up to date, with this command on the CSF:
/mnt/eps01-rds/jf01-home01/shared/matflow/update_matflow.sh
If you are having problems with loading your workflow locally (e.g. in a Jupyter notebook), make sure your local packages are up to date (run this on your computer):
pip install -U matflow damask-parse formable matflow-damask matflow-formable matflow-defdap matflow-mtex matflow-neper matflow-demo-extension
If you are still having problems, post a new GitHub issue in the installation repository (UoM-CSF-Matflow), using the “workflow problem” issue template. If you are certain that the bug is in MatFlow, HPCFlow, or one of the MatFlow extension packages, please create the GitHub issue in the respective repository instead.
Suggestions for new extensions/tasks/methods
Please add a new issue to the installation repository (UoM-CSF-Matflow).
Importing large parameters
You may have issues when using import to re-use workflow parameters from an existing workflow, if the parameters are larger than the available memory on the login node at submission time. To prevent this, you can first use qsub to submit a jobscript that runs matflow make:
#!/bin/bash --login
#$ -cwd
#$ -N mf_make
#$ -pe smp.pe 6 # specify whatever resources are required to access sufficient memory
export HDF5_USE_FILE_LOCKING=FALSE
export OMP_NUM_THREADS=1
matflow make workflow_file.yml
Once this has run, a new workflow directory should be generated. You can then submit the workflow via this directory with matflow go /path/to/workflow/directory.
FAQs
When I submit a workflow I get a message like “The following schemas are invalid…”; what does this mean?
This indicates that some of the task schemas cannot be used, given the extension packages that you currently have installed. This is not a problem, unless you want to use one of those tasks. If you do try to use one of those tasks in a workflow profile, you will receive a more obvious error from MatFlow.
Troubleshooting
Installation failed due to an error message: damask-parse current_version has requirement numpy>=1.17.5 but you'll have numpy other_version which is incompatible
- Type pip list --user into the command line.
  - This should give you a list of modules installed on your user account.
- Find the version of numpy in the list and check if it is the correct version.
- If the version is incorrect, then type pip uninstall numpy
  - This should prompt a Y/N answer, say yes.
- Type pip install --user numpy==1.17.5 to reinstall the correct version of numpy.
- Check the installation with matflow validate
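The version check in the steps above can also be scripted. The following is a minimal sketch, assuming GNU sort is available (its -V option orders version strings numerically). The function name version_ge is hypothetical and is not part of MatFlow or pip.

```shell
#!/bin/bash
# Hypothetical helper (not part of MatFlow or pip):
# succeeds when version $1 is greater than or equal to version $2.
# Assumes GNU sort, whose -V option orders version strings numerically.
version_ge() {
    local installed="$1" required="$2"
    # The smaller of the two versions sorts first; if that is the required
    # version, then the installed version is at least as new.
    [ "$(printf '%s\n%s\n' "$required" "$installed" | sort -V | head -n 1)" = "$required" ]
}

# Example use (left commented out, because it needs numpy to be installed):
# installed="$(python -c 'import numpy; print(numpy.__version__)')"
# if version_ge "$installed" 1.17.5; then echo "numpy OK"; else echo "numpy too old"; fi
```

A plain string comparison would rank 1.9.0 above 1.17.5, which is why the sketch relies on version-aware sorting.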
My workflow didn’t run
1. Type matflow validate into the terminal.
   - This ensures that matflow is installed correctly.
   - If your workflow does not work, go to 2.
2. Run the following in the terminal: /mnt/eps01-rds/jf01-home01/shared/matflow/update_matflow.sh
   - This ensures that you are using the latest stable version of matflow.
   - If your workflow does not work, go to 3.
3. Check if there is an error with line numbers displayed in the CSF interface.
   - If yes, that means that there is likely an error in the format of your YAML file, go to 4.
   - If no, go to 5.
4. Check the YAML file on https://yamlvalidator.com . Make sure there are no indentation errors.
   - If your workflow still does not work, go to 5.
5. Go to ~/.matflow/ and check config.yml. It should look like this:

       task_schema_sources:
         - /mnt/eps01-rds/jf01-home01/shared/matflow/task_schemas.yml
       software_sources:
         - /mnt/eps01-rds/jf01-home01/shared/matflow/software.yml
       parallel_modes:
         MPI:
           command: mpirun -np <<num_cores>>
         OpenMP:
           env: export OMP_NUM_THREADS=<<num_cores>>
       default_preparation_run_options:
         l: short
       default_processing_run_options:
         l: short
       default_iterate_run_options:
         l: short

   - If it is correct, go to 6.
6. Look for stderr.log in the simulate_volume_element directory.
   - If you’ve found it, go to 7.
   - If there is no such directory, the workflow did not run at all, go to 8.
7. Read the error at the bottom of the log file. Comment out the relevant tasks, starting from the one at the bottom of the error message. This is to isolate the problem.
   - If stderr.log is all 0s, go to 8.
   - If you have tried commenting out all of the tasks, go to 8.
   - If your workflow worked after commenting out tasks, go to 10.
8. Use one of the example workflows.
   - If the example workflow is not working, repeat with a different example.
   - If you have tried all of the examples, go to 1 or contact a member of the team.
   - If the example workflow works, go to 9.
9. Compare your workflow against the working example workflow at https://text-compare.com and see what the differences are in the relevant task(s).
   - This is to isolate and fix the relevant task(s).
10. You have now identified the relevant task(s) that failed. Please refer to the corresponding troubleshooting section for that task.
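The config.yml check in the steps above can be partly automated. The following is a minimal sketch that only looks for the top-level keys shown in the example config.yml in this section; it does not validate their values. The function name missing_config_keys is hypothetical and is not part of MatFlow.

```shell
#!/bin/bash
# Hypothetical helper (not part of MatFlow): prints each expected top-level key
# that is missing from a config file, and fails if any are missing.
# The key names are taken from the example config.yml in this section.
missing_config_keys() {
    local config_file="$1" key status=0
    for key in task_schema_sources software_sources parallel_modes \
               default_preparation_run_options default_processing_run_options \
               default_iterate_run_options; do
        # A top-level YAML key starts at column 0 and is followed by a colon.
        if ! grep -q "^${key}:" "$config_file"; then
            echo "$key"
            status=1
        fi
    done
    return "$status"
}

# Example use (commented out, because it needs an existing config file):
# missing_config_keys ~/.matflow/config.yml || echo "config.yml is incomplete"
```

Because the check is anchored to the start of the line, a key that has been indented by mistake is reported as missing, which is also the kind of indentation error a YAML validator would flag.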
There’s no visualisation of the results / There’s no .vtr in the simulate_volume_element_loading task directory –WIP (more errors needed and more thorough solutions needed)
1. Navigate to the output directory of your simulation (the directory address should look like this: #/scratch/Your_Task_Name_and_Date/output)
2. Read t5_pro.o------- (or t4_pro.o------- / t6_pro.o------- depending on how many tasks you have; it should be the final task)
3. Read the error message:
   - If it is something like Failed to execute the output map for output "volume_element_response". Exception was: Unable to allocate 72.0 MiB for an array with shape (1048576, 3, 3) and data type float64, go to 4.
   - If it is something else (I have not encountered other errors yet, please provide additional errors)
4. Try lowering your modelling domain size (bearing in mind the z-dimension must be divisible by the number of cores), or lower the number of increments to be visualised (e.g. visualising every other increment rather than all of them). If the above is not an option, go to 5.
5. Open your YAML file, add the following lines into the simulate_volume_element_loading task, and try again:

       run_options:
         num_cores: 8
         processing:
           l: mem256

   - If it still fails in the same way then replace the above with:

         run_options:
           num_cores: 16
           processing:
             l: mem512

   - If that still doesn’t work, try:

         run_options:
           num_cores: 16
           processing:
             num_cores: 4
             l: mem512
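The numbers in the example error message can be checked by hand: an array of shape (1048576, 3, 3) holds 1048576 × 3 × 3 elements, and float64 uses 8 bytes per element, which gives 75497472 bytes, or 72 MiB. The following is a minimal sketch of that arithmetic, together with the divisibility check on the z-dimension mentioned above. Both function names are hypothetical and are not part of MatFlow.

```shell
#!/bin/bash
# Hypothetical helpers (not part of MatFlow) for sizing a simulation.

# Size in MiB of an array, given bytes per element followed by its dimensions.
# Uses integer arithmetic, so the result is rounded down.
array_mib() {
    local bytes="$1" dim
    shift
    for dim in "$@"; do
        bytes=$(( bytes * dim ))
    done
    echo $(( bytes / 1024 / 1024 ))
}

# Succeeds when the z-dimension of the grid divides evenly among the cores.
z_divisible() {
    local z_size="$1" num_cores="$2"
    [ $(( z_size % num_cores )) -eq 0 ]
}

array_mib 8 1048576 3 3   # float64 is 8 bytes per element; prints 72
```

For example, z_divisible 64 8 succeeds, while z_divisible 64 6 fails. The array size scales linearly with its first dimension, so halving that dimension would halve this particular allocation.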
Other tasks’ troubleshooting WIP