Organisation of research data
All modern research involves collecting and processing data. As equipment has grown more complex and computing power has increased, the amount and complexity of the data collected have also increased. Good, reproducible research depends on treating that data correctly. Data that is handled incorrectly can lead to inappropriate conclusions, which is not good science.
Good practices for data organisation
Keep raw data
The raw data from a measurement should always be kept. While it is possible to reproduce analysis of a raw dataset, it may not be possible to reproduce the original raw data. Keeping the original raw data is important so that others can reproduce the analysis that you have done.
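One way to check that raw data really has stayed untouched is to record a checksum for each file when it is first saved and compare against it later. The following is a minimal sketch only, not part of the group's required workflow; the function names (`sha256_of`, `build_manifest`, `changed_files`) are hypothetical and introduced purely for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(raw_dir: Path) -> dict[str, str]:
    """Map each file under raw_dir (as a relative path) to its checksum."""
    return {
        p.relative_to(raw_dir).as_posix(): sha256_of(p)
        for p in sorted(raw_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(raw_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List files from the manifest that are now missing or have different contents."""
    current = build_manifest(raw_dir)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

The manifest would be built once, straight after the measurement, and stored outside the raw data folder. An empty list from `changed_files` then suggests the raw files are still as originally recorded.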
Ensure data are backed up
Data should be backed up, preferably in more than one location. Where possible, use university networked research data storage or Dropbox, as these are much more robust than USB hard disks.
Use version control
For any text-based documents, such as papers or code, use version control tools to keep a single versioned copy. This reduces the chance of losing vital work and makes collaboration with other people easier.
Further reading:
This is a summary of the points covered in the paper “Good enough practices in scientific computing”.
Use of Dropbox to manage your research data
All LightForm students have Dropbox folders set up which are shared with their supervisor. This should be the primary location used for storage of research data and analysis. We have set up folders with specific purposes.
Writing Folder
What you need to save in the Writing Folder – any of the following:
- Experimental reports
- Literature reviews
- Transfer reports
- Manuscripts
- Any other writing: notes, snippets etc.
How writing should be saved
Each piece of writing should have its own subfolder – the subfolder should be given an appropriate descriptive title, which identifies the content – for example:
Example subfolder title: “Year 1 transfer report”
Example subfolder title: “Starting material characterisation report”
The idea is that anyone can look at the titles of the folders and make a good guess about the contents. This shared folder makes it easier to write collaboratively.
Presentations Folder
What you need to save in the Presentations Folder – any of the following:
- All presentation (PowerPoint) files
- Posters
- Progress updates (with industrial sponsor for example)
- Any other slide based files
How presentations should be saved:
Each presentation will have its own subfolder – subfolders should be given an appropriate descriptive title which includes the event title and date – for example:
Example subfolder title: “PowerPoint Industrial sponsor update 05.02.19”
Literature Folder
What you need to save in the Literature Folder – any of the following:
- Literature you have read and used in your research:
  - Publications
  - Journals
  - Papers
  - Reports
How literature should be saved:
It is a good idea to use a reference manager, for example Zotero, EndNote or Mendeley, to reference articles relevant to your project.
You can either point your reference manager to this folder so that it saves the reference database here, or export your database to this folder as a BibTeX file.
You can also store PDF files here if you wish (some articles and papers are not available online). If you do not already use a reference manager, speak to your supervisor for further guidance.
It’s important to have a good shared record of the literature you use, so that writing and collaborating with others (including your supervisor) is easy.
Experiments Folder
What you need to save in the Experiments Folder:
- All raw data
- Analysis
How Experiments should be saved:
In the Experiments Folder you will have an individual folder for each experiment.
Name each folder appropriately, describing the data type or experiment and, if relevant, the date:
Example experiment folder title: “Starting material optical metallography October 2019”
Example experiment folder title: “Synchrotron experiment December 2020”
Within each individual experiment folder, you will have two further sub-subfolders:
Sub-subfolder 1: Data
All raw data relating to the individual experiment. No data analysis should be presented or stored in the Data sub-subfolder.
This is the data that will be uploaded to the repository on publication or thesis completion. Once complete, the experimental data can be uploaded to Zenodo.
Your experimental data files should also be named sensibly. See the guidelines on the template for your experiment.
Sub-subfolder 2: Analysis
The Analysis sub-subfolder should contain the files used to analyse the data: Excel spreadsheets, Python scripts, MATLAB scripts etc.
Where possible, analysis should link directly to the data in the Data sub-subfolder. Alternatively (e.g. with spreadsheets), make a copy of the data for analysis purposes. Do not analyse data in the Data folder.
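As an illustration of linking analysis directly to the raw data, the sketch below reads a file from the Data sub-subfolder and writes its result only into the Analysis sub-subfolder. It assumes the raw data is a CSV file with a header row; the function name `summarise_column`, the file name and the column name are hypothetical examples, not part of any LightForm template.

```python
import csv
from pathlib import Path


def summarise_column(data_file: Path, column: str, output_dir: Path) -> Path:
    """Read one numeric column from a raw CSV file and write its mean to output_dir.

    The raw file is only ever opened for reading; nothing is written
    to the folder that contains it.
    """
    with data_file.open(newline="") as handle:
        values = [float(row[column]) for row in csv.DictReader(handle)]
    mean = sum(values) / len(values)

    output_dir.mkdir(parents=True, exist_ok=True)
    result_file = output_dir / f"{data_file.stem}_{column}_mean.csv"
    with result_file.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["column", "mean"])
        writer.writerow([column, mean])
    return result_file


# Hypothetical usage, with an assumed file and column name:
# experiment = Path("Experiments") / "Synchrotron experiment December 2020"
# summarise_column(experiment / "Data" / "scan_001.csv", "intensity",
#                  experiment / "Analysis")
```

Because the script refers to the raw file by path rather than working on an edited copy, the analysis can be re-run from the original data at any time.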
Subfolders within Analysis should be given an appropriate descriptive title. Include an abbreviation of the analysis title and date.
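The folder layout described above can also be created by a short script, which helps keep naming consistent between experiments. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the `dd.mm.yy` date format is borrowed from the presentation folder example rather than being a stated rule for analysis subfolders.

```python
from datetime import date
from pathlib import Path


def create_experiment_folder(experiments_root: Path, title: str) -> Path:
    """Create '<title>/Data' and '<title>/Analysis' under the Experiments folder."""
    experiment = experiments_root / title
    for name in ("Data", "Analysis"):
        (experiment / name).mkdir(parents=True, exist_ok=True)
    return experiment


def create_analysis_subfolder(experiment: Path, abbreviation: str, when: date) -> Path:
    """Create an Analysis subfolder named '<abbreviation> <dd.mm.yy>'."""
    # The date format is an assumption; adjust it to whatever your group agrees on.
    subfolder = experiment / "Analysis" / f"{abbreviation} {when.strftime('%d.%m.%y')}"
    subfolder.mkdir(parents=True, exist_ok=True)
    return subfolder
```

For example, calling `create_experiment_folder(Path("Experiments"), "Synchrotron experiment December 2020")` and then `create_analysis_subfolder(experiment, "peak fitting", date(2021, 1, 15))` on the returned folder would give an Analysis subfolder named "peak fitting 15.01.21". The abbreviation "peak fitting" is a made-up example.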
Progress Summary Folder
What you need to save in the Progress Summary Folder:
- “Progress Summary Spreadsheet” in which you detail your month by month progression
The Progress Summary Spreadsheet lets the Project Manager see progress to date at a glance, and it will be discussed at your supervision meetings. You will have only one Progress Summary sheet, which you update month by month.
Please note: You will also use the Progress Summary sheet to update the team on progress at monthly theme meetings.