WUR Data Steward activities for Systems and Synthetic Biology and Microbiology: an interview with Bart Nijsse and Jasper Koehorst
How do we make sure that large datasets are stored properly? How do we achieve uniformity? Who monitors our data? Data is called the new gold, and we want to take good care of something so valuable. Bart Nijsse is the guardian of the data of the Laboratory of Systems and Synthetic Biology and the Microbiology science groups. Together with Jasper Koehorst, he has developed a data storage system.
Bart Nijsse is the first data steward within the Laboratory of Systems and Synthetic Biology and Microbiology science groups. He studied Bioinformatics at the HAN University of Applied Sciences and started working at Wageningen University & Research (WUR) after his studies. A bioinformatician is used to working with big datasets, but data management was quite a new focus for him. Jasper Koehorst, his partner in this work, started as a PhD candidate at WUR and is now a postdoc. Together they have developed a data storage system.
The SSB data management framework
The Investigation, Study, Assay (ISA) framework is the standard that Bart and Jasper use as the basis for their data storage system: the Systems and Synthetic Biology (SSB) data management framework. ISA is widely used in bioinformatics as a standard for organizing datasets and their metadata. Bart and Jasper have made a basic template for it and are now working on standardizing the omics (DNA, RNA) data. These datasets are a few gigabytes each, with dozens of datasets per project, and quite a few terabytes of data have already been stored. Jasper and Bart have given ISA their own twist, for example by adding projects to it. They use an Excel file as a template that the researcher fills in; this file can be seen as an advanced data management plan. The data storage system was initially only for systems biology and microbiology, but they are developing it in such a way that other parties can copy it and set it up themselves.
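To make the layering concrete, here is a minimal sketch of how an ISA-style hierarchy with an extra project level on top could be represented. The class and field names are illustrative assumptions, not the actual schema of the SSB framework or its Excel template.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    # One measurement run, e.g. a sequencing run (hypothetical fields).
    name: str
    data_files: List[str] = field(default_factory=list)


@dataclass
class Study:
    name: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    name: str
    studies: List[Study] = field(default_factory=list)


@dataclass
class Project:
    # The extra top level mentioned in the text; plain ISA starts at Investigation.
    name: str
    investigations: List[Investigation] = field(default_factory=list)


def count_data_files(project: Project) -> int:
    """Count all data files registered anywhere below a project."""
    return sum(
        len(assay.data_files)
        for investigation in project.investigations
        for study in investigation.studies
        for assay in study.assays
    )
```

Nesting the levels this way means every data file can be traced back to the assay, study, investigation and project it belongs to.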
Bart and Jasper ensure that all the data ends up in iRODS, the data storage system that they are currently testing in collaboration with the Wageningen Data Competence Centre (WDCC) and SURFsara. They have been working on professionalizing this setup for the past year and a half.
With their SSB data management framework, Bart and Jasper are pioneers within the WUR. They work as FAIR (Findable, Accessible, Interoperable, Reusable) as possible. “We were already working FAIR before FAIR was introduced!” says Jasper.
Data sharing with third parties
Jasper works a lot with parties outside WUR who generate a lot of data in the field. To use this data, it has to be put into the special format of the SSB framework, and everyone who works with them has to do this. For example, they collaborate with Delft. Delft uses the same Excel sheet as the WUR researchers and must answer all the questions in it: What kind of research are you going to do? Who are your project supervisors? What data will you generate? In this way, data from third parties can be stored in the system.
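As an illustration of how such a sheet could be checked for completeness before data is accepted, here is a small sketch. It assumes the sheet has already been read into a dictionary; the field names are hypothetical stand-ins for the three questions above and are not taken from the real template.

```python
# Hypothetical keys standing in for the three questions in the text.
REQUIRED_FIELDS = ("research_description", "project_supervisors", "generated_data")


def missing_fields(sheet: dict) -> list:
    """Return the required fields that are absent or left blank."""
    return [
        name
        for name in REQUIRED_FIELDS
        if not str(sheet.get(name) or "").strip()
    ]
```

A sheet would count as complete when `missing_fields(sheet)` returns an empty list.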
Data Steward tasks
As mentioned earlier, Bart is the Data Steward (DS) of his science group. No fixed number of hours is set for this role, but he spends at least one day a week on DS tasks. “There is an increasing demand for support from me,” Bart says during the interview, “because all data that comes in comes to me first and then I have to process it as well as possible.” Bart has the main responsibility for the data. Employees only receive the data once they have completed the metadata Excel sheet that Bart and Jasper developed. Bart must constantly check the data, because it turns out that the data is often not downloaded properly. These checks are still done manually, but they will soon be automated and the data will be placed directly in iRODS.
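The interview does not say how these checks are carried out. One common way to detect an incomplete or corrupted download is to compare a checksum of the local file with the checksum published by the data provider. The sketch below assumes a SHA-256 checksum is available; the function names are illustrative.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so multi-gigabyte files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected_hex):
    """Return True if the file's checksum matches the expected value."""
    return sha256_of(path) == expected_hex.strip().lower()
```

A check like this can run automatically on every incoming file, so that only files with a mismatch need manual attention.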
WUR is serious about data
With this series on WUR Data Stewards, we want to show you what is happening within the WUR in the area of data stewardship. You can also read the other blog posts in our series on WUR Data Stewardship. Stay tuned: more WUR Data Steward Champions are to come!
Data Management Support