Blog series: WUR shares Data on Forest Ecology and Forest Management
There is no way to ignore it: in science there is an increasing focus on sharing research data and code. The guiding principle for data sharing by both scientific institutions and research funders (such as NWO) is increasingly “as open as possible, as closed as needed”. WUR also included this principle in its strategic plan (no surprise to the regular readers of this Open Science channel).
The research data and data applications that WUR offers are made accessible via the Data Portal. Here, we showcase inspiring examples of research.
In this blog series (WUR shares Data) we talk to (former) WUR researchers whose dataset is currently in the spotlight on the portal. What is their opinion on data sharing? Do they encounter barriers? And how could we provide support?
For this blog post we interviewed Ir. J.J. Jansen about the FEM growth and yield database. Jansen is retired. His WUR career started in 1978 at the Agricultural College, and when he retired in 2008 he was working for the FEM chair group. He remained a guest member there until 2018 and retired completely in January 2020.
Many researchers collaborated on the 21 FEM datasets. The datasets can be obtained through the data archive DANS EASY.
What does the FEM database contain?
The FEM database is a collection of growth and yield data from even-aged monocultures (Douglas fir, common oak, poplar, Japanese larch, Norway spruce, Scots pine, Corsican pine, Austrian pine, red oak and several other species), even-aged mixed-species forest plots, uneven-aged natural forest, uneven-aged selection forest and roadside plantations of poplar. FEM’s Growth and Production Research started in 1923 and stopped in 2000. The aim of the research was to predict the stand development of different tree species, using so-called “yield tables”.
Some of the data, which was recorded on paper, was lost due to multiple moves. Fortunately, the data of a few species (Douglas fir, poplar and pedunculate oak) had already been partially digitized.
To make the database as complete as possible, Ir. J.J. Jansen retyped data from the field booklets, converted data from all kinds of old spreadsheet and database applications (such as dBase, Lotus 1-2-3 and Quattro Pro), read data from outdated data carriers (such as 5.25-inch floppy disks, 16-inch IBM disks and magnetic tape), edited data from the HOSP (3000 sample points) and scanned text with OCR. This kind of data recovery is very time-consuming, which is one of the reasons why we now strongly advise using sustainable data formats and durable data carriers from the start in data management.
There were no WUR guidelines for storing or managing data. It was our data, from our own chair group. We determined what happened to the data. If I received a request from someone, I just sent some information.
I was very happy when guidelines were issued. It is an advantage to be able to store your data centrally somewhere. We could also be cited in this way.
How did you share data with other researchers at the time?
Many researchers had (and still have) a large network and exchange data among themselves. If someone asked for data, they could get it. It is a matter of give and take. Often you would write a publication together.
How can future researchers work with the FEM data?
At WUR there are two doctoral candidates who use the data: Meike Bouwman, a PhD candidate at FEM, and Etienne Thomassen, a PhD student at “Bosgroep Zuid Nederland” who is also affiliated with FEM. Huicui Lu has also used some of the data in her dissertation, and students in Leuven are working on the yield tables.
Unfortunately, WUR no longer gives priority to the permanent development of the FEM collections or other long-term data research. Soon, no more money will be available for a long-running process such as 100 years of forest measurements. Projects are now more short-term, lasting 4-5 years (a shorter cycle of research – collecting data – publishing).
What do you think about the standard guideline “Share data as open as possible, as closed as needed”?
This is a must! I was very happy that the University was going to support this. Much more needs to be done with data. My preference is to publish data openly and accessibly. And where research is funded with public money, the data must be made public anyway.
Which barriers to (open) data sharing do you see, and can you indicate whether there are solutions for this?
There are always obstacles. Look at the COVID vaccine, for example. Companies want to make the final vaccine available, but not the inventions behind it. (Commercial) interests are involved, so companies keep their cards close to their chests. In collaborations with industry or in contract research, patents can be important.
That is definitely true. We wrote a blog about this. We advise current researchers to conclude a data sharing agreement before starting a research project with a commercial party. See: Data Sharing guidelines – WUR
Thank you very much for your time. It is an impressive study that is almost 100 years old!
Within WUR, the Wageningen Data Competence Center is your first entry point for questions on Data Science, Research Data Infrastructure or Data Management. Please contact us if you have questions on data or code sharing, licensing or data sharing agreements.