Interview: Data Stewardship at the Departmental Level – Insights from D-HEST
This conversation between PD Dr Jochen Klumpp (IT Services Group (ISG), D-HEST) and Dr Julian Dederke (ETH Library) is the third instalment in a series of interviews explaining the work of data stewards within the Data Stewardship Network and showcasing various data stewardship models at ETH Zurich.
Jochen Klumpp has worked at ETH Zurich since 2005. Initially a post-doctoral researcher, he later became a senior research associate in the Laboratory of Food Microbiology. Since July 2015, he has held the full-time position of deputy IT support group leader at D-HEST. Jochen’s professional background is in biology and information technology. He is also a private lecturer for food microbiology at ETH. In 2023, he joined a pilot group of data stewards within the scope of the swissuniversities data stewardship project.
The interview was conducted by Dr Julian Dederke (ETH Library), who coordinates a swissuniversities project to establish data stewardship at ETH Zurich. The promotion of data stewardship is an aim of both the National Strategy for Open Research Data and the Open Research Data programme of the ETH Domain.
Jochen, you are currently the deputy ISG leader at D-HEST and a member of the Laboratory of Food Microbiology. What exactly are your tasks?
ISG-HEST currently provides IT services and support to 70 research groups and units at D-HEST and some at D-USYS. This means that we face highly heterogeneous and changeable requirements for research-oriented IT systems. We cannot just set up a homogeneous IT ecosystem such as an administrative facility might operate. We have to be very responsive to the wishes and needs of the individual researchers. These also revolve around the way in which research data, the most important “product” of a university, are managed.
My dual role in both microbiological research and information technology has given me a good understanding of both worlds. I am responsible for advising researchers at D-HEST and D-USYS on all aspects of data management and processing, but also for maintaining our business relations with professors from both departments. We try to stay up to date with ongoing research in the groups, so we can better understand our customers’ wishes and provide them with useful solutions.
Thanks to the insights I have gained from my own teaching, I can see the challenges that arise when trying to impart good research practices to students. I think that a lot of work is needed to familiarise students with the basics of data management and the FAIR principles at an early stage.
What questions and problems do you hear most frequently from researchers?
Pretty much anything relating to the acquisition, maintenance, archiving and publication of research data. Recently, I received an inquiry from a research group trying to make their data and their analysis tool, which they have developed in-house, available to the public. They are struggling with technical and organisational obstacles.
Questions about storing and processing biomedical and clinical data are very common, too, especially the question: “Is this allowed?” Another hurdle is that researchers often need to share data with external project partners or give external users access to ETH analysis systems to facilitate effective collaboration.
I get a lot of technical questions, too. “How do I move data from system X to system Y?” “How do I send or receive large volumes of data?” Plus all sorts of things related to data life cycle management.
Your work at D-HEST also has you involved in ETH Zurich’s Data Stewardship Network as a member of a pilot group of data stewards. How does this intersect with your other tasks?
Many of my tasks at ISG revolve around data management: consultancy and implementation of solutions for laboratory management systems, electronic laboratory notebooks, data storage, maintenance and security, and the big topic of handling clinical data in research. Lots of groups work with legacy data structures, and they struggle to adapt their research data management to new regulations and requirements. ISG-HEST seeks to be the first contact point for researchers in such situations. We provide advice and, if necessary, recommend further contacts or establish our own solutions.
As a member of the Data Stewardship Network, I quickly realised that many groups at ETH face similar challenges, which has led to some very creative and efficient solutions. While it is not always possible to transfer these solutions directly from one research area to another, there are plenty of synergies one can use to improve the public availability of data.
In your view, what are the advantages of handling data stewardship tasks at the departmental level?
Clearly, the main advantage is independence and a view of the bigger picture. I don’t need to find isolated solutions for my own data. Instead, I can provide consultancy that is free from personal preferences. We advise many research groups at D-HEST and D-USYS and are well networked with the IT units at other departments and IT Services. This allows us to provide efficient, comprehensive advice and solutions. Good knowledge of the existing solutions is especially important in data management, so the individual research groups don’t need to keep reinventing the wheel.
swissuniversities ORD project on the topic of data stewardship
As part of the National ORD Strategy, swissuniversities encourages all research institutions in Switzerland to promote data stewardship and create appropriate incentives. One of the ways ETH Zurich pursues this aim is through a project running from 2023 to 2024 coordinated by the ETH Library. You can find all the information you need on data stewardship on our website.
What recurring challenges do researchers encounter in the area of research data management? What are your experiences in this context?
The first thing to mention here is the increased volume of data. High-throughput methods and increased automated data recording produce considerably more data that need to be managed, particularly in biology, medicine, chemistry and physics.
Researchers are also under increasing pressure to release the data supporting their published results and make them freely accessible to the public. In particular, they are expected to prepare and publish the raw data. This requires a reliable data management system, especially when projects run for many years. At the same time, this push for openness has to be reconciled with strict data security requirements, especially in (bio)medical research.
Do you have any general advice for ETH researchers or research groups wishing to optimise their research data management?
In the past, many researchers only had to think about data management in a structured way when they needed a data management plan, e.g. for an SNSF application. This is no longer the case. Many researchers spend a lot of time thinking about their data. And that's where my recommendation comes in: every group should draw up a formalised data management plan, preferably at the level of the whole research group. Such a plan can be very comprehensive, as outlined here, or it can simply outline the main processes and responsibilities. In either case, the way in which research data is handled needs to be clearly structured. It should be specified who can access which data, who can assign rights, who is responsible for data management, and so on. What happens to stored data when employees join or leave the group must also be clearly regulated. And last but not least, the plan also needs to define where, when and how data is published. I advise any group to use a laboratory management system and to work with electronic lab notebooks. These tools make it considerably easier to organise and retrace research, which is a great benefit for any research group.