An upcoming series of focused episodes on RDM for Core Facilities
In recent years, it has been increasingly recognised that sharing original data enhances transparency, reproducibility, and scientific impact. This shift has prompted a growing international movement to make publicly funded research data accessible to all. At the same time, turning this goal into reality presents challenges for both researchers and infrastructure providers. In the new spotlight from Global BioImaging, we want to open a dialogue with the community on both experiences and guidelines for successfully managing data in core facilities.
Facilities often support a large number of users with diverse experimental workflows, instrument requirements, and levels of experience. As a result, consistent data organisation becomes critical to effectively navigate large and heterogeneous datasets. Streamlined workflows help reduce confusion, prevent duplication of effort, and optimise storage use—avoiding the accumulation of unnecessary or redundant data.
However, from a researcher’s perspective, the idea of sharing data often raises concerns, particularly the fear of being scooped or losing control over unpublished results. It is important to clarify that data do not need to be made publicly available the moment they are generated. Instead, the key is to ensure that data are well-organised and FAIR (i.e., Findable, Accessible, Interoperable, and Reusable) by the time of publication. This timing protects the scientific lead of the researcher while aligning with journal, funder, and institutional requirements. Well-organised data provide clear benefits not only for the research community but also for the individual researcher:
improved understanding of the data, especially when thoughtful organisation is applied early in the experimental design
greater scientific impact, as others build upon the published work
reduced delays and stress at the point of submission, as data are already well-documented and publication-ready
improved credibility and transparency, reducing the risk of results being challenged or misunderstood
increased visibility and citation rates when datasets are discoverable and reusable
Various initiatives (listed in Table 1) have aimed at establishing or promoting open science, that is, making scientific results from publicly funded research accessible. Notably, the FAIR principles (Wilkinson et al., 2016) introduced a widely applicable framework for data management and data stewardship, including data that cannot be openly shared for valid reasons. Since then, many initiatives have promoted FAIR data by developing tools and standards to facilitate adherence to these principles across diverse scientific domains, including bioimaging. FAIR data enable reuse beyond the original context, boosting impact (as reflected, for example, by higher citation rates of publications associated with shared data) and thereby fostering trust and reliability for more effective and efficient scientific research. As a result, data producers and managers are required to ensure that data are described properly and can be shared via institutional or disciplinary repositories. Research data management (RDM) has thus become an integral mission of imaging facilities. RDM is often defined as all the activities in the research process that are not data analysis or processing. Specifically, research data management refers to the organization and handling of research data throughout the research data lifecycle from data management planning and data creation through to storage, backup, retention, archiving, destruction, access, preservation, curation, dissemination (publication, sharing, reuse), documentation and description of data, including the complementary algorithms, code, software and workflows that support research data management practices.
Table 1: Initiatives promoting open science
🌍 Plan S
Region: Europe
Overview: A major European initiative promoting immediate open access to publicly funded research.
Link: https://www.coalition-s.org/why-plan-s/
🇺🇸 Nelson Memo (2022)
Region: United States
Overview: U.S. policy mandating free, immediate, and equitable access to federally funded research outputs.
Link: https://www.whitehouse.gov/wp-content/uploads/2022/08/08-2022-OSTP-Public-Access-Memo.pdf
🇨🇦 Tri-Agency Research Data Management Policy (2021)
Region: Canada
Overview: National framework guiding research data management practices across Canadian funding agencies.
Link: https://science.gc.ca/site/science/en/interagency-research-funding/policies-and-guidelines/research-data-management
🇯🇵 Integrated Innovation Strategy (2021)
Region: Japan
Overview: Japan’s national strategy emphasizing open science and data-driven innovation.
Link: https://www8.cao.go.jp/cstp/english/strategy_2021.pdf
🇳🇱 Dutch Open Science Initiative & OpenScience NL
Region: Netherlands
Overview: Coordinated national effort to advance open science practices and infrastructure.
Link: https://www.openscience.nl/
🇩🇪 National Research Data Infrastructure (NFDI)
Region: Germany
Overview: A nationwide initiative to systematically manage and make research data accessible across disciplines.
Link: https://www.nfdi.de/association
🇫🇷 National Plan for Open Science
Region: France
Overview: France’s strategy to promote open access, open data, and transparency in research.
Link: https://www.ouvrirlascience.fr/wp-content/uploads/2019/08/National-Plan-for-Open-Science_A4_20180704.pdf
🇮🇹 National Program for Open Science
Region: Italy
Overview: Italy’s roadmap for implementing open science policies and strengthening research data sharing.
Link: https://www.mur.gov.it/sites/default/files/2022-06/Piano_Nazionale_per_la_Scienza_Aperta.pdf
🌐 OECD Recommendation on Access to Research Data
Region: International
Overview: Global policy framework encouraging access to publicly funded research data.
Link: https://legalinstruments.oecd.org/api/print?ids=159&lang=en
🇬🇧 UKRI Open Research
Region: United Kingdom
Overview: UKRI’s approach to embedding open research practices across funded projects.
Link: https://www.ukri.org/manage-your-award/good-research-resource-hub/open-research/
Compared to other disciplines, bioimaging has only relatively recently joined the "big data" revolution, resulting in an exponential increase in demand for data storage and organisation strategies. Bioimage data pose unique challenges. The diversity of techniques, file formats, and metadata, together with the growing number of instruments and acquisition systems, challenges potential data managers (Linkert et al. 2010). Additionally, the size, frequency, and high dimensionality of acquisitions make even straightforward experiments true “big-data” endeavours given the high volumes, variety and velocity of data created (Poger, Yen, and Braet 2023; Ouyang & Zimmer, Imaging Tsunami 2017). Moreover, recent advances, such as automated imaging pipelines and the use of AI-driven image analysis, have increased data throughput while requiring more structured data input. Bioimaging is also increasingly combined with other data modalities, such as in spatial transcriptomics, creating further integration and metadata challenges (https://doi.org/10.1016/j.xgen.2023.100374).
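To make the scale concrete, the short Python sketch below estimates the uncompressed size of a single multidimensional acquisition. The function name `raw_size_bytes` and all acquisition parameters (image size, number of z-slices, channels, time points, bit depth) are hypothetical values chosen purely for illustration; they are not figures taken from the studies cited above.

```python
def raw_size_bytes(width, height, z_slices, channels, timepoints, bytes_per_pixel=2):
    """Return the uncompressed size in bytes of one 5D acquisition.

    bytes_per_pixel=2 assumes 16-bit pixels (an assumption for this sketch).
    """
    return width * height * z_slices * channels * timepoints * bytes_per_pixel


# Hypothetical time-lapse: 2048 x 2048 pixels, 50 z-slices,
# 3 channels, 200 time points, 16-bit pixels.
size = raw_size_bytes(2048, 2048, 50, 3, 200)
print(f"{size / 1e9:.1f} GB")  # prints "251.7 GB"
```

Under these assumed settings, one acquisition already occupies roughly a quarter of a terabyte before compression, so a facility running several such experiments per week would accumulate terabytes within a month.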
Due to these fundamental issues and the rapid developments in, for example, omics and correlative and multimodal imaging techniques, image data management has often been perceived as daunting, and as a result has been avoided, underfunded, or even considered impossible. Though challenging, image data management can, with proper preparation, be valuable to all involved. The very features that make image data difficult to work with (size, dimensionality, multi-modality) also make them valuable. The spatially correlated information is difficult to summarise or compress precisely because of its richness, and it must be looked at in context. Fast visualisation is therefore key.
Fortunately, there is now a growing collection of strategies and efforts to support the different phases of image data management, from initial capture to reuse. GBI’s Image Data Working Group is preparing a series of episodes intended to make core facility staff, with and without an IT background, aware of these advancements and to help them plan their rollout in the local setting. Though we will touch on issues at scale, the episodes are primarily intended for beginner and intermediate practitioners. Similarly, though others involved in research data management, including core IT services, may find this checklist of interest, some familiarity with bioimaging is generally assumed.
The episodes that follow will cover the top ten suggestions from the Global BioImaging Image Data Working Group, a collaborative international team of image data experts, for successfully implementing image data management. The steps build on one another and should give even novices the confidence to begin building an image data management strategy from the ground up. This community is here to help. Whether you're starting from scratch or refining an existing workflow, we hope this guide offers practical insights, community-driven wisdom, and encouragement.
Name: Image Data WG
Link: https://globalbioimaging.org/working-groups/image-data-management
Description: Homepage of the Image Data Management Working Group. Includes information on how to contact and join the community.
Name: Image.sc
Link: https://image.sc/
Description: Community forum for image analysis and research data management software.
Name: RDMkit
Link: https://rdmkit.elixir-europe.org/
Description: Research Data Management toolkit for life sciences, a strong starting point with best practices and guidelines.
Name: GBI Spotlights
Link: https://docs.google.com/forms/d/e/1FAIpQLSdMq8lrPmIGf0Djp-RM40VBKwwU8J2SLqS_oX0UcwIqrT-JDA/viewform
Description: Register to join live discussions, share your challenges and solutions, and engage with a global peer community. Webinars will also be made available on GBI’s YouTube channel.
This project has been made possible in part by a grant from the Chan Zuckerberg Initiative DAF, an advised fund of Silicon Valley Community Foundation.