The sample locator: A federated search tool for biosamples and associated data in Europe using HL7 FHIR

Cecilia Engels, Jori Kern, Zdenka Dudová, Noemi Deppenwiese, Alexander Kiel, Björn Kroll, Tobias Kussel, Christina Schüttler, Radovan Tomášik, Michael Hummel, Martin Lablans, German Biobank Alliance (GBA) IT development team, Martin Breu, David Croft, Christoph Dolch, Petra Duhm-Harbeck, Lars Ebert, Cäcilia Engels, Christian Knell, Ann-Kristin Kock-Schoppenhauer, John Linde, Christian Maier, Michael Neumann, Matthias Öfelein, Matthias Rambow, Susanne Sahr, Florian Stampe, Deniz Tas, Hannes Ulrich, Hans-Ulrich Prokosch
DOI: https://doi.org/10.1016/j.compbiomed.2024.108941
Abstract: Background: This study outlines the development of a highly interoperable federated IT infrastructure for academic biobanks located at the major university hospital sites across Germany. High-quality biosamples linked to clinical data and stored in biobanks are essential for biomedical research. We aimed to facilitate the findability of these biosamples and their associated data. Networks of biobanks provide access to even larger pools of samples and data, including those from rare diseases and small disease subgroups. The German Biobank Alliance (GBA), established in 2017 under the umbrella of the German Biobank Node (GBN), has taken on the mission of providing a federated data discovery service to make biosamples and associated data available to researchers across Germany and Europe. Methods: In this context, we identified the requirements of researchers seeking human biosamples from biobanks and the needs of biobanks for data sovereignty over their samples and data in conjunction with the sample donor's consent. Based on this, we developed a highly interoperable federated IT infrastructure using standards such as Fast Healthcare Interoperability Resources (HL7 FHIR) and Clinical Quality Language (CQL). Results: The infrastructure comprises two major components that enable federated real-time access to biosample metadata, allowing privacy-compliant queries and subsequent project requests. It has been in production since 2019 and connects 16 German academic biobanks, with additional European biobanks joining. Over the span of one year, it ran 4941 queries on more than 900,000 biosamples collected from more than 170,000 donors. Conclusion: This infrastructure enhances the visibility and accessibility of biosamples for research, addressing the growing demand for human biosamples and associated data. It also underscores the need for improvements in processes beyond IT infrastructure, aiming to advance biomedical research and similar infrastructure development in other fields.
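The federated query pattern described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Sample` and `SiteResult` types, the `federated_search` function, and the example data are all hypothetical, and a plain Python predicate stands in for a CQL query evaluated against each biobank's FHIR store. The sketch only shows the general idea that each site evaluates the query locally and returns aggregate counts, so sample-level records never leave the biobank.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Sample:
    """Hypothetical, simplified biosample record held locally by a biobank."""
    donor_id: str
    material: str   # e.g. "serum" or "tissue"
    diagnosis: str  # e.g. an ICD-10 code


@dataclass(frozen=True)
class SiteResult:
    """Aggregate counts only; no sample-level data leaves the site."""
    samples: int
    donors: int


# Stand-in for a CQL query: a predicate applied to each local record.
Query = Callable[[Sample], bool]


def run_local_query(samples: List[Sample], query: Query) -> SiteResult:
    """Evaluate the query inside one biobank and return counts only."""
    matches = [s for s in samples if query(s)]
    return SiteResult(
        samples=len(matches),
        donors=len({s.donor_id for s in matches}),
    )


def federated_search(sites: Dict[str, List[Sample]], query: Query) -> Dict[str, SiteResult]:
    """Send the same query to every site and collect the per-site counts."""
    return {name: run_local_query(samples, query) for name, samples in sites.items()}


# Illustrative data for two hypothetical sites.
SITES = {
    "site_a": [
        Sample("d1", "serum", "C50"),
        Sample("d1", "tissue", "C50"),
        Sample("d2", "serum", "E11"),
    ],
    "site_b": [
        Sample("d3", "serum", "C50"),
    ],
}

results = federated_search(SITES, lambda s: s.diagnosis == "C50")
for site, result in results.items():
    print(site, result.samples, result.donors)
```

In this sketch the central component sees only the per-site sample and donor counts, which mirrors the data-sovereignty requirement stated in the Methods: each biobank keeps control of its records and decides how to respond to a subsequent project request.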