Biodiversity data standards for the organization and dissemination of complex research projects and digital twins: a guide

Carrie Andrew,Sharif Islam,Claus Weiland,Dag Endresen
2024-05-30
Abstract:Biodiversity data are substantially increasing, spurred by technological advances and community (citizen) science initiatives. To integrate data is, likewise, becoming more commonplace. Open science promotes open sharing and data usage. Data standardization is an instrument for the organization and integration of biodiversity data, which is required for complex research projects and digital twins. However, just like with an actual instrument, there is a learning curve to understanding the data standards field. Here we provide a guide, for data providers and data users, on the logistics of compiling and utilizing biodiversity data. We emphasize data standards, because they are integral to data integration. Three primary avenues for compiling biodiversity data are compared, explaining the importance of research infrastructures for coordinated long-term data aggregation. We exemplify the Biodiversity Digital Twin (BioDT) as a case study. Four approaches to data standardization are presented in terms of the balance between practical constraints and the advancement of the data standards field. We aim for this paper to guide and raise awareness of the existing issues related to data standardization, and especially how data standards are key to data interoperability, i.e., machine accessibility. The future is promising for computational biodiversity advancements, such as with the BioDT project, but it rests upon the shoulders of machine actionability and readability, and that requires data standards for computational communication.
Other Quantitative Biology
What problem does this paper attempt to address?