Bringing PanglaoDB to 5-star Linked Open Data using Wikidata

Tiago Lubiana, João Vitor F. Cavalcante
DOI: https://doi.org/10.1101/2024.04.12.589259
2024-04-15
Abstract: PanglaoDB is a database of cell-type markers widely used for single-cell RNA sequencing data analysis. However, cell types and genes in the database are encoded as free text and lack proper identifiers. Wikidata is a freely editable knowledge graph database useful for integrating biomedical knowledge. We thus reasoned that porting PanglaoDB’s markers to the platform could improve their reusability and overall technical quality (FAIRness). We mapped 188 cell types from PanglaoDB to species-neutral terms on Wikidata and created 376 species-specific terms for cell types in and . These terms were enriched with marker information via the Wikidata property P8872, totaling over 15,000 cell type × marker associations. We explored this new subset of the graph via SPARQL queries, illustrating the discovery potential of structured, integrated knowledge. For example, we found a previously unexplored link between rosehip neurons, clozapine, and schizophrenia via the marker. Beyond the graph-based insights, we describe the reconciliation process in detail, hoping to encourage more resources to move to a 5-star Linked Open Data format.
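The kind of SPARQL exploration mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' queries: it assumes the public Wikidata Query Service endpoint, and the function names `build_marker_query` and `run_query` as well as the User-Agent string are hypothetical. The only identifier taken from the abstract is the property P8872; any cell-type QID passed in is a placeholder to be supplied by the reader.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata Query Service endpoint (assumed to be the target here).
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_marker_query(cell_type_qid: str, limit: int = 50) -> str:
    """Build a SPARQL query listing the P8872 values of one Wikidata item.

    `cell_type_qid` is a placeholder: supply the QID of the cell type of interest.
    """
    if not (cell_type_qid.startswith("Q") and cell_type_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata QID: {cell_type_qid!r}")
    return f"""
SELECT ?marker ?markerLabel WHERE {{
  wd:{cell_type_qid} wdt:P8872 ?marker .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {limit}
""".strip()


def run_query(query: str) -> list:
    """Send the query to the endpoint and return the JSON result bindings."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode(
        {"query": query, "format": "json"}
    )
    # Hypothetical User-Agent value; the service asks clients to identify themselves.
    request = urllib.request.Request(
        url, headers={"User-Agent": "panglaodb-marker-example/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]
```

Building the query separately from sending it keeps the string easy to inspect or paste into the query service's web interface before making any network request.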
Bioinformatics