Toward integration of glycan chemical databases: an algorithm and software tool for extracting sugars from chemical structures

Masaaki Matsubara, Evan E. Bolton, Kiyoko F. Aoki-Kinoshita, Issaku Yamada
DOI: https://doi.org/10.1007/s00216-024-05508-1
2024-08-30
Analytical and Bioanalytical Chemistry
Abstract: Integration of glycan-related databases between different research fields is essential in glycoscience. It requires knowledge across the breadth of science because most glycans exist as glycoconjugates. However, especially between chemistry and biology, glycan data has not been easy to integrate because of the huge variety of glycan structure representations. We have developed WURCS (Web 3.0 Unique Representation of Carbohydrate Structures) as a notation for representing all glycan structures uniquely, for the purpose of integrating data across scientific data resources. While the integration of glycan data in the field of biology has advanced greatly, progress in the field of chemistry has been hampered by the lack of appropriate rules for extracting sugars from chemical structures. We therefore developed a unique algorithm that determines, from the structural formulae of compounds, the range of structures that may be considered sugars, and we developed software that extracts sugars in WURCS format according to this algorithm. In this manuscript, we show that our algorithm can extract sugars from glycoconjugate molecules represented at the molecular level and can distinguish them from other biomolecules, such as amino acids, nucleic acids, and lipids. The software, MolWURCS, is freely available for download (https://gitlab.com/glycoinfo/molwurcs).
chemistry, analytical; biochemical research methods