GeneConnector: Unlocking the full potential of Genbank metadata

Samuel Galvao Elias, Debora Cervieri Guterres, Robert Weingart Barreto, Helson Mario Martins do Vale
DOI: https://doi.org/10.1109/tla.2024.10412034
IF: 0.967
2024-02-03
IEEE Latin America Transactions
Abstract: Genbank is one of the most significant global repositories of genetic information. Despite the volume and diversity of its data, a considerable portion of its records carry fragmented or missing metadata and so fail to provide the context in which they were acquired. We propose GeneConnector, a tool that exploits information shared among multiple Genbank records of the same specimen to improve the completeness of poorly annotated nodes across several information domains. To demonstrate the tool's capabilities, we reviewed and aggregated the available data using the Genbank database of Genera of Phytopathogenic Fungi (GOPHY). Analyzing the data shared among nodes that connect Genbank specimen records yielded information gains ranging from 2% to 60%. The approach lets users assess the context associated with results precisely and directly, supported by two metrics: one gauging the current level of data annotation and one gauging the potential information gain achievable after our evaluation.
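The idea in the abstract, filling metadata gaps from other records of the same specimen and measuring annotation before and after, can be illustrated with a minimal sketch. This is not the GeneConnector implementation: the field names (`host`, `country`, `collection_date`), the `specimen` key, the sample records, and the completeness formula are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import defaultdict

# Hypothetical metadata fields; the abstract does not list the real ones.
FIELDS = ("host", "country", "collection_date")


def group_by_specimen(records):
    """Group records that share the same (assumed) specimen identifier."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["specimen"]].append(rec)
    return groups


def completeness(records):
    """Fraction of (record, field) slots that hold a value (assumed metric)."""
    total = len(records) * len(FIELDS)
    filled = sum(1 for rec in records for f in FIELDS if rec.get(f))
    return filled / total if total else 0.0


def propagate(records):
    """Return copies of the records with gaps filled from same-specimen records."""
    enriched = []
    for group in group_by_specimen(records).values():
        # Keep the first known value of each field within the group.
        known = {}
        for rec in group:
            for f in FIELDS:
                if rec.get(f) and f not in known:
                    known[f] = rec[f]
        for rec in group:
            merged = dict(rec)
            for f in FIELDS:
                if not merged.get(f) and f in known:
                    merged[f] = known[f]
            enriched.append(merged)
    return enriched


# Made-up records: two sequences from one specimen, one from another.
records = [
    {"accession": "A1", "specimen": "S-100", "host": "Coffea arabica",
     "country": None, "collection_date": None},
    {"accession": "A2", "specimen": "S-100", "host": None,
     "country": "Brazil", "collection_date": None},
    {"accession": "A3", "specimen": "S-200", "host": None,
     "country": None, "collection_date": "2019"},
]

enriched = propagate(records)
before = completeness(records)   # current level of annotation
after = completeness(enriched)   # level after sharing within each specimen
gain = after - before            # potential information gain
print(f"before={before:.2f} after={after:.2f} gain={gain:.2f}")
```

In this toy data the two records of specimen `S-100` complete each other's `host` and `country`, while the lone record of `S-200` gains nothing. The sketch ignores conflicting values between records of the same specimen, which a real tool would need to resolve.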
engineering, electrical & electronic; computer science, information systems