Automated validation of genetic variants from large databases: ensuring that variant references refer to the same genomic locations

Mark Y. Tong, Christopher A. Cassa, Isaac S. Kohane
DOI: https://doi.org/10.1093/bioinformatics/btr029
IF: 5.8
2011-01-22
Bioinformatics
Abstract: SUMMARY: Accurate annotations of genomic variants are necessary to achieve full-genome clinical interpretations that are scientifically sound and medically relevant. Many disease associations, especially those reported before the completion of the Human Genome Project (HGP), are limited in applicability because of potential inconsistencies with our current standards for genomic coordinates, nomenclature and gene structure. In an effort to validate and link variants from the medical genetics literature to an unambiguous reference for each variant, we developed a software pipeline and reviewed 68 641 single amino acid mutations from Online Mendelian Inheritance in Man (OMIM), Human Gene Mutation Database (HGMD) and dbSNP. The frequency of unresolved mutation annotations varied widely among the databases, ranging from 4 to 23%. A taxonomy of primary causes for unresolved mutations was produced. AVAILABILITY: This program is freely available from the web site (http://safegene.hms.harvard.edu/aa2nt/).
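The abstract does not describe how the pipeline decides that a mutation is resolved. As a rough illustration only, one plausible check is to confirm that the reference amino acid named in a reported substitution actually occurs at the stated position in a reference protein sequence. The sketch below assumes three-letter notation such as `p.Arg3His`; the function names, the status labels and the toy sequence are hypothetical and are not taken from the paper or its software.

```python
import re
from typing import NamedTuple, Optional

# Standard three-letter to one-letter amino acid codes.
AA3_TO_1 = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


class Substitution(NamedTuple):
    ref: str        # one-letter reference residue
    position: int   # 1-based position in the protein
    alt: str        # one-letter substituted residue


# Matches e.g. "p.Arg3His" or "Arg3His" (the "p." prefix is optional).
_PATTERN = re.compile(r"^(?:p\.)?([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def parse_substitution(text: str) -> Optional[Substitution]:
    """Parse a three-letter substitution string; return None if unparseable."""
    match = _PATTERN.match(text.strip())
    if match is None:
        return None
    ref3, pos, alt3 = match.groups()
    if ref3 not in AA3_TO_1 or alt3 not in AA3_TO_1:
        return None
    return Substitution(AA3_TO_1[ref3], int(pos), AA3_TO_1[alt3])


def check_against_reference(sub: Substitution, protein: str) -> str:
    """Return a hypothetical status label for one substitution."""
    if sub.position < 1 or sub.position > len(protein):
        return "position_out_of_range"
    if protein[sub.position - 1] != sub.ref:
        return "reference_mismatch"
    return "resolved"


# Toy reference sequence (Met-Lys-Arg-His), used only for illustration.
TOY_PROTEIN = "MKRH"
```

In this sketch, a substitution whose stated reference residue disagrees with the sequence is flagged instead of being silently accepted, which is the kind of inconsistency the abstract attributes to changes in coordinates, nomenclature and gene structure.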
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology