Gene name normalization based on extended semantic similarity

HU Yuncui, LIN Hongfei, YANG Zhihao
DOI: https://doi.org/10.3778/j.issn.1002-8331.2011.35.036
2011-01-01
Abstract: Gene symbol descriptions in biomedical databases are often neither rich nor complete, which makes it hard to choose among candidate gene symbols for an ambiguous term. This paper presents a normalization method based on extended semantic similarity to address this problem. The method extracts extended semantic information for each gene symbol from the Gene Ontology and MEDLINE abstracts, then determines the unique identifier that expresses the actual meaning of a named entity according to the similarity between its context information and the extended semantic description. On the BioCreative II gene normalization task, the method achieves an F-measure of 81.2% (precision: 80%, recall: 82.4%). The result shows that the method based on extended semantic similarity can be applied to gene named entity normalization.
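The disambiguation step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the text representation, so the bag-of-words vectors and cosine similarity used here are assumptions, as are the function names `bag_of_words`, `cosine_similarity`, and `normalize_mention` and the placeholder identifiers `GENE_A` and `GENE_B`. In the paper's setting, each candidate description would be built from Gene Ontology terms and MEDLINE abstracts.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (assumed measure)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def normalize_mention(context, candidates):
    """Pick the candidate identifier whose extended description is most
    similar to the context of the ambiguous mention.

    `candidates` maps a hypothetical gene identifier to its extended
    semantic description (plain text in this sketch).
    """
    context_vec = bag_of_words(context)
    scores = {
        gene_id: cosine_similarity(context_vec, bag_of_words(description))
        for gene_id, description in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy, made-up descriptions for two candidates sharing one ambiguous symbol.
candidates = {
    "GENE_A": "apoptosis regulation tumor suppressor",
    "GENE_B": "glucose transport membrane",
}
best_id, scores = normalize_mention(
    "the protein regulates apoptosis in tumor cells", candidates
)
```

With these toy inputs the context shares the tokens "apoptosis" and "tumor" only with the first description, so `GENE_A` receives the higher score. A term-weighting scheme such as TF-IDF could replace the raw counts without changing the overall structure.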