BSQA: Integrated Text Mining Using Entity Relation Semantics Extracted from Biological Literature of Insects

Xin He, Yanen Li, Radhika Khetani, Barry Sanders, Yue Lu, Xu Ling, ChengXiang Zhai, Bruce Schatz
DOI: https://doi.org/10.1093/nar/gkq544
IF: 14.9
2010-01-01
Nucleic Acids Research
Abstract: Text mining is one promising way of extracting information automatically from the vast biological literature. To maximize its potential, the knowledge encoded in the text should be translated to some semantic representation, such as entities and relations, that can be analyzed by machines. However, large-scale practical systems for this purpose are rare. We present the BeeSpace question/answering (BSQA) system, which performs integrated text mining for insect biology, covering diverse aspects from molecular interactions of genes to insect behavior. BSQA recognizes a number of entities and relations in Medline documents about the model insect Drosophila melanogaster. For any text query, BSQA exploits entity annotation of retrieved documents to identify important concepts in different categories. By utilizing the extracted relations, BSQA is also able to answer many biologically motivated questions, from simple ones, such as which anatomical part a gene is expressed in, to more complex ones involving multiple types of relations. BSQA is freely available at http://www.beespace.uiuc.edu/QuestionAnswer.
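The idea of answering questions from extracted relations can be illustrated with a minimal sketch. This is not BSQA's implementation: the triple format, the relation names (`expressed_in`, `interacts_with`, `associated_with`), the function names, and the placeholder entities (`geneA`, `geneB`) are all assumptions made for illustration, and the data is invented toy data rather than anything extracted from Medline. The sketch shows a simple single-relation lookup and a question that chains multiple relation types.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples. Placeholder names only;
# these are NOT real BSQA output or real biological facts.
TRIPLES = [
    ("geneA", "expressed_in", "wing"),
    ("geneA", "expressed_in", "antenna"),
    ("geneB", "expressed_in", "wing"),
    ("geneA", "interacts_with", "geneB"),
    ("geneB", "associated_with", "courtship"),
]


def build_index(triples):
    """Map each (subject, relation) pair to the set of objects seen for it."""
    index = defaultdict(set)
    for subj, rel, obj in triples:
        index[(subj, rel)].add(obj)
    return index


def answer(index, subject, relation):
    """Simple question: which objects stand in `relation` to `subject`?"""
    return sorted(index.get((subject, relation), set()))


def answer_chain(index, subject, relations):
    """Complex question: follow several relation types in sequence."""
    frontier = {subject}
    for rel in relations:
        frontier = {obj for s in frontier for obj in index.get((s, rel), set())}
    return sorted(frontier)


INDEX = build_index(TRIPLES)

# "Which anatomical part is geneA expressed in?"
print(answer(INDEX, "geneA", "expressed_in"))  # ['antenna', 'wing']

# "Which behaviors are associated with genes that geneA interacts with?"
print(answer_chain(INDEX, "geneA", ["interacts_with", "associated_with"]))  # ['courtship']
```

Indexing by (subject, relation) keeps each lookup constant-time and makes multi-relation questions a matter of composing lookups; a real system would also need entity normalization and evidence tracking back to source sentences, which this sketch omits.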