A two-phase bio-NER system based on integrated classifiers and multiagent strategy.

Lishuang Li, Wenting Fan, Degen Huang
DOI: https://doi.org/10.1109/TCBB.2013.106
2013-01-01
Abstract: Biomedical Named Entity Recognition (Bio-NER) is a fundamental step in biomedical text mining. This paper presents a two-phase Bio-NER model targeting the JNLPBA task. The two-phase method divides the task into two subtasks: Named Entity Detection (NED) and Named Entity Classification (NEC). In the first phase, the NED subtask is accomplished with a two-layer stacking method, in which named entities (NEs) are distinguished from non-named entities (NNEs) in biomedical literature without identifying their types. Six classifiers are constructed with four toolkits (CRF++, YamCha, Maximum Entropy, Mallet) using different training methods and are integrated by the two-layer stacking method. In the second phase, for the NEC subtask, a multi-agent strategy is introduced to determine the correct entity type for the entities identified in the first phase. The experimental results show that the presented approach achieves an F-score of 76.06%, which outperforms most of the state-of-the-art systems.
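
The two-phase flow described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the meta-learner, the features, or how the agents interact, so everything below is an assumption made for illustration. The lookup-table meta-learner (`train_meta`, `predict_meta`), the span helper (`spans`), the two toy agents (`suffix_agent`, `head_agent`), and the hard-coded base predictions are all hypothetical stand-ins for the six real classifiers and the multi-agent strategy. The entity types used (DNA, protein, cell_type) are among the five JNLPBA classes.

```python
from collections import Counter, defaultdict

# ---- Phase 1: Named Entity Detection via two-layer stacking (toy version) ----

def train_meta(base_outputs, gold):
    """Second layer (assumed): for each tuple of base-classifier tags seen on
    held-out data, remember the most frequent gold tag."""
    counts = defaultdict(Counter)
    for preds, tag in zip(base_outputs, gold):
        counts[tuple(preds)][tag] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}

def predict_meta(table, preds):
    """Use the learned table; fall back to a majority vote for unseen tuples."""
    key = tuple(preds)
    if key in table:
        return table[key]
    return Counter(key).most_common(1)[0][0]

def spans(tags):
    """Turn untyped B/I/O tags into (start, end) spans, end exclusive.
    A stray "I" with no preceding "B" is ignored."""
    out, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:
                out.append((start, i))
            start = i
        elif tag == "O" and start is not None:
            out.append((start, i))
            start = None
    if start is not None:
        out.append((start, len(tags)))
    return out

# ---- Phase 2: Named Entity Classification with voting agents (hypothetical) ----

def suffix_agent(entity):
    last = entity[-1].lower()
    return ("cell_type", 0.9) if last in {"cell", "cells"} else ("protein", 0.3)

def head_agent(entity):
    return ("DNA", 0.8) if entity[-1].lower() == "gene" else ("protein", 0.2)

def classify(entity, agents):
    """Sum each agent's confidence per type and return the best-scoring type."""
    scores = defaultdict(float)
    for agent in agents:
        label, confidence = agent(entity)
        scores[label] += confidence
    return max(scores, key=scores.get)

# ---- Toy run: tags from three imaginary base taggers, one tuple per token ----
held_out_preds = [("B", "B", "O"), ("I", "I", "I"), ("O", "O", "O"),
                  ("B", "O", "O"), ("B", "O", "O"), ("O", "O", "B")]
held_out_gold = ["B", "I", "O", "O", "O", "B"]
table = train_meta(held_out_preds, held_out_gold)

tokens = ["IL-2", "gene", "expression", "in", "T", "cells"]
base_preds = [("B", "B", "O"), ("I", "I", "I"), ("O", "O", "O"),
              ("O", "O", "O"), ("O", "O", "B"), ("I", "I", "O")]
detected = [predict_meta(table, p) for p in base_preds]   # phase 1: untyped tags
entities = [(s, e, classify(tokens[s:e], [suffix_agent, head_agent]))
            for s, e in spans(detected)]                  # phase 2: assign types
```

In this toy run, phase 1 only decides where entities are, and phase 2 assigns a type to each detected span. Separating the two decisions is the design choice the abstract describes; the particular combiners used here are placeholders.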