DUTIR at the BioCreative V CDR Task: Disease Named Entity Recognition and Normalization and the Chemical-Disease Relation Extraction from Biomedical Text

Zhiheng Li, Ya Yang, Zhihao Yang, Ziwei Zhou, Hongfei Lin
2015-01-01
Abstract: Adverse drug reactions have made chemical-disease relations (CDR) a topic of considerable research interest. In this paper, we describe the methods behind our submissions to the two BioCreative V CDR subtasks: Disease Named Entity Recognition and Normalization (DNER) and Chemical-Induced Diseases (CID) relation extraction. Our DNER method has three steps. First, a CRF model with a dictionary recognizes disease mentions. Second, a dictionary look-up that combines exact and approximate matching maps disease mentions to disease identifiers. Finally, ambiguous disease mentions are disambiguated by choosing a unique disease identifier using extended semantic information. Experimental results show that this approach achieves an F-score of 64.46% on the test set of the CDR DNER task. Our CID method combines a feature-based kernel and a graph kernel. A semi-supervised learning method, Co-Training, is introduced to exploit unlabeled data and boost classifier performance. We then use the resulting model to extract CID relations at the sentence level and apply a set of rules to obtain the final results at the abstract level. Our system achieved an F-score of 52% on the development set and 35.52% on the test set of the CID subtask.
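The normalization step described in the abstract (exact matching first, approximate matching as a fallback) can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon, the placeholder identifiers, the function name `normalize_mention`, the similarity cutoff, and the use of Python's `difflib` for approximate matching are all assumptions made for illustration only.

```python
import difflib

# Hypothetical toy lexicon: lower-cased disease name -> identifier.
# The identifiers are illustrative placeholders, not real vocabulary entries.
TOY_LEXICON = {
    "hypertension": "ID:0001",
    "myocardial infarction": "ID:0002",
    "renal failure": "ID:0003",
}


def normalize_mention(mention, lexicon=TOY_LEXICON, cutoff=0.85):
    """Map a disease mention to an identifier, or return None.

    Step 1: exact look-up on a case- and whitespace-normalized key.
    Step 2: approximate look-up using difflib string similarity
            (an assumed stand-in for the paper's approximate matching).
    """
    key = " ".join(mention.lower().split())

    # Exact match against the lexicon.
    if key in lexicon:
        return lexicon[key]

    # Approximate match: take the single closest name above the cutoff.
    close = difflib.get_close_matches(key, list(lexicon), n=1, cutoff=cutoff)
    return lexicon[close[0]] if close else None
```

A mention such as "Hypertension" resolves through the exact path, while a misspelling such as "myocardial infraction" falls through to the approximate path. The disambiguation step for mentions that map to several identifiers is not covered by this sketch.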