A Hybrid Protein-Protein Interaction Triple Extraction Method for Biomedical Literature.

Zhehuan Zhao, Zhihao Yang, Cong Sun, Lei Wang, Hongfei Lin
DOI: https://doi.org/10.1109/bibm.2017.8217886
2017-01-01
Abstract: Protein-protein interaction (PPI) extraction has wide applications in life science research. However, most machine learning based methods focus on binary PPI relation extraction, which discards the relationship type information that is critical to the study of PPIs. Rule based open information extraction methods can extract PPI triples (i.e. "protein1, interaction word, protein2"), but they suffer from low recall. In this paper, we propose a hybrid PPI triple extraction method. First, machine learning techniques are used to recognize protein entities and extract interacting protein pairs. Then, syntactic patterns and a dictionary are employed to find the interaction words that represent the relationship between the two proteins. This method obtains an F-score of 40.18% on the AIMed corpus, which is much higher than the result achieved by the rule based Stanford open information extraction method.
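The second stage described in the abstract (finding the interaction word for a protein pair that has already been judged to interact) can be illustrated with a minimal sketch. This is not the authors' implementation: the dictionary `INTERACTION_WORDS`, the function names, and the example sentences are hypothetical, and a simple token-position heuristic stands in for the syntactic patterns the paper uses. The machine learning stage (entity recognition and pair classification) is assumed to have run beforehand.

```python
import re

# Hypothetical mini-dictionary of interaction words (not taken from the paper).
INTERACTION_WORDS = {
    "binds", "bind", "interacts", "interact",
    "phosphorylates", "activates", "inhibits",
}


def tokenize(sentence):
    """Split a sentence into alphanumeric tokens (hyphens are kept inside tokens)."""
    return re.findall(r"[A-Za-z0-9\-]+", sentence)


def extract_triple(sentence, protein1, protein2):
    """Return (protein1, interaction_word, protein2), or None if no word is found.

    Assumes an upstream model has already recognized both proteins and
    classified the pair as interacting. A position heuristic replaces
    the paper's syntactic patterns here.
    """
    tokens = tokenize(sentence)
    if protein1 not in tokens or protein2 not in tokens:
        return None
    i, j = tokens.index(protein1), tokens.index(protein2)
    lo, hi = min(i, j), max(i, j)

    # Positions of every token that appears in the dictionary.
    candidates = [k for k, t in enumerate(tokens) if t.lower() in INTERACTION_WORDS]
    if not candidates:
        return None

    # Prefer a dictionary word located between the two proteins;
    # otherwise fall back to the one closest to either protein.
    between = [k for k in candidates if lo < k < hi]
    if between:
        best = between[0]
    else:
        best = min(candidates, key=lambda k: min(abs(k - lo), abs(k - hi)))
    return (protein1, tokens[best], protein2)


print(extract_triple("RAD51 binds BRCA2 in vivo", "RAD51", "BRCA2"))
```

A dictionary lookup alone cannot tell which of several candidate words actually relates the two proteins, which is presumably why the paper combines the dictionary with syntactic patterns; the fallback rule above is only a crude substitute for that step.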