Extracting Protein-Protein Interactions from MEDLINE using the Hidden Vector State model.

Deyu Zhou, Yulan He, Chee Keong Kwoh
DOI: https://doi.org/10.1504/IJBRA.2008.017164
2008-01-01
Abstract: A major challenge in biomedical text mining is the automatic extraction of protein-protein interactions from the vast biomedical literature. We have constructed an information extraction system for protein-protein interactions based on the Hidden Vector State (HVS) model. The HVS model can be trained on only lightly annotated data while retaining sufficient ability to capture hierarchical structure. When applied to extracting protein-protein interactions, it outperformed other established statistical methods and achieved an F-score of 61.5% with balanced recall and precision. Moreover, the statistical nature of the purely data-driven HVS model makes it intrinsically robust, and it can be easily adapted to other domains.
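The abstract reports an F-score with balanced recall and precision but does not spell out the formula. The sketch below assumes the usual balanced F-score (F1, the harmonic mean of precision and recall). The function name `f_score` and the counts used are hypothetical and chosen only for illustration; they are not taken from the paper's experiments.

```python
def f_score(tp, fp, fn, beta=1.0):
    """Return (precision, recall, F) from raw counts.

    tp, fp, fn are true positives, false positives and false negatives,
    e.g. correctly extracted, spurious and missed interactions.
    beta=1.0 gives the balanced F-score (assumed here, not confirmed
    by the abstract).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


# Hypothetical counts, picked so that precision equals recall.
precision, recall, f1 = f_score(tp=123, fp=77, fn=77)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

When precision and recall are equal, the balanced F-score equals both of them, so a reported F-score with balanced precision and recall implies that each lies close to the same value.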