Abstract: Protein-protein interactions (PPI) play a key role in various aspects of the structural and functional organization of the cell. Knowledge of these interactions helps reveal the molecular mechanisms of biological processes. A number of databases, such as MINT (Zanzoni et al., 2002), BIND (Bader et al., 2003), and DIP (Xenarios et al., 2002), have been created to store protein interaction information in structured, standard formats. However, the biomedical literature on protein interactions is growing rapidly, and it is difficult for database curators to detect and curate protein interaction information manually. As a result, most protein interaction information remains hidden in the text of published papers, and automatic extraction of this information from the biomedical literature has become an important research area.
Existing work on PPI extraction can be roughly divided into three categories: manual pattern engineering approaches, grammar engineering approaches, and machine learning approaches. Manual pattern engineering approaches define a set of rules, called patterns, for the textual forms a relationship may take; each pattern encodes a group of similar structures used to express relationships. The SUISEKI system uses regular expressions, with probabilities that reflect the experimental accuracy of each pattern, to extract interactions into predefined frame structures (Blaschke & Valencia, 2002). Ono et al. manually defined a set of rules based on syntactic features to preprocess complex sentences, and also took negation structures into account (Ono et al., 2001). The BioRAT system uses manually engineered templates that combine lexical and semantic information to identify protein interactions (Corney et al., 2004). Such manual pattern engineering approaches are very hard to scale up to large document collections because they require labor-intensive and skill-dependent pattern engineering.
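
As a rough illustration of the manual pattern engineering idea, the sketch below matches a few hand-written regular expressions against a sentence and fills a simple frame for each hit. It is a minimal sketch under stated assumptions, not a reconstruction of SUISEKI, BioRAT, or the rules of Ono et al.: the protein lexicon, the three patterns, the confidence values, and the names `PROTEINS`, `PATTERNS`, `NEGATION_RE`, and `extract_interactions` are all hypothetical.

```python
import re

# Hypothetical toy lexicon; a real system would rely on a protein name recognizer.
PROTEINS = ["RAD51", "BRCA2", "p53", "MDM2"]
PROTEIN_RE = r"\b(" + "|".join(re.escape(name) for name in PROTEINS) + r")\b"

# Hand-written patterns paired with illustrative (made-up) confidence scores.
PATTERNS = [
    (re.compile(PROTEIN_RE + r"\s+(?:interacts|associates)\s+with\s+" + PROTEIN_RE), 0.9),
    (re.compile(PROTEIN_RE + r"\s+binds(?:\s+to)?\s+" + PROTEIN_RE), 0.8),
    (re.compile(r"interaction\s+(?:between|of)\s+" + PROTEIN_RE + r"\s+and\s+" + PROTEIN_RE), 0.6),
]

# Crude negation cue list (an assumption): any cue discards the whole sentence.
NEGATION_RE = re.compile(r"\b(?:not|no|never|fails?\s+to)\b", re.IGNORECASE)


def extract_interactions(sentence):
    """Return one frame (dict) per pattern match in a non-negated sentence."""
    if NEGATION_RE.search(sentence):
        return []
    frames = []
    for pattern, confidence in PATTERNS:
        for match in pattern.finditer(sentence):
            frames.append({
                "protein_a": match.group(1),
                "protein_b": match.group(2),
                "confidence": confidence,
            })
    return frames


if __name__ == "__main__":
    for text in [
        "BRCA2 interacts with RAD51 in vivo.",
        "The interaction between p53 and MDM2 was confirmed.",
        "p53 does not bind to MDM2 under these conditions.",
    ]:
        print(text, extract_interactions(text))
```

Even this toy version shows why the approach scales poorly: every new phrasing ("forms a complex with", passive constructions, coordinated protein lists) needs another hand-written pattern and another hand-assigned score.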
Grammar engineering approaches use manually written, specialized grammar rules to perform a deep parse of each sentence. Sekimizu et al. used a shallow parser, EngCG, to generate syntactic, morphological, and boundary tags (Sekimizu et al., 1998). Based on the tagging results, subjects and objects were recognized for the most frequently used verbs. Fundel et al. proposed RelEx, which extracts relations from dependency parse trees (Fundel et al., 2007).
Machine learning techniques for extracting protein interaction information have gained interest in recent years. In most recent work on machine learning for PPI extraction, the task is cast as learning a decision function that determines for each

PPIExtractor: a protein interaction extraction and visualization system for biomedical literature.

PPIExtractor: A protein-protein interaction Extractor for biomedical literature

BioPPIExtractor: A protein-protein interaction extraction system for biomedical literature

mPPI: a database extension to visualize structural interactome in a one-to-many manner

Protein-Protein Interactions Extraction From Biomedical Literatures

PPLook: an Automated Data Mining Tool for Protein-Protein Interaction.

Integrating PPI datasets with the PPI data from biomedical literature for protein complex detection

BioPPISVMExtractor: a protein-protein interaction extractor for biomedical literature using SVM and rich feature sets.

The Protein-Protein Interaction extraction based on full texts

PPICurator: a tool for extracting comprehensive protein-protein interaction information.

PPICurator: A Tool for Extracting Comprehensive Protein–Protein Interaction Information

Protein-protein interaction extraction from biomedical literatures based on a combined kernel

Protein-protein interaction extraction from bio-literature with compact features and data sampling strategy

Extracting Interactions Between Proteins from the Literature

A Hybrid Protein-Protein Interaction Triple Extraction Method for Biomedical Literature.

Extracting Protein-Protein Interactions (PPIs) from Biomedical Literature using Attention-based Relational Context Information

CPIExtract: A software package to collect and harmonize small molecule and protein interactions

PRED_PPI: a Server for Predicting Protein-Protein Interactions Based on Sequence Data with Probability Assignment

PPIPP: an Online Protein-Protein Interaction Network Prediction and Analysis Platform.

Design and Implementation of an Integrated PPI System

PPI-IRO: a Two-Stage Method for Protein-Protein Interaction Extraction Based on Interaction Relation Ontology.