FunEffector-Pred: Identification of Fungi Effector by Activate Learning and Genetic Algorithm Sampling of Imbalanced Data

Chao Wang, Pingping Wang, Shuguang Han, Lida Wang, Yuming Zhao, Liran Juan
DOI: https://doi.org/10.1109/access.2020.2982410
IF: 3.9
2020-01-01
IEEE Access
Abstract: Fungal pathogens have evolved the ability to cause serious plant diseases and threaten global food security. Fungal effectors are proteins that exploit host cellular functions to facilitate infection. Effector identification is crucial for controlling disease in crops and for understanding plant-pathogen interactions. However, identifying fungal effectors has been challenging because most fungal effectors lack consensus motifs and the available data are imbalanced. In this study, a fungal effector predictor, FunEffector-Pred, was designed to learn effectively from an imbalanced dataset. A granular support vector-based under-sampling (GSV-US) strategy combined with a genetic algorithm was used to sample the majority class. When evaluated on an independent test dataset, FunEffector-Pred significantly outperformed existing predictors for fungal effector identification. Several informative feature patterns, such as the patterns of Ile, Gly, Val, Leu and Thr, as well as the combination of aromatic amino acids with positively charged amino acids, are reported for fungal effector identification for the first time.