Identification of adaptor proteins using the ANOVA feature selection technique

Yu-Hao Wang, Yu-Fei Zhang, Ying Zhang, Zhi-Feng Gu, Zhao-Yue Zhang, Hao Lin, Ke-Jun Deng
DOI: https://doi.org/10.1016/j.ymeth.2022.10.008
IF: 4.647
2022-12-01
Methods
Abstract: Adaptor proteins play a crucial role in regulating lymphocyte activation. Rapid and efficient identification of adaptor proteins is essential for understanding their functions. However, biochemical methods are expensive, require long experimental cycles, and demand considerable personnel. A computational method that can accurately identify adaptor proteins is therefore urgently needed. To address this need, we developed a classifier that combines a support vector machine (SVM) with the composition of k-spaced amino acid pairs (CKSAAP) and the amino acid composition (AAC) to identify adaptor proteins. Analysis of variance (ANOVA) was used to select the optimized feature subset that yields the best prediction performance. On independent data, the model built on 447 optimized features achieved an accuracy of 92.39% with an AUC of 0.9766, demonstrating its strong predictive capability. We hope that the proposed model will provide further clues for studying adaptor proteins.
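The feature pipeline named in the abstract (AAC, CKSAAP, then ANOVA-based ranking) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the gap range `k_max`, and the handling of non-standard residues are assumptions made for illustration, and the SVM training step is omitted. In practice the top-ranked features would be passed to an SVM, and the subset size would be tuned by evaluating prediction performance.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def aac(seq):
    """Amino acid composition: frequency of each of the 20 standard residues."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def cksaap(seq, k_max=2):
    """Composition of k-spaced amino acid pairs for gaps k = 0..k_max.

    k_max is an assumed value; the abstract does not state the range used.
    Pairs containing non-standard residues are skipped (an assumption).
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        total = len(seq) - k - 1  # number of residue pairs separated by k positions
        for i in range(max(total, 0)):
            pair = seq[i] + seq[i + k + 1]
            if pair in counts:
                counts[pair] += 1
        features.extend(counts[p] / total if total > 0 else 0.0 for p in PAIRS)
    return features


def anova_f(values, labels):
    """One-way ANOVA F-score: between-class variance over within-class variance."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, g = len(values), len(groups)
    grand_mean = sum(values) / n
    means = {y: sum(x) / len(x) for y, x in groups.items()}
    between = sum(len(x) * (means[y] - grand_mean) ** 2 for y, x in groups.items()) / (g - 1)
    within = sum((v - means[y]) ** 2 for y, x in groups.items() for v in x) / (n - g)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(X, y):
    """Return feature indices sorted from highest to lowest F-score."""
    scores = [anova_f([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
```

A feature with a high F-score differs strongly between the adaptor and non-adaptor classes relative to its spread within each class, which is why ranking by F-score and keeping the leading features is a common way to shrink a large CKSAAP vector.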
biochemistry & molecular biology, biochemical research methods