Identification of Secreted Proteins from Malaria Protozoa with Few Features

Qingwen Li, Benzhi Dong, Donghua Wang, Sui Wang
DOI: https://doi.org/10.1109/access.2020.2994206
IF: 3.9
2020-01-01
IEEE Access
Abstract: Secreted proteins from malaria protozoa play a critical role in the research and development of antimalarial drugs, so identifying them precisely is invaluable. Biochemical experiments can accomplish this, but they are time-consuming and costly. By extracting a feature vector from a protein, machine learning can instead be used to predict the protein's functional and structural properties. However, high-dimensional features tend to carry redundant information and can lead to the curse of dimensionality or over-fitting, so a model trained on the initial high-dimensional features performs unsatisfactorily in practice. To address this problem, only three features, postivecharger.postivecharger.gap2, Xc1, and Chydrophobicity_ENGD860101.1.residue50, were used to train a random forest classifier, which reached an accuracy of 86.111% in 10-fold cross-validation. A support vector machine with tuned parameters c and g reached a prediction accuracy of 88.8889%.
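The evaluation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn, replaces the three selected protein features with synthetic data from `make_classification`, and assumes that "c" and "g" refer to the cost `C` and RBF kernel width `gamma` of an SVM (as in the LIBSVM `-c` and `-g` options). The grid values, sample size, tree count, and feature scaling step are likewise hypothetical choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the three selected features (NOT the paper's data).
X, y = make_classification(
    n_samples=200, n_features=3, n_informative=3, n_redundant=0, random_state=0
)

# 10-fold cross-validation, stratified so each fold keeps the class balance.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Random forest on the three features; accuracy is averaged over the 10 folds.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf_acc = cross_val_score(rf, X, y, cv=cv, scoring="accuracy").mean()

# RBF-kernel SVM; "c" and "g" are assumed to map to C and gamma.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
param_grid = {
    "svc__C": [2.0**k for k in range(-5, 6, 2)],      # hypothetical grid
    "svc__gamma": [2.0**k for k in range(-5, 6, 2)],  # hypothetical grid
}
search = GridSearchCV(svm, param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)

print(f"Random forest 10-fold accuracy: {rf_acc:.4f}")
print(f"Best SVM parameters: {search.best_params_}")
print(f"Best SVM 10-fold accuracy: {search.best_score_:.4f}")
```

The numbers printed here describe the synthetic data only and are not comparable to the 86.111% and 88.8889% reported above. Selecting `C` and `gamma` on the same folds used to report accuracy gives an optimistic estimate; a nested cross-validation or a held-out test set would avoid that.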