POTN: A Human Leukocyte Antigen-A2 Immunogenic Peptides Screening Model and Its Applications in Tumor Antigens Prediction

Qingqing Meng, Yahong Wu, Xinghua Sui, Jingjie Meng, Tingting Wang, Yan Lin, Zhiwei Wang, Xiuman Zhou, Yuanming Qi, Jiangfeng Du, Yanfeng Gao
DOI: https://doi.org/10.3389/fimmu.2020.02193
IF: 7.3
2020-10-07
Frontiers in Immunology
Abstract: Whole genome/exome sequencing data for tumors are now abundant, and many tumor antigens, especially mutant antigens (neoantigens), have been identified for cancer immunotherapy. However, only a small fraction of the peptides from these antigens induce cytotoxic T cell responses. Therefore, efficient methods to identify these antigenic peptides are crucial. The current models of major histocompatibility complex (MHC) binding and antigenic prediction are still inaccurate. In this study, 360 9-mer peptides with verified immunological activity were selected to construct a prediction of tumor neoantigen (POTN) model, an immunogenic prediction model specifically for the human leukocyte antigen-A2 allele. Based on the physicochemical properties of amino acids, such as the residue propensity, hydrophobicity, and organic solvent/water, we found that the predictive capability of POTN is superior to that of the prediction programs SYFPEITHI, IEDB, and NetMHCpan 4.0. We used POTN to screen peptides for the cancer-testis antigen located on the X chromosome, and we identified several peptides that may trigger immunogenicity. We synthesized and measured the binding affinity and immunogenicity of these peptides and found that the accuracy of POTN is higher than that of NetMHCpan 4.0. Identifying the properties related to the T cell response or immunogenicity paves the way to understanding the MHC/peptide/T cell receptor complex. In conclusion, POTN is an efficient prediction model for screening high-affinity immunogenic peptides from tumor antigens, and thus provides useful information for developing cancer immunotherapy.
immunology
What problem does this paper attempt to address?
The paper aims to improve the accuracy of immunogenicity prediction for tumor antigen peptides, especially neoantigen peptides. Existing major histocompatibility complex (MHC) binding and antigenicity prediction models remain inaccurate at identifying the peptides that trigger cytotoxic T-cell responses, so an efficient and accurate prediction model is needed for developing cancer immunotherapy.

### Background and Objectives of the Paper

With the growth of whole-genome/exome sequencing data, many tumor antigens, especially mutant antigens (neoantigens), have been identified for cancer immunotherapy. However, only a small fraction of the peptides from these antigens induce cytotoxic T-cell responses, so effective methods to identify them are crucial. Current MHC binding and antigenicity prediction models are not accurate enough, mainly for two reasons:

1. **Impure Datasets**: Many non-immunogenic peptides are randomly selected without experimental verification, resulting in a high false-negative rate.
2. **Pan-specific Methods**: Most prediction programs are pan-specific and do not distinguish between HLA alleles, which reduces prediction accuracy for peptides presented by a specific MHC allele.

### Research Methods

To address these problems, the authors constructed a prediction model named POTN (prediction of tumor neoantigen), designed specifically for screening immunogenic peptides of the human leukocyte antigen-A2 (HLA-A2) allele. The steps were as follows:

1. **Data Collection**: 360 9-mer peptides were collected from databases such as IEDB, SYFPEITHI, and Peptide Database; 146 of them have verified immunological activity and 214 are non-immunogenic.
2. **Feature Selection**: Through literature research and statistical analysis, 28 features significantly related to immunogenicity were selected, including the accessible surface area (ASA) of amino acids, charge value, electron charge index (ECI), hydrophobicity, and molecular weight (Mw).
3. **Model Construction**: The POTN model was built as a support vector machine (SVM) with a radial basis (Gaussian) kernel function. The regularization parameter (C) was optimized through cross-validation, and the final value was C = 1.
4. **Model Validation**: The predictive performance of POTN was verified on an external dataset and compared with existing prediction software (SYFPEITHI, IEDB, and NetMHCpan 4.0).

### Main Results

1. **Model Performance**: POTN reached AUCs of 0.773 and 0.748 on the training and test sets, with accuracies of 0.653 and 0.701, respectively.
2. **Comparative Analysis**: POTN outperformed the other models in the comparison, especially at low false-positive rates.
3. **Application Example**: POTN was applied to the cancer-testis antigen (CT-X) dataset and identified multiple candidate high-affinity immunogenic peptides, whose binding affinity and immunogenicity were then tested experimentally.

### Conclusions

The POTN model provides an efficient method for screening high-affinity immunogenic peptides, which supports the development of cancer immunotherapy. By identifying the characteristics related to T-cell responses or immunogenicity, it also offers a new perspective for understanding the MHC/peptide/T-cell receptor complex.
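
Three ingredients named under Research Methods and Main Results (physicochemical feature encoding, the Gaussian kernel used by the SVM, and AUC evaluation) can be illustrated with a minimal Python sketch. This is not the authors' implementation and it does not train an SVM. The Kyte-Doolittle hydropathy scale, the `gamma` value, the function names, and the example peptide strings are all assumptions chosen for illustration; the summary does not say which hydrophobicity scale or kernel width POTN uses.

```python
import math

# Kyte-Doolittle hydropathy index. ASSUMPTION: the source does not state
# which hydrophobicity scale POTN uses; this one is picked for illustration.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_peptide(peptide):
    """Map a 9-mer to a 9-dimensional vector of per-position hydropathy."""
    if len(peptide) != 9:
        raise ValueError("expected a 9-mer peptide")
    return [KYTE_DOOLITTLE[aa] for aa in peptide.upper()]


def rbf_kernel(x, y, gamma=0.1):
    """Gaussian (RBF) kernel exp(-gamma * ||x - y||^2).

    ASSUMPTION: gamma=0.1 is arbitrary, not a value from the paper.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive scores higher; ties count as half a win."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Example 9-mer strings, used here only as illustrative input.
x = encode_peptide("GILGFVFTL")
y = encode_peptide("NLVPMVATV")
print(rbf_kernel(x, x))  # 1.0 (a vector is maximally similar to itself)
print(rbf_kernel(x, y))  # a value between 0 and 1
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

In a full pipeline, each peptide would be described by all selected features rather than hydropathy alone, and the kernel would be evaluated inside an SVM trainer whose regularization parameter C is chosen by cross-validation, as the summary describes.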