Abstract State Machines for Data-Parallel Computing

Qing Wang

DOI: https://doi.org/10.1007/978-3-642-28279-9_11

Abstract:


HLAIImaster: a deep learning method with adaptive domain knowledge predicts HLA II neoepitope immunogenic responses

Qiang Yang,Long Xu,Weihe Dong,Xiaokun Li,Kuanquan Wang,Suyu Dong,Xianyu Zhang,Tiansong Yang,Feng Jiang,Bin Zhang,Gongning Luo,Xin Gao,Guohua Wang

DOI: https://doi.org/10.1093/bib/bbae302

2024-05-23

Abstract:While significant strides have been made in predicting neoepitopes that trigger autologous CD4+ T cell responses, accurately identifying the antigen presentation by human leukocyte antigen (HLA) class II molecules remains a challenge. This identification is critical for developing vaccines and cancer immunotherapies. Current prediction methods are limited, primarily due to a lack of high-quality training epitope datasets and algorithmic constraints. To predict the exogenous HLA class II-restricted peptides across most of the human population, we utilized the mass spectrometry data to profile >223 000 eluted ligands over HLA-DR, -DQ, and -DP alleles. Here, by integrating these data with peptide processing and gene expression, we introduce HLAIImaster, an attention-based deep learning framework with adaptive domain knowledge for predicting neoepitope immunogenicity. Leveraging diverse biological characteristics and our enhanced deep learning framework, HLAIImaster significantly outperforms existing tools in terms of positive predictive value across various neoantigen studies. Robust domain knowledge learning accurately identifies neoepitope immunogenicity, bridging the gap between neoantigen biology and the clinical setting and paving the way for future neoantigen-based therapies to provide greater clinical benefit. In summary, we present a comprehensive exploitation of the immunogenic neoepitope repertoire of cancers, facilitating the effective development of "just-in-time" personalized vaccines.
LRMAHpan: a novel tool for multi-allelic HLA presentation prediction using Resnet-based and LSTM-based neural networks

Xue Mi,Shaohao Li,Zheng Ye,Zhu Dai,Bo Ding,Bo Sun,Yang Shen,Zhongdang Xiao

DOI: https://doi.org/10.3389/fimmu.2024.1478201

IF: 7.3

2024-11-28

Frontiers in Immunology

Abstract:Introduction: The identification of peptides eluted from HLA complexes by mass spectrometry (MS) can provide critical data for deep learning models of antigen presentation prediction and promote neoantigen vaccine design. A major challenge remains in determining which HLA allele eluted peptides correspond to. Methods: To address this, we present a tool for prediction of multiple allele (MA) presentation called LRMAHpan, which integrates LSTM network and ResNet_CA network for antigen processing and presentation prediction. We trained and tested the LRMAHpan BA (binding affinity) and the LRMAHpan AP (antigen processing) models using mass spectrometry data, subsequently combined them into the LRMAHpan PS (presentation score) model. Our approach is based on a novel pHLA encoding method that enables the integration of neoantigen prediction tasks into computer vision methods. This method aggregates MA data into a multichannel matrix and incorporates peptide sequences to efficiently capture binding signals. Results: LRMAHpan outperforms standard predictors such as NetMHCpan 4.1, MHCflurry 2.0, and TransPHLA in terms of positive predictive value (PPV) when applied to MA data. Additionally, it can accommodate peptides of variable lengths and predict HLA class I and II presentation. We also predicted neoantigens in a cohort of metastatic melanoma patients, identifying several shared neoantigens. Discussion: Our results demonstrate that LRMAHpan significantly improves the accuracy of antigen presentation predictions.

immunology
HLAIIPred: Cross-Attention Mechanism for Modeling the Interaction of HLA Class II Molecules with Peptides

Mojtaba Haghighatlari,Nicholas Marze,Robert Joseph Seward,Andrew Ciarla,Santosh Dhule,Rachel Hindin,Jennifer Calderini,Benjamin Keenan,Sarah Hall-Swan,Timothy P Hickling,Eric Bennett,Brajesh Rai,Sophie Tourdot

DOI: https://doi.org/10.1101/2024.10.01.616078

2024-10-03

Abstract:We introduce HLAIIPred, a deep learning model to predict peptides presented by class II human leukocyte antigens (HLAII) on the surface of antigen presenting cells. HLAIIPred is trained using a Transformer-based neural network and a dataset comprising of HLAII-presented peptides identified by mass spectrometry. In addition to predicting peptide presentation, the model can also provide important insights into peptide-HLAII interactions by identifying core peptide residues that form such interactions. We evaluate the performance of HLAIIPred on three different tasks, peptide presentation in monoallelic samples, immunogenicity prediction of therapeutic antibodies, and neoantigen prioritization for cancer immunotherapy. Additionally, we created a new dataset of biotherapeutics HLAII peptides presented by human dendritic cells. This data is used to develop screening strategies to predict the unwanted immunogenic segments of therapeutic antibodies by HLAII presentation models. HLAIIPred demonstrates superior or equivalent performance when compared to the latest models across all evaluated benchmark datasets. We achieve a 16% increase in prediction of presented peptides compared to the second-best model on a set of unseen peptides presented by less frequent alleles. The model also improves the area under the precision-recall curve by 3% for distinguishing between immunogenic and non-immunogenic antibodies. We show that HLAIIPred can identify epitopes in therapeutic antibodies and prioritize neoantigens with high accuracy.

Bioinformatics
Deep Learning-Enhanced MHC-II Presentation Prediction and Peptidome Deconvolution

Juntao Deng,Min Liu

DOI: https://doi.org/10.1007/978-3-031-23198-8_17

2022-01-01

Abstract:Antigen-presenting cells can elicit a CD4+ T cell response by displaying foreign peptides on the surface. Identifying such peptides requires robust prediction of the binding and presentation corresponding to peptides and major histocompatibility complexes class II (MHC-II) molecules. However, numerous experimental data suffer from inexact supervision, and the open conformation of MHC-II molecules leads to a complex peptide binding pattern. Though current prediction methods have significantly pushed the development of cancer vaccines and immunotherapies, an urgent desire for better approaches still exists. We apply the powerful multi-head self-attention technique to MHC-II-restricted peptidome deconvolution and antigen presentation prediction problems. According to binding motifs reflected by eluted ligands, the novel expert voting-based deconvolution strategy ensures a reliable MHC-II assignment. Driven by massive, trustworthy annotated peptidome data, our method outperforms the state-of-the-art MHC-II presentation prediction method, NetMHCIIpan4.0, on two independent single allelic datasets. All these results have demonstrated that our method can boost the performance of MHC-II presentation prediction and peptidome deconvolution.
DeepHLApan: A Deep Learning Approach for Neoantigen Prediction Considering Both HLA-Peptide Binding and Immunogenicity

Jingcheng Wu,Wenzhe Wang,Jiucheng Zhang,Binbin Zhou,Wenyi Zhao,Zhixi Su,Xun Gu,Jian Wu,Zhan Zhou,Shuqing Chen

DOI: https://doi.org/10.3389/fimmu.2019.02559

2024-01-01

Abstract:Neoantigens play important roles in cancer immunotherapy. Current methods used for neoantigen prediction focus on the binding between human leukocyte antigens (HLAs) and peptides, which is insufficient for high-confidence neoantigen prediction. In this study, we apply deep learning techniques to predict neoantigens considering both the possibility of HLA-peptide binding (binding model) and the potential immunogenicity (immunogenicity model) of the peptide-HLA complex (pHLA). The binding model achieves comparable performance with other well-acknowledged tools on the latest Immune Epitope Database (IEDB) benchmark datasets and an independent mass spectrometry (MS) dataset. The immunogenicity model could significantly improve the prediction precision of neoantigens. The further application of our method to the mutations with pre-existing T-cell responses indicates its feasibility in clinical application. DeepHLApan is freely available at https://github.com/jiujiezz/deephlapan and http://biopharm.zju.edu.cn/deephlapan.
HLA class I binding prediction via convolutional neural networks

Yeeleng S Vang,Xiaohui Xie

DOI: https://doi.org/10.1093/bioinformatics/btx264

IF: 5.8

2017-04-21

Bioinformatics

Abstract:MOTIVATION: Many biological processes are governed by protein-ligand interactions. One such example is the recognition of self and non-self cells by the immune system. This immune response process is regulated by the major histocompatibility complex (MHC) protein which is encoded by the human leukocyte antigen (HLA) complex. Understanding the binding potential between MHC and peptides can lead to the design of more potent, peptide-based vaccines and immunotherapies for infectious autoimmune diseases. RESULTS: We apply machine learning techniques from the natural language processing (NLP) domain to address the task of MHC-peptide binding prediction. More specifically, we introduce a new distributed representation of amino acids, named HLA-Vec, that can be used for a variety of downstream proteomic machine learning tasks. We then propose a deep convolutional neural network architecture, named HLA-CNN, for the task of HLA class I-peptide binding prediction. Experimental results show combining the new distributed representation with our HLA-CNN architecture achieves state-of-the-art results in the majority of the latest two Immune Epitope Database (IEDB) weekly automated benchmark datasets. We further apply our model to predict binding on the human genome and identify 15 genes with potential for self binding. AVAILABILITY AND IMPLEMENTATION: Codes to generate the HLA-Vec and HLA-CNN are publicly available at: https://github.com/uci-cbcl/HLA-bind. CONTACT: xhx@ics.uci.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

biochemical research methods,biotechnology & applied microbiology,mathematical & computational biology
MATHLA: a robust framework for HLA-peptide binding prediction integrating bidirectional LSTM and multiple head attention mechanism

Yilin Ye,Jian Wang,Yunwan Xu,Yi Wang,Youdong Pan,Qi Song,Xing Liu,Ji Wan

DOI: https://doi.org/10.1186/s12859-020-03946-z

IF: 3.307

2021-01-06

BMC Bioinformatics

Abstract:Background: Accurate prediction of binding between class I human leukocyte antigen (HLA) and neoepitope is critical for target identification within personalized T-cell based immunotherapy. Many recent prediction tools developed upon the deep learning algorithms and mass spectrometry data have indeed shown improvement in the average predictive power for class I HLA-peptide interaction. However, their prediction performances show great variability over individual HLA alleles and peptides with different lengths, which is particularly the case for HLA-C alleles due to the limited amount of experimental data. To meet the increasing demand for attaining the most accurate HLA-peptide binding prediction for individual patients in real-world clinical studies, a more advanced deep learning framework with higher prediction accuracy for HLA-C alleles and longer peptides is highly desirable. Results: We present a pan-allele HLA-peptide binding prediction framework, MATHLA, which integrates a bi-directional long short-term memory network and a multiple head attention mechanism. This model achieves better prediction accuracy in both the fivefold cross-validation test and the independent test dataset. In addition, this model is superior to existing tools in prediction accuracy for longer ligands ranging from 11 to 15 amino acids. Moreover, our model also shows a significant improvement for HLA-C-peptide-binding prediction. By investigating multiple-head attention weight scores, we depicted possible interaction patterns between three HLA I supergroups and their cognate peptides. Conclusion: Our method demonstrates the necessity of further development of deep learning algorithms in improving and interpreting HLA-peptide binding prediction in parallel to increasing the amount of high-quality HLA ligandome data.

biochemical research methods,biotechnology & applied microbiology,mathematical & computational biology
A novel HLA Class II presentation prediction algorithm deciphers immunogenic CD4 epitopes specific to KRAS G12C

Daniel Sprague,Meghan G Hart,Joshua Klein,Sonia Kounlavouth,Rahulsimham Vegesna,Melissa Rotunno,Lauren D Kraemer-Tardif,Rita Zhou,Lindsey Kemp,Adrienne C Greeene,Joshua Araya,Alexis Mantilla,Bukola Adeoye,Calixto Dominguez,Andrew R Ferguson,Melissa L Johnson,Matthew J Davis,Monica Lane,Christine D Palmer,Karin Jooss,Ankur Dhanik

DOI: https://doi.org/10.1101/2024.12.06.627073

2024-12-10

Abstract:Accurate prediction of peptide presentation by HLA molecules is important for generation of effective individualized cancer vaccines and immunotherapies. While presentation prediction algorithms for HLA class I have been successfully applied in the context of such therapies, improved prediction algorithms for class II are needed. EDGE-II is a novel algorithm based on a protein large language model that has a learned allele deconvolution network trained on existing and new immunopeptidomics data. It delivers state-of-the-art performance on prediction of peptide presentation by HLA class II and immunogenicity elicited by CD4 T-cell epitopes. In a patient with a KRAS G12C positive tumor treated with a KRAS G12C targeting immunotherapy, EDGE-II identified KRAS G12C class II neoantigens that elicited clonally expanded CD4 T cells with cytotoxic transcriptional profiles post-vaccination. EDGE-II could play an important role in the development of effective cancer immunotherapies by elucidating an enriched understanding of the immunopeptidome.

Bioinformatics
Unsupervised and supervised AI on molecular dynamics simulations reveals complex characteristics of HLA-A2-peptide immunogenicity

Jeffrey K Weber,Joseph A Morrone,Seung-Gu Kang,Leili Zhang,Lijun Lang,Diego Chowell,Chirag Krishna,Tien Huynh,Prerana Parthasarathy,Binquan Luan,Tyler J Alban,Wendy D Cornell,Timothy A Chan,Seung-gu Kang

DOI: https://doi.org/10.1093/bib/bbad504

IF: 9.5

2024-01-19

Briefings in Bioinformatics

Abstract:Immunologic recognition of peptide antigens bound to class I major histocompatibility complex (MHC) molecules is essential to both novel immunotherapeutic development and human health at large. Current methods for predicting antigen peptide immunogenicity rely primarily on simple sequence representations, which allow for some understanding of immunogenic features but provide inadequate consideration of the full scale of molecular mechanisms tied to peptide recognition. We here characterize contributions that unsupervised and supervised artificial intelligence (AI) methods can make toward understanding and predicting MHC(HLA-A2)-peptide complex immunogenicity when applied to large ensembles of molecular dynamics simulations. We first show that an unsupervised AI method allows us to identify subtle features that drive immunogenicity differences between a cancer neoantigen and its wild-type peptide counterpart. Next, we demonstrate that a supervised AI method for class I MHC(HLA-A2)-peptide complex classification significantly outperforms a sequence model on small datasets corrected for trivial sequence correlations. Furthermore, we show that both unsupervised and supervised approaches reveal determinants of immunogenicity based on time-dependent molecular fluctuations and anchor position dynamics outside the MHC binding groove. We discuss implications of these structural and dynamic immunogenicity correlates for the induction of T cell responses and therapeutic T cell receptor design.

biochemical research methods,mathematical & computational biology
DeepHLApan: A Deep Learning Approach for High-Confidence Neoantigen Prediction

Jingcheng Wu,Wenzhe Wang,Jiucheng Zhang,Binbin Zhou,Wenyi Zhao,Zhixi Su,Xun Gu,Jian Wu,Zhan Zhou,Shuqing Chen

DOI: https://doi.org/10.2139/ssrn.3365058

2019-01-01

Abstract:Background: Neoantigens are the most widely recognized elements to distinguish cancer and normal cells and consequently play important roles in cancer immunotherapy. Current methods used for neoantigen prediction focus on the binding between human leukocyte antigens (HLAs) and peptides, which is insufficient for high-confidence neoantigen prediction. Methods: We apply deep learning techniques to predict neoantigens considering both the possibility of mutant peptide presentation (binding model) and the potential immunogenicity (immunogenicity model) of the peptide-HLA complex (pHLA) present on the cell surface. Findings: The binding model achieves performance comparable to or even better than that of other well-acknowledged tools with the latest Immune Epitope Database (IEDB) benchmark datasets. Using the immunogenicity model, we demonstrate that limited immunogenicity data could significantly improve the identification of high-confidence neoantigens. We further apply our method to mutations with pre-existing T-cell responses and ranked most of them (69%) in the top 20 under an expression threshold of transcripts per million (TPM)>2. Interpretation: The process of neoantigens inducing T cell response is complex and the immunogenicity of pHLA should be considered for high-confidence neoantigen prediction. Funding Statement: This work has been supported by the National Key R&D Program of China (Grant No. 2017YFC0908600), the Zhejiang Provincial Natural Science Foundation of China (Grant No. LY19H300003), and the Fundamental Research Funds for the Central Universities of China. Declaration of Interests: The authors declare that they have no competing interests. Ethics Approval Statement: Not needed.
Improvement of Neoantigen Identification Through Convolution Neural Network

Qing Hao,Ping Wei,Yang Shu,Yi-Guan Zhang,Heng Xu,Jun-Ning Zhao

DOI: https://doi.org/10.3389/fimmu.2021.682103

IF: 7.3

2021-05-25

Frontiers in Immunology

Abstract:Accurate prediction of neoantigens and the subsequent elicited protective anti-tumor response are particularly important for the development of cancer vaccine and adoptive T-cell therapy. However, current algorithms for predicting neoantigens are limited by in vitro binding affinity data and algorithmic constraints, inevitably resulting in high false positives. In this study, we proposed a deep convolutional neural network named APPM (antigen presentation prediction model) to predict antigen presentation in the context of human leukocyte antigen (HLA) class I alleles. APPM is trained on large mass spectrometry (MS) HLA-peptides datasets and evaluated with an independent MS benchmark. Results show that APPM outperforms the methods recommended by the immune epitope database (IEDB) in terms of positive predictive value (PPV) (0.40 vs. 0.22), which will further increase after combining these two approaches (PPV = 0.51). We further applied our model to the prediction of neoantigens from consensus driver mutations and identified 16,000 putative neoantigens with hallmarks of ‘drivers’.

immunology
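The APPM abstract above compares tools by positive predictive value (PPV). As a rough illustration only, the sketch below computes PPV as TP / (TP + FP), plus a top-k variant often used for ranked peptide lists. The abstract does not say which evaluation protocol was used, so the function names, the top-k convention, and the toy data are all assumptions made here for illustration.

```python
from typing import Sequence


def ppv(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Positive predictive value: TP / (TP + FP) for binary labels and calls."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


def ppv_at_top_k(labels: Sequence[int], scores: Sequence[float], k: int) -> float:
    """Fraction of true positives among the k highest-scoring items.

    Assumption: a top-k protocol; the source abstract does not specify one.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:k]) / k


# Toy data (hypothetical): 1 = presented peptide, 0 = decoy.
toy_labels = [1, 0, 1, 0, 0]
toy_calls = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.3, 0.2, 0.1]

toy_ppv = ppv(toy_labels, toy_calls)
toy_ppv_top2 = ppv_at_top_k(toy_labels, toy_scores, k=2)
```

In both toy cases one of the two positive calls is a true hit, so each function returns 0.5.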
TripHLApan: predicting HLA molecules binding peptides based on triple coding matrix and transfer learning

Meng Wang,Chuqi Lei,Jianxin Wang,Yaohang Li,Min Li

DOI: https://doi.org/10.1093/bib/bbae154

IF: 9.5

2024-03-27

Briefings in Bioinformatics

Abstract:Human leukocyte antigen (HLA) recognizes foreign threats and triggers immune responses by presenting peptides to T cells. Computationally modeling the binding patterns between peptide and HLA is very important for the development of tumor vaccines. However, it is still a big challenge to accurately predict HLA molecules binding peptides. In this paper, we develop a new model TripHLApan for predicting HLA molecules binding peptides by integrating triple coding matrix, BiGRU + Attention models, and transfer learning strategy. We have found the main interaction site regions between HLA molecules and peptides, as well as the correlation between HLA encoding and binding motifs. Based on the discovery, we make the preprocessing and coding closer to the natural biological process. Besides, due to the input being based on multiple types of features and the attention module focused on the BiGRU hidden layer, TripHLApan has learned more sequence level binding information. The application of transfer learning strategies ensures the accuracy of prediction results under special lengths (peptides in length 8) and model scalability with the data explosion. Compared with the current optimal models, TripHLApan exhibits strong predictive performance in various prediction environments with different positive and negative sample ratios. In addition, we validate the superiority and scalability of TripHLApan’s predictive performance using additional latest data sets, ablation experiments and binding reconstitution ability in the samples of a melanoma patient. The results show that TripHLApan is a powerful tool for predicting the binding of HLA-I and HLA-II molecular peptides for the synthesis of tumor vaccines. TripHLApan is publicly available at https://github.com/CSUBioGroup/TripHLApan.git.

biochemical research methods,mathematical & computational biology
High-Throughput Prediction of MHC Class I and II Neoantigens with MHCnuggets

Xiaoshan M. Shao,Rohit Bhattacharya,Justin Huang,I.K. Ashok Sivakumar,Collin Tokheim,Lily Zheng,Dylan Hirsch,Benjamin Kaminow,Ashton Omdahl,Maria Bonsack,Angelika B. Riemer,Victor E. Velculescu,Valsamo Anagnostou,Kymberleigh A. Pagel,Rachel Karchin

DOI: https://doi.org/10.1158/2326-6066.CIR-19-0464

IF: 10.1

2020-03-02

Cancer Immunology Research

Abstract:Computational prediction of binding between neoantigen peptides and major histocompatibility complex (MHC) proteins can be used to predict patient response to cancer immunotherapy. Current neoantigen predictors focus on in silico estimation of MHC binding affinity and are limited by low predictive value for actual peptide presentation, inadequate support for rare MHC alleles, and poor scalability to high-throughput data sets. To address these limitations, we developed MHCnuggets, a deep neural network method that predicts peptide–MHC binding. MHCnuggets can predict binding for common or rare alleles of MHC class I or II with a single neural network architecture. Using a long short-term memory network (LSTM), MHCnuggets accepts peptides of variable length and is faster than other methods. When compared with methods that integrate binding affinity and MHC-bound peptide (HLAp) data from mass spectrometry, MHCnuggets yields a 4-fold increase in positive predictive value on independent HLAp data. We applied MHCnuggets to 26 cancer types in The Cancer Genome Atlas, processing 26.3 million allele–peptide comparisons in under 2.3 hours, yielding 101,326 unique predicted immunogenic missense mutations (IMM). Predicted IMM hotspots occurred in 38 genes, including 24 driver genes. Predicted IMM load was significantly associated with increased immune cell infiltration (P < 2 × 10^-16), including CD8+ T cells. Only 0.16% of predicted IMMs were observed in more than 2 patients, with 61.7% of these derived from driver mutations. Thus, we describe a method for neoantigen prediction and its performance characteristics and demonstrate its utility in data sets representing multiple human cancers.

oncology,immunology
Machine learning application to predict binding affinity between peptide containing non-canonical amino acids and HLA0201

Shan Jiang,Zhaoqian Su,Nathaniel Bloodworth,Yunchao Liu,Cristina Martina,David G. Harrison,Jens Meiler

DOI: https://doi.org/10.1101/2024.11.19.624425

2024-11-21

Abstract:Class I major histocompatibility complexes (MHC-I), encoded by the highly polymorphic HLA-A, HLA-B, and HLA-C genes in humans, are expressed on all nucleated cells. Both self and foreign proteins are processed to peptides of 8 to 10 amino acids, loaded into MHC-I within the endoplasmic reticulum and then presented on the cell surface. Foreign peptides presented in this fashion activate CD8+ T cells and their immunogenicity correlates with their affinity for the MHC-I binding groove. Thus, predicting antigen binding affinity for MHC-I is a valuable tool for identifying potentially immunogenic antigens. While quite a few predictors for MHC-I binding exist, there are no currently available tools that can predict antigen/MHC-I binding affinity for antigens with explicitly labeled post-translational modifications or unusual/non-canonical amino acids (NCAAs). However, such modifications are increasingly recognized as critical mediators of peptide immunogenicity. In this work, we propose a machine learning application that quantifies the binding affinity of epitopes containing NCAAs to MHC-I and compares its performance with other commonly used regressors. Our model demonstrates robust performance, with 5-fold cross-validation yielding an R² value of 0.477 and a root-mean-square error (RMSE) of 0.735, indicating strong predictive capability for peptides with NCAAs. This work provides a valuable tool for the computational design and optimization of peptides incorporating NCAAs, potentially accelerating the development of novel peptide-based therapeutics with enhanced properties and efficacy.

Biology
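The preceding abstract reports regression quality as a coefficient of determination and a root-mean-square error (RMSE). For readers unfamiliar with these metrics, a minimal sketch of their standard definitions follows. It is not the authors' evaluation code, and the function names and toy values are hypothetical.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root-mean-square error: sqrt(mean((y - y_hat)^2))."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy affinities (hypothetical values, arbitrary units).
observed = [1.0, 2.0, 3.0, 4.0]
perfect = [1.0, 2.0, 3.0, 4.0]
mean_only = [2.5, 2.5, 2.5, 2.5]  # always predicts the mean of `observed`

perfect_r2 = r_squared(observed, perfect)      # a perfect fit gives 1.0
baseline_r2 = r_squared(observed, mean_only)   # a mean-only predictor gives 0.0
baseline_rmse = rmse(observed, mean_only)
```

A model that always predicts the mean scores 0 and a perfect model scores 1, which gives a scale for reading a cross-validated value between those two baselines.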
Reorganization of the Right Arcuate Fasciculus Following Left Arcuate Fasciculus Resection in Children With Intractable Epilepsy

D. Goradia,H. Chugani,R. Govindan,M. Behen,C. Juhász,S. Sood

DOI: https://doi.org/10.1177/0883073811402689

2011-05-06

Journal of Child Neurology

Abstract:The authors evaluated postsurgical reorganization of the arcuate fasciculus longitudinally using diffusion tensor imaging in 10 children with intractable epilepsy, whose resections included the left arcuate fasciculus. Evaluation of fractional anisotropy before and after surgery (mean follow-up: 7.5 months) showed a significant increase (P = .002) in the right arcuate fasciculus during follow-up. There was marked enlargement of the right arcuate fasciculus postsurgically in 8 patients. The change in right arcuate fasciculus fractional anisotropy values showed a positive correlation with interval between resection and postsurgical magnetic resonance imaging (MRI) (P = .044). Comparison of 10 age-matched controls to patients pre- and postsurgery showed significantly reduced presurgery fractional anisotropy in the left (P = .018) and right (P = .036) arcuate fasciculus and no difference in postsurgery fractional anisotropy in the right arcuate fasciculus (P = .399) in patients. These findings suggest a compensatory reorganization in the right arcuate fasciculus in children with intractable epilepsy following left arcuate fasciculus resection.
TLimmuno2: predicting MHC class II antigen immunogenicity through transfer learning

Guangshuai Wang,Tao Wu,Wei Ning,Kaixuan Diao,Xiaoqin Sun,Jinyu Wang,Chenxu Wu,Jing Chen,Dongliang Xu,Xue-Song Liu

DOI: https://doi.org/10.1093/bib/bbad116

IF: 9.5

2023-03-25

Briefings in Bioinformatics

Abstract:Major histocompatibility complex (MHC) class II molecules play a pivotal role in antigen presentation and CD4+ T cell response. Accurate prediction of the immunogenicity of MHC class II-associated antigens is critical for vaccine design and cancer immunotherapies. However, current computational methods are limited by insufficient training data and algorithmic constraints, and the rules that govern which peptides are truly recognized by existing T cell receptors remain poorly understood. Here, we build a transfer learning-based, long short-term memory model named 'TLimmuno2' to predict whether epitope-MHC class II complex can elicit T cell response. Through leveraging binding affinity data, TLimmuno2 shows superior performance compared with existing models on independent validation datasets. TLimmuno2 can find real immunogenic neoantigen in real-world cancer immunotherapy data. The identification of significant MHC class II neoantigen-mediated immunoediting signal in the cancer genome atlas pan-cancer dataset further suggests the robustness of TLimmuno2 in identifying really immunogenic neoantigens that are undergoing negative selection during cancer evolution. Overall, TLimmuno2 is a powerful tool for the immunogenicity prediction of MHC class II presented epitopes and could promote the development of personalized immunotherapies.

biochemical research methods,mathematical & computational biology
Abstract 6177: Machine learning enables prediction of ADC targets from whole slide H&E images

Zachary Ryan McCaw,Anna Shcherbina,Yajas Shah,Philip Tagari,Daphne Koller,Christopher Probert,insitro Research Team

DOI: https://doi.org/10.1158/1538-7445.am2024-6177

IF: 11.2

2024-03-31

Cancer Research

Abstract:Background: The efficacy of antibody-drug conjugates (ADCs) depends on the expression and specificity of the target antigen by tumor cells. Although H&E-stained slides are routinely collected during cancer care, specialized IHC staining is typically required to ascertain antigen expression. Such stains are not always available or readily deployed. The need to perform separate IHC tests for each candidate ADC may burden clinical labs and can hinder access to care in resource-limited settings. Here we develop an ensemble of machine learning models to accurately predict the expression of 166 distinct ADC targets directly from H&E images. Methods: For each ADC-targeted gene, patients with copy-number amplifications (CNAs) were identified from somatic whole exome sequencing. Genes differentially expressed in patients with CNAs were identified from bulk transcriptomics. For each gene, an expression signature was developed based on the expression levels of differentially upregulated genes. Next, whole-slide, H&E-stained histopathology images were embedded into a lower-dimensional representation via a transformer model trained with self-supervised learning. Neural networks were developed to predict a patient's probability of having a CNA in an ADC-targeted gene, as indicated by an expression signature exceeding the p90. All evaluation metrics were ascertained by 5-fold cross-validation, with training and evaluation on independent patients. Results: For each of the 166 ADC-targeted genes, a median of 154 patients were found to harbor a CNA, and the expression signature included a median of 180 genes. In all cases, patients with CNAs had significantly higher expression signatures than those without. For predicting likely CNA status directly from H&E histopathology images, the mean AUROC was 0.888 (95% CI, 0.876-0.900) and the mean AUPRC was 0.571 (95% CI, 0.531-0.611). 
Among the 166 ADC-targeted genes, the AUROC exceeded 0.95 for 31.3%, 0.8 for 80.7%, and 0.725 for 100%. The best-predicted ADC target was SLC7A5 (AUROC: 0.995 [95% CI, 0.994-0.998]; AUPRC: 0.967 [95% CI, 0.963-0.976]). Conclusion: We have developed models that accurately predict the likely expression of ADC targets based solely on H&E images. The ability to accurately discern the presence of ADC antigens from H&E images has numerous potential applications, including cohort refinement, computer-aided diagnosis, and personalized treatment planning. Citation Format: Zachary Ryan McCaw, Anna Shcherbina, Yajas Shah, insitro Research Team, Philip Tagari, Daphne Koller, Christopher Probert. Machine learning enables prediction of ADC targets from whole slide H&E images [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular s); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl) nr 6177.

oncology
A Deep Learning Approach for NeoAG-Specific Prediction Considering Both HLA-Peptide Binding and Immunogenicity: Finding Neoantigens to Making T-Cell Products More Personal

Xian-Xian Liu,Gloria Li,Wei Luo,Juntao Gao,Simon Fong

DOI: https://doi.org/10.1101/2021.12.22.473942

2021-01-01

Abstract:Background: Cell immunotherapy, an emerging type of cancer treatment, is gaining popularity over chemotherapy and radiation therapy, which cause widespread damage to the body. One favourable approach in cell immunotherapy is the use of neoantigens as targets that help the immune system distinguish cancer cells from healthy cells. Neoantigens, which are non-autologous proteins with individual specificity, are generated by non-synonymous mutations in the tumor cell genome. Owing to their strong immunogenicity and lack of expression in normal tissues, they are now an important target for tumor immunotherapy. Neoantigens are special protein fragments that appear on the surface of cancer cells as a by-product of DNA mutation in the tumour. In cancer immunotherapies, certain neoantigens that exist only on cancer cells elicit responses from white blood cells (the body's defenders, anti-cancer T cells) that fight the cancer cells while leaving healthy cells alone. Personalized cancer vaccines can therefore be designed de novo for each individual patient once the specific neoantigens relevant to his or her tumour are found. The vaccine, usually encoded as synthetic long peptides, RNA or DNA representing the neoantigens, triggers an immune response in the body to destroy the cancer cells (tumour). The specific neoantigens can be found by a complex process of biopsy and genome sequencing. Alternatively, modern technologies use AI algorithms to predict the right neoantigen candidates. However, determining the binding and non-binding of neoantigens to T-cell receptors (TCR) is a challenging computational task due to its very large search space. Objective: To enhance the efficiency and accuracy of traditional deep learning tools for the same purpose: finding potential responsiveness to immunotherapy through correctly predicted neoantigens. Deep learning can be used to explore which novel neoantigens bind to T-cell receptors and which do not. The exploration may be technically expensive and time-consuming, since deep learning is inherently computationally intensive. Putative neoantigen peptide sequences can be used to guide personalized cancer vaccine design. Methods: These models all proceed through complex feature engineering, including feature extraction, dimension reduction and so on. In this study, we derived 4 features to facilitate prediction and classification of 4 HLA-peptide binding, namely AAC and DC from the global sequence, and LAAC and LDC from the local sequence information. Based on the patterns of sequence formation, a nested structure of bidirectional long short-term memory (BiLSTM) neural networks, called the local information module, is used to extract context-based features around every residue. Another BiLSTM network layer, called the global information module, is introduced above the local information module layer to integrate the context-based features of all residues in the same HLA-peptide binding chain, thereby involving inter-residue relationships in the training process. Results: Finally, a more effective model is obtained by fusing the above two modules and the 4-feature matrix. The method performs significantly better than previous prediction schemes: its overall r-square increased to 0.0125 and 0.1064 on the training datasets and to 0.0782 and 0.2926 on the test datasets. The RMSE of our proposed models decreased to approximately 0.0745 and 1.1034, respectively, on the training datasets, and to 0.6712 and 1.6506 on the test datasets. Conclusion: Our work refines a machine-learning model to improve neoantigen identification and prediction using the determinants of neoantigen identification. The final experimental results show that our method is more effective than existing methods for predicting peptide types, which can help laboratory researchers identify the type of novel HLA-peptide binding. Competing Interest Statement: The authors have declared no competing interest.
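
The global sequence features named in this abstract can be sketched as follows, assuming that AAC and DC denote the conventional amino acid composition (20 residue frequencies) and dipeptide composition (400 adjacent-pair frequencies). The abstract does not define the local variants LAAC and LDC, so they are not attempted here; the function names `aac` and `dc` are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def aac(seq):
    """Amino acid composition: fraction of the sequence made up by each standard residue."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dc(seq):
    """Dipeptide composition: fraction of adjacent residue pairs matching each of 400 dipeptides."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    if not pairs:
        raise ValueError("sequence must have at least two residues")
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for pair in pairs:
        if pair in counts:  # pairs containing non-standard residues are skipped
            counts[pair] += 1
    return [counts[d] / len(pairs) for d in DIPEPTIDES]


if __name__ == "__main__":
    peptide = "SIINFEKL"  # example peptide chosen only for illustration
    features = aac(peptide) + dc(peptide)
    print(len(features))  # 20 + 400 values
```

Concatenating the two vectors gives a fixed-length representation regardless of peptide length, which is one common reason composition features are paired with sequence models.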
PSSMHCpan: a novel PSSM-based software for predicting class I peptide-HLA binding affinity

Geng Liu,Dongli Li,Zhang Li,Si Qiu,Wenhui Li,Cheng-Chi Chao,Naibo Yang,Handong Li,Zhen Cheng,Xin Song,Le Cheng,Xiuqing Zhang,Jian Wang,Huanming Yang,Kun Ma,Yong Hou,Bo Li

DOI: https://doi.org/10.1093/gigascience/gix017

IF: 7.658

2017-05-01

GigaScience

Abstract:Predicting peptide binding affinity with human leukocyte antigen (HLA) is a crucial step in developing powerful antitumor vaccine for cancer immunotherapy. Currently available methods work quite well in predicting peptide binding affinity with HLA alleles such as HLA-A*0201, HLA-A*0101, and HLA-B*0702 in terms of sensitivity and specificity. However, quite a few types of HLA alleles that are present in the majority of human populations including HLA-A*0202, HLA-A*0203, HLA-A*6802, HLA-B*5101, HLA-B*5301, HLA-B*5401, and HLA-B*5701 still cannot be predicted with satisfactory accuracy using currently available methods. Furthermore, currently the most popularly used methods for predicting peptide binding affinity are inefficient in identifying neoantigens from a large quantity of whole genome and transcriptome sequencing data. Here we present a Position Specific Scoring Matrix (PSSM)-based software called PSSMHCpan to accurately and efficiently predict peptide binding affinity with a broad coverage of HLA class I alleles. We evaluated the performance of PSSMHCpan by analyzing 10-fold cross-validation on a training database containing 87 HLA alleles and obtained an average area under receiver operating characteristic curve (AUC) of 0.94 and accuracy (ACC) of 0.85. In an independent dataset (Peptide Database of Cancer Immunity) evaluation, PSSMHCpan is substantially better than the popularly used NetMHC-4.0, NetMHCpan-3.0, PickPocket, Nebula, and SMM with a sensitivity of 0.90, as compared to 0.74, 0.81, 0.77, 0.24, and 0.79. In addition, PSSMHCpan is more than 197 times faster than NetMHC-4.0, NetMHCpan-3.0, PickPocket, sNebula, and SMM when predicting neoantigens from 661 263 peptides from a breast tumor sample. Finally, we built a neoantigen prediction pipeline and identified 117 017 neoantigens from 467 cancer samples of various cancers from TCGA. 
PSSMHCpan is superior to the currently available methods in predicting peptide binding affinity with a broad coverage of HLA class I alleles.
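
The idea of a position-specific scoring matrix can be illustrated with a generic sketch: per-position log-odds of residue frequencies among known binders of one length, summed to score a candidate peptide. This is not PSSMHCpan's actual construction, which the abstract does not detail; the uniform background, the pseudocount, the function names, and the toy peptides are all assumptions.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def build_pssm(peptides, pseudocount=1.0):
    """Build per-position log2-odds scores from equal-length binder peptides."""
    if not peptides:
        raise ValueError("need at least one peptide")
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background frequency
    pssm = []
    for pos in range(length):
        column = [p[pos] for p in peptides]
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            a: math.log2(((column.count(a) + pseudocount) / total) / background)
            for a in AMINO_ACIDS
        })
    return pssm


def score(pssm, peptide):
    """Sum the position-specific scores; higher means more binder-like."""
    if len(peptide) != len(pssm):
        raise ValueError("peptide length does not match the matrix")
    # Raises KeyError for residues outside the 20 standard amino acids.
    return sum(col[a] for col, a in zip(pssm, peptide))


if __name__ == "__main__":
    binders = ["LLA", "LMA", "LLV"]  # toy 3-mers, not real HLA ligands
    matrix = build_pssm(binders)
    print(score(matrix, "LLA") > score(matrix, "GGG"))
```

Scoring is a table lookup per position, so the cost grows linearly with peptide length, which is consistent with the speed advantage the abstract reports for a PSSM-based approach.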
Improved prediction of HLA antigen presentation hotspots: Applications for immunogenicity risk assessment of therapeutic proteins

Anders Steenholdt Attermann,Carolina Barra,Birkir Reynisson,Heidi Schiøler Schultz,Ulrike Leurs,Kasper Lamberth,Morten Nielsen

DOI: https://doi.org/10.1111/imm.13274

Immunology

Abstract:Immunogenicity risk assessment is a critical element in protein drug development. Currently, the risk assessment is most often performed using MHC-associated peptide proteomics (MAPPs) and/or T-cell activation assays. However, this is a highly costly procedure that encompasses limited sensitivity imposed by sample sizes, the MHC repertoire of the tested donor cohort and the experimental procedures applied. Recent work has suggested that these techniques could be complemented by accurate, high-throughput and cost-effective prediction of in silico models. However, this work covered a very limited set of therapeutic proteins and eluted ligand (EL) data. Here, we resolved these limitations by showcasing, in a broader setting, the versatility of in silico models for assessment of protein drug immunogenicity. A method for prediction of MHC class II antigen presentation was developed on the hereto largest available mass spectrometry (MS) HLA-DR EL data set. Using independent test sets, the performance of the method for prediction of HLA-DR antigen presentation hotspots was benchmarked. In particular, the method was showcased on a set of protein sequences including four therapeutic proteins and demonstrated to accurately predict the experimental MS hotspot regions at a significantly lower false-positive rate compared with other methods. This gain in performance was particularly pronounced when compared to the NetMHCIIpan-3.2 method trained on binding affinity data. These results suggest that in silico methods trained on MS HLA EL data can effectively and accurately be used to complement MAPPs assays for the risk assessment of protein drugs.
