Predicting HLA class II antigen presentation through integrated deep learning

Binbin Chen, Michael S Khodadoust, Niclas Olsson, Lisa E Wagar, Ethan Fast, Chih Long Liu, Yagmur Muftuoglu, Brian J Sworder, Maximilian Diehn, Ronald Levy, Mark M Davis, Joshua E Elias, Russ B Altman, Ash A Alizadeh
DOI: https://doi.org/10.1038/s41587-019-0280-2
Abstract: Accurate prediction of antigen presentation by human leukocyte antigen (HLA) class II molecules would be valuable for vaccine development and cancer immunotherapies. Current computational methods trained on in vitro binding data are limited by insufficient training data and algorithmic constraints. Here we describe MARIA (major histocompatibility complex analysis with recurrent integrated architecture; https://maria.stanford.edu/), a multimodal recurrent neural network for predicting the likelihood of antigen presentation from a gene of interest in the context of specific HLA class II alleles. In addition to in vitro binding measurements, MARIA is trained on peptide HLA ligand sequences identified by mass spectrometry, expression levels of antigen genes and protease cleavage signatures. Because it leverages these diverse training data and our improved machine learning framework, MARIA (area under the curve = 0.89-0.92) outperformed existing methods in validation datasets. Across independent cancer neoantigen studies, peptides with high MARIA scores are more likely to elicit strong CD4+ T cell responses. MARIA allows identification of immunogenic epitopes in diverse cancers and autoimmune disease.