Identifying B-cell epitopes using AlphaFold2 predicted structures and pretrained language model

Yuansong Zeng, Zhuoyi Wei, Qianmu Yuan, Sheng Chen, Weijiang Yu, Yutong Lu, Jianzhao Gao, Yuedong Yang
DOI: https://doi.org/10.1093/bioinformatics/btad187
IF: 5.8
2023-04-03
Bioinformatics
Abstract: Motivation: Identifying B-cell epitopes is an essential step in guiding rational vaccine development and immunotherapies. Since experimental approaches are expensive and time-consuming, many computational methods have been designed to assist B-cell epitope prediction. However, existing sequence-based methods have limited performance because they use only the contextual features of sequential neighbors and neglect structural information. Results: Building on the recent breakthrough of AlphaFold2 in protein structure prediction, we propose GraphBepi, a novel graph-based model for accurate B-cell epitope prediction. For each protein, the structure predicted by AlphaFold2 is used to construct a protein graph in which the nodes (residues) are encoded by representations learned by ESM-2. The graph is input into an edge-enhanced deep graph neural network (EGNN) to capture the spatial information in the predicted 3D structure. In parallel, a bidirectional long short-term memory network (BiLSTM) is employed to capture long-range dependencies in the sequence. The low-dimensional representations learned by the EGNN and the BiLSTM are then combined and fed into a multilayer perceptron to predict B-cell epitopes. In comprehensive tests on the curated epitope dataset, GraphBepi outperformed the state-of-the-art methods by more than 5.5% in AUC and 44.0% in AUPR. A web server is freely available at http://bio-web1.nscc-gz.cn/app/graphbepi. Availability and implementation: The datasets, pre-computed features, source code, and the trained model are available at https://github.com/biomed-AI/GraphBepi.
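The graph construction step described in the abstract can be illustrated with a minimal sketch. This is not the GraphBepi implementation: the function names, the 10 Å distance cutoff, the use of one coordinate per residue, and the neighbour-averaging step (a plain stand-in for a graph layer, not the EGNN) are all assumptions made for illustration. In the paper the node features would be ESM-2 representations; here an identity matrix serves as a placeholder.

```python
import numpy as np


def contact_graph(coords, cutoff=10.0):
    """Build a residue contact graph from per-residue 3D coordinates.

    coords: (N, 3) array, one point per residue (e.g. C-alpha), in angstroms.
    cutoff: hypothetical distance threshold; not taken from the paper.
    Returns a boolean adjacency matrix (no self-loops) and the distance matrix.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    adj = (dist <= cutoff) & ~np.eye(len(coords), dtype=bool)
    return adj, dist


def mean_aggregate(node_feats, adj):
    """Average each residue's neighbour features (illustrative only, not EGNN)."""
    node_feats = np.asarray(node_feats, dtype=float)
    degree = adj.sum(axis=1, keepdims=True)
    # Residues without neighbours keep an all-zero aggregate.
    return (adj.astype(float) @ node_feats) / np.maximum(degree, 1)


# Toy chain of four residues; the last one is far from the others.
coords = np.array([[0.0, 0.0, 0.0],
                   [3.8, 0.0, 0.0],
                   [7.6, 0.0, 0.0],
                   [30.0, 0.0, 0.0]])
feats = np.eye(4)  # placeholder for per-residue language-model embeddings
adj, dist = contact_graph(coords, cutoff=10.0)
pooled = mean_aggregate(feats, adj)
```

In this toy example the first three residues are mutually connected and the fourth is isolated, so spatial proximity rather than sequence position decides which residues exchange information.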