PrMFTP: Multi-functional therapeutic peptides prediction based on multi-head self-attention mechanism and class weight optimization
Wenhui Yan,Wending Tang,Lihua Wang,Yannan Bin,Junfeng Xia
DOI: https://doi.org/10.1371/journal.pcbi.1010511
2022-09-13
PLoS Computational Biology
Abstract: Prediction of therapeutic peptides is a significant step in the discovery of promising therapeutic drugs. Most existing studies have focused on mono-functional therapeutic peptide prediction. However, the number of multi-functional therapeutic peptides (MFTP) is growing rapidly, which calls for new computational schemes to facilitate MFTP discovery. In this study, we propose a novel model called PrMFTP for MFTP prediction, based on a multi-head self-attention mechanism and a class weight optimization algorithm. PrMFTP combines a multi-scale convolutional neural network, bi-directional long short-term memory, and multi-head self-attention to extract and learn informative features from the peptide sequence. In addition, we design a class weight optimization scheme to address label imbalance in the data. Comprehensive evaluation demonstrates that PrMFTP is superior to other state-of-the-art computational methods for predicting MFTP. We provide a user-friendly web server for PrMFTP, which is available at http://bioinfo.ahu.edu.cn/PrMFTP.
Therapeutic peptides possess a wide range of biological properties, including anti-cancer, anti-hypertensive, and anti-viral activity. Identifying these functions is a prerequisite to understanding functional therapeutic peptides and ultimately designing them for drug discovery and development. As the number of multi-functional therapeutic peptides (MFTP) grows, predicting these peptides has become an urgent problem in the development of novel peptide-based therapeutics. We develop PrMFTP, an approach for MFTP prediction based on multi-label classification. Our method uses a deep neural network and multi-head self-attention to optimize the features extracted from peptide sequences. Furthermore, to handle the imbalance problem in the multi-label dataset, a novel class weight optimization scheme is used to improve the performance of PrMFTP. We evaluate our approach using example-based measures and compare it with the top-performing MLBP method as well as state-of-the-art multi-functional peptide prediction approaches, demonstrating the improvement of PrMFTP over the existing methods.
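The abstract refers to example-based measures without defining them. As a minimal sketch, the Python below computes four measures commonly used in multi-label evaluation (precision, coverage, accuracy, and absolute true), each averaged over peptides. The function name `example_based_scores`, the exact set of measures, and the toy labels are assumptions made for illustration; they are not taken from the paper, which may define or select its measures differently.

```python
from typing import Dict, List, Set


def example_based_scores(
    y_true: List[Set[str]], y_pred: List[Set[str]]
) -> Dict[str, float]:
    """Average per-example multi-label scores (hypothetical helper).

    y_true[i] is the set of true functions of peptide i,
    y_pred[i] is the set of predicted functions of peptide i.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    precision = coverage = accuracy = absolute_true = 0.0
    for true_set, pred_set in zip(y_true, y_pred):
        inter = len(true_set & pred_set)
        union = len(true_set | pred_set)
        # Fraction of predicted labels that are correct.
        precision += inter / len(pred_set) if pred_set else 0.0
        # Fraction of true labels that were recovered.
        coverage += inter / len(true_set) if true_set else 0.0
        # Jaccard overlap; two empty sets count as a perfect match.
        accuracy += inter / union if union else 1.0
        # Exact match of the whole label set.
        absolute_true += 1.0 if true_set == pred_set else 0.0

    n = len(y_true)
    return {
        "precision": precision / n,
        "coverage": coverage / n,
        "accuracy": accuracy / n,
        "absolute_true": absolute_true / n,
    }


# Toy data: the first peptide has two functions, only one is predicted.
truth = [{"anti-cancer", "anti-viral"}, {"anti-hypertensive"}]
preds = [{"anti-cancer"}, {"anti-hypertensive"}]
scores = example_based_scores(truth, preds)
```

Because each score is computed per peptide and then averaged, a peptide with a partially correct label set still contributes to precision, coverage, and accuracy, while absolute true rewards only exact matches.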
biochemical research methods,mathematical & computational biology