UmamiPreDL: Deep learning model for umami taste prediction of peptides using BERT and CNN

Arun Pandiyan Indiran, Humaira Fatima, Sampriti Chattopadhyay, Sureshkumar Ramadoss, Yashwanth Radhakrishnan
DOI: https://doi.org/10.1016/j.compbiolchem.2024.108116
Abstract: Taste is crucial in driving food choice and preference. Umami is one of the basic tastes, defined by the characteristic deliciousness and mouthfulness that it imparts to foods. Identifying ingredients that enhance umami taste is of significant value to the food industry. Various models have been shown to predict umami taste using feature encodings derived from traditional molecular descriptors such as amphiphilic pseudo-amino acid composition, dipeptide composition, and composition-transition-distribution. The highest reported accuracy, 90.5 %, was recently achieved with a novel model architecture. Here, we propose using biological sequence transformers such as ProtBert and ESM2, trained on the UniRef databases, as the feature encoder block. Combining 2 encoders with 2 classifiers yielded 4 model architectures. Among these, the ProtBert-CNN model outperformed the others, with an accuracy of 95 % on 5-fold cross-validation data and 94 % on independent data.
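The abstract reports accuracy under 5-fold cross-validation. As a minimal sketch of what that evaluation protocol involves, the Python below partitions a dataset into five folds and averages validation accuracy across them. It assumes plain shuffled folds (the abstract does not say whether the splits were stratified), and the names `k_fold_indices`, `cross_validated_accuracy`, `fit`, and `predict` are hypothetical stand-ins, not the authors' code.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k gives k disjoint folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, val_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, val_idx


def cross_validated_accuracy(samples, labels, fit, predict, k=5):
    """Average validation accuracy over k folds.

    `fit(train_samples, train_labels)` and `predict(model, samples)` are
    hypothetical callables standing in for any encoder + classifier pipeline.
    """
    scores = []
    for train_idx, val_idx in k_fold_indices(len(samples), k):
        model = fit([samples[j] for j in train_idx],
                    [labels[j] for j in train_idx])
        preds = predict(model, [samples[j] for j in val_idx])
        correct = sum(p == labels[j] for p, j in zip(preds, val_idx))
        scores.append(correct / len(val_idx))
    return sum(scores) / len(scores)
```

Each sample is used for validation exactly once, so the averaged score reflects performance on data the model did not see during that fold's training. The independent-data figure quoted in the abstract is a separate check on a held-out set.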