Triplet encoded sequence based membrane protein classification using BiLSTM

S. Gomathi, K. Nithish Ram, N. Ani Brown Mary
DOI: https://doi.org/10.1007/s11042-024-19010-4
IF: 2.577
2024-04-13
Multimedia Tools and Applications
Abstract: Membrane proteins play a significant part in cellular activities. Their role is essential in drug interactions and in all living organisms. Membrane protein classification is used to identify the relationships between proteins, and proteins are classified with the help of their amino acid composition. A novel protein classification scheme is proposed using a Tri-code Embedding vector. The proposed method forms triplet subgroups, each of which is assigned a unique code word. A triplet subgroup is then formed from the amino acid subgroups and provided as input to a Bidirectional Long Short-Term Memory (BiLSTM) network and a SoftMax layer for classification. Two data sets, containing 7582 and 4684 membrane proteins, are used. The results are evaluated using the self-consistency test, the Matthews correlation coefficient and an independent data set. The proposed method improves protein classification in terms of accuracy, specificity, sensitivity, precision, recall and F-measure. Thus, the proposed scheme provides an effective protein classification scheme that incorporates the beneficial features of deep learning. The overall accuracy obtained is 99.48% for data set 1 and 99.87% for data set 2. The proposed method achieves the highest overall classification accuracy with the minimum execution time when compared to the other methods.
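The triplet encoding step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the amino acid subgroups, the way triplets are drawn from a sequence, or the code word format, so the four-way grouping in `SUBGROUPS`, the overlapping sliding window, and the integer code words below are all assumptions chosen for illustration. The resulting integer sequence is the kind of input an embedding layer followed by a BiLSTM and a SoftMax layer would consume.

```python
from itertools import product

# Hypothetical amino acid subgroups (an assumption, NOT taken from the paper):
# each of the 20 standard residues is mapped to one coarse subgroup label.
SUBGROUPS = {
    "A": "AVLIMFWP",  # nonpolar
    "B": "GSTCYNQ",   # polar, uncharged
    "C": "DE",        # acidic
    "D": "KRH",       # basic
}
RESIDUE_TO_GROUP = {aa: label for label, members in SUBGROUPS.items() for aa in members}

# One unique integer code word per triplet of subgroup labels.
# Codes start at 1 so that 0 stays free for padding (also an assumption).
TRIPLET_TO_CODE = {
    "".join(triplet): index + 1
    for index, triplet in enumerate(product(sorted(SUBGROUPS), repeat=3))
}


def encode_sequence(sequence):
    """Map a protein sequence to a list of triplet code words.

    Residues outside the 20 standard amino acids (for example 'X') are
    skipped. Triplets are taken with an overlapping window of width 3,
    which is an illustrative choice, not one confirmed by the abstract.
    """
    groups = [RESIDUE_TO_GROUP[aa] for aa in sequence.upper() if aa in RESIDUE_TO_GROUP]
    return [
        TRIPLET_TO_CODE["".join(groups[i:i + 3])]
        for i in range(len(groups) - 2)
    ]


if __name__ == "__main__":
    # A short made-up peptide, used only to show the call.
    print(encode_sequence("MKDLLVAG"))
```

With four subgroups there are 4^3 = 64 possible triplets, so the vocabulary seen by the embedding layer stays small regardless of sequence length. A different grouping changes only the `SUBGROUPS` table.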
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering