A Framework for Accurate Prediction of Plastic-Degrading Enzymes using Convolutional Neural Networks

Soharth Hasnat, Fariah Anjum Shifa, Shabab Murshed, Tanveer Ahmed Rumee, MST Murshida Mahbub
DOI: https://doi.org/10.1101/2024.10.20.619257
2024-10-23
Abstract: The growing accumulation of plastic waste presents a significant environmental challenge, necessitating innovative approaches to mitigate its impact. Enzymatic degradation has emerged as a promising solution for addressing plastic pollution. However, isolating and characterizing plastic-degrading enzymes (PDEs) through laboratory experiments is costly, time-consuming, and often complicated by nonculturable microorganisms. Consequently, accurate in silico identification of PDEs is desirable to explore the diversity of natural enzymes and harness their potential for combating plastic pollution. This study introduces a novel feature extraction strategy for identifying PDEs, incorporating Autocorrelation (AAutoCor), Composition of k-spaced Amino Acid Pairs (KSAP), Dipeptide Deviation from Expected Mean (DDE), Composition/Transition/Distribution (C/T/D), Conjoint Triad, and Secondary Structure. Two feature selection methods, ANOVA and XGBoost, were combined to reduce the feature dimensionality and improve performance. Seven supervised machine learning models were evaluated on the dataset: Convolutional Neural Network (CNN), Random Forest Classifier, Feedforward Neural Network, Logistic Regression, Naive Bayes Classifier, K-nearest Neighbor, and XGBoost Classifier. Among these models, the CNN performed best, achieving an accuracy of 0.96, an F1 score of 0.80, and an ROC-AUC score of 0.96. These findings underscore the potential of the proposed system as an accurate predictor of plastic-degrading enzymes from environmental sequences. By accelerating the discovery of novel PDEs, this approach supports efforts to develop sustainable solutions to plastic waste.
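To make one of the descriptors named in the abstract concrete, the following is a minimal sketch of a k-spaced amino acid pair composition in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function name `k_spaced_pair_composition`, the choice to skip pairs containing non-standard residues, and the normalization by the number of valid pairs are all hypothetical choices, since the abstract does not state the gap values or normalization used.

```python
from itertools import product

# The 20 standard amino acids; other symbols (e.g. X) are skipped below.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def k_spaced_pair_composition(sequence, k):
    """Return the frequency of each residue pair separated by k positions.

    Hypothetical helper: k = 0 gives ordinary dipeptide composition,
    k = 1 pairs residues with one residue between them, and so on.
    Frequencies are normalized by the number of valid pairs counted
    (an assumption; other normalizations are possible).
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    # Slide over the sequence, pairing position i with position i + k + 1.
    for i in range(len(sequence) - k - 1):
        pair = sequence[i] + sequence[i + k + 1]
        if pair in counts:  # ignore pairs with non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return {pair: 0.0 for pair in pairs}
    return {pair: counts[pair] / total for pair in pairs}


# Toy sequence (not real data): with k = 1 the pairs are A_D and C_A.
features = k_spaced_pair_composition("ACDA", k=1)
print(len(features))   # 400 features, one per ordered residue pair
print(features["AD"])  # 0.5
```

Each gap value k yields a 400-dimensional vector, so concatenating several gap values grows the feature space quickly, which is one reason a feature selection step such as the one described in the abstract is useful.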
Bioinformatics