Machine Learning Models for the Classification of CK2 Natural Products Inhibitors with Molecular Fingerprint Descriptors

Yuting Liu, Mengzhou Bi, Xuewen Zhang, Na Zhang, Guohui Sun, Yue Zhou, Lijiao Zhao, Rugang Zhong
DOI: https://doi.org/10.3390/pr9112074
IF: 3.5
2021-11-19
Processes
Abstract: Casein kinase 2 (CK2) is considered an important target for anti-cancer drugs. Given the structural diversity and broad spectrum of pharmaceutical activities of natural products, numerous studies have sought to establish them as valuable sources of drugs. However, few studies have used machine learning methods to identify the structural factors responsible for their inhibitory activity against CK2. In this study, classification studies were conducted on 115 natural products as CK2 inhibitors. Seven machine learning methods along with six molecular fingerprints were employed to develop qualitative classification models. The performance of all models was evaluated by cross-validation and on a test set. Taking predictive accuracy (CA), the area under the receiver operating characteristic curve (AUC), and the Matthews correlation coefficient (MCC) as three performance indicators, the optimal models with high reliability and predictive ability were obtained, including the Extended Fingerprint-Logistic Regression model (CA = 0.859, AUC = 0.826, MCC = 0.520) for the training set and the PubChem fingerprint combined with the artificial neural network model (CA = 0.826, AUC = 0.933, MCC = 0.628) for the test set. In addition, the privileged substructures responsible for inhibitory activity against CK2 were identified through a combination of frequency analysis and information gain. The results are expected to provide useful information for the further utilization of natural products and the discovery of novel CK2 inhibitors.
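As a rough illustration of two of the quantities named in the abstract, the sketch below computes CA and MCC from confusion-matrix counts, and the information gain of a single binary fingerprint bit with respect to the active/inactive label. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names are hypothetical, the data are toy values rather than results from the paper, and AUC is omitted because it needs ranked prediction scores.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Predictive accuracy (CA): fraction of correctly classified compounds."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    h = 0.0
    for cls in set(labels):
        p = labels.count(cls) / n
        h -= p * math.log2(p)
    return h

def information_gain(bits, labels):
    """Entropy reduction in `labels` after splitting on one fingerprint bit."""
    n = len(labels)
    present = [y for b, y in zip(bits, labels) if b]
    absent = [y for b, y in zip(bits, labels) if not b]
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional

# Toy data (assumed, not from the paper): 1 = active, 0 = inactive.
labels = [1, 1, 0, 0]
informative_bit = [1, 1, 0, 0]    # substructure present only in actives
uninformative_bit = [1, 0, 1, 0]  # substructure unrelated to activity

print(information_gain(informative_bit, labels))
print(information_gain(uninformative_bit, labels))
print(accuracy(tp=3, tn=1, fp=1, fn=0), mcc(tp=3, tn=1, fp=1, fn=0))
```

A bit with high information gain that also occurs frequently among active compounds would be a candidate privileged substructure in the sense the abstract describes.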
engineering, chemical