Classification and Design of HIV-1 Integrase Inhibitors Based on Machine Learning.

Junlin Zhou, Juan Hao, Lianxin Peng, Huaichuan Duan, Qing Luo, Hailian Yan, Hua Wan, Yichen Hu, Li Liang, Zhenjian Xie, Wei Liu, Gang Zhao, Jianping Hu
DOI: https://doi.org/10.1155/2021/5559338
IF: 2.809
2021-01-01
Computational and Mathematical Methods in Medicine
Abstract: Integrase (IN), a key enzyme in the life cycle of human immunodeficiency virus type 1 (HIV-1), aids the integration of viral DNA into host DNA and has become an ideal target for the development of anti-HIV drugs. A total of 1785 potential HIV-1 IN inhibitors were collected from the ChEMBL, Binding Database, DrugBank, and PubMed databases, as well as from 40 references. The data set was divided into a training set and a test set by random sampling. Exploring the correlation between molecular descriptors and inhibitory activity showed that both the classification and the specific activity data of inhibitors can be predicted more accurately by combining molecular descriptors with molecular fingerprints, because the fingerprint descriptors provide additional substructure information that improves predictive ability. Based on the training set, two machine learning methods, recursive partitioning (RP) and naive Bayes (NB), were used to build classifiers of HIV-1 IN inhibitors. On the test set, the RP model correctly predicted 82.5% of inhibitors and 86.3% of noninhibitors. The NB model correctly predicted 88.3% of inhibitors and 87.2% of noninhibitors, with a correlation coefficient of 85.2%. The results show that the NB model performs slightly better than the RP model, and the key molecular segments were also obtained. Additionally, CoMFA and CoMSIA models with good activity prediction ability were both constructed by exploring the structure-activity relationship, which is helpful for the design and optimization of HIV-1 IN inhibitors.
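The test-set figures quoted above (fraction of inhibitors and of noninhibitors correctly predicted, plus a correlation coefficient) can be illustrated with a minimal sketch. It assumes that the two percentages correspond to sensitivity and specificity and that the correlation coefficient is the Matthews correlation coefficient (MCC); the abstract does not state either. The function name `classification_metrics` and the toy labels are hypothetical and are not data from the paper.

```python
import math


def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity and MCC for binary labels.

    Convention assumed here: 1 = inhibitor, 0 = noninhibitor.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # inhibitors found
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # noninhibitors found
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed inhibitors

    # Fraction of true inhibitors / noninhibitors predicted correctly.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # Matthews correlation coefficient (assumed meaning of the
    # "correlation coefficient" in the abstract); 0.0 if undefined.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"sensitivity": sensitivity, "specificity": specificity, "mcc": mcc}


# Toy labels for illustration only (not taken from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

On these toy labels, three of four inhibitors and three of four noninhibitors are predicted correctly, so sensitivity and specificity are both 0.75 and the MCC is 0.5.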