Small dataset solves big problem: An outlier-insensitive binary classifier for inhibitory potency prediction

Teng Zhou, Haowen Dou, Jie Tan, Youyi Song, Fei Wang, Jiaqi Wang
DOI: https://doi.org/10.1016/j.knosys.2022.109242
IF: 8.139
2022-09-01
Knowledge-Based Systems
Abstract: Nicotinamide phosphoribosyltransferase (NAMPT) inhibitors are important in cancer treatment, and selecting compounds from a library by inhibitory potency for further experiments is considered the main route to drug discovery. Meanwhile, computational methods have been widely used to accelerate drug discovery. Hence, we propose a machine learning model that needs to be trained only on an extremely small dataset to predict the inhibition constant (Ki) and half maximal inhibitory concentration (IC50) of a compound. The key idea is to rank compounds directly by inhibitory potency by solving a simpler binary classification problem, since drug screening needs only the relative ranks of the inhibitors. To this end, we develop an adaptive data augmentation method that captures the relative information between compounds in the original dataset. However, outliers in small samples are hard to detect and may severely affect the distribution learned by the classifier. We therefore propose an outlier-insensitive classifier with an effective feature selection module for the one-to-all classification task. Extensive experiments show that our model ranks compounds by inhibitory potency with high and reliable accuracy. These results demonstrate that the proposed model reliably prioritizes chemicals for experimental research and analysis through a ligand-based in silico approach.
computer science, artificial intelligence
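The abstract's key idea, ranking compounds by solving a binary classification problem on relative information between compounds, can be illustrated with a generic pairwise sketch. This is not the authors' adaptive augmentation method or their outlier-insensitive classifier, whose details the abstract does not give. It assumes a plain logistic model on feature differences, and the descriptor values, IC50 values, and function names (`make_pairs`, `train_logistic`, `rank`) are all hypothetical.

```python
import itertools
import math
import random


def make_pairs(features, potency):
    """Expand N labelled compounds into ordered pairs (i, j), i != j.

    Each pair is represented by the feature difference f_i - f_j and
    labelled 1 if compound i is more potent (lower Ki/IC50), else 0.
    Ties are skipped because they carry no ranking information.
    """
    pairs = []
    for i, j in itertools.permutations(range(len(features)), 2):
        if potency[i] == potency[j]:
            continue
        diff = [a - b for a, b in zip(features[i], features[j])]
        pairs.append((diff, 1 if potency[i] < potency[j] else 0))
    return pairs


def train_logistic(pairs, epochs=200, lr=0.1, seed=0):
    """Fit a bias-free logistic model on the pairs with plain SGD."""
    rng = random.Random(seed)
    data = list(pairs)
    w = [0.0] * len(data[0][0])
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() in range
            p = 1.0 / (1.0 + math.exp(-z))
            w = [wi + lr * (y - p) * xi for wi, xi in zip(w, x)]
    return w


def rank(features, w):
    """Return compound indices ordered from most to least potent."""
    scores = [sum(wi * xi for wi, xi in zip(w, f)) for f in features]
    return sorted(range(len(features)), key=lambda i: -scores[i])


# Hypothetical toy data: two descriptors per compound and IC50 values in nM.
features = [[0.1, 1.0], [0.4, 0.8], [0.6, 0.3], [0.9, 0.1]]
ic50 = [500.0, 120.0, 30.0, 5.0]

pairs = make_pairs(features, ic50)  # 4 compounds -> 12 ordered pairs
w = train_logistic(pairs)
order = rank(features, w)
print(len(pairs), order)
```

The pairing step is what makes a small dataset go further in this sketch: N measured compounds yield up to N(N-1) training examples, and the classifier only has to learn which compound of a pair is more potent, not the potency value itself.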