Learning with uncertainty to accelerate the discovery of histone lysine-specific demethylase 1A (KDM1A/LSD1) inhibitors

Dong Wang,Zhenxing Wu,Chao Shen,Lingjie Bao,Hao Luo,Zhe Wang,Hucheng Yao,De-Xin Kong,Cheng Luo,Tingjun Hou
DOI: https://doi.org/10.1093/bib/bbac592
IF: 9.5
2023-01-01
Briefings in Bioinformatics
Abstract:Machine learning including modern deep learning models has been extensively used in drug design and screening. However, reliable prediction of molecular properties is still challenging when exploring out-of-domain regimes, even for deep neural networks. Therefore, it is important to understand the uncertainty of model predictions, especially when the predictions are used to guide further experiments. In this study, we explored the utility and effectiveness of evidential uncertainty in compound screening. The evidential Graphormer model was proposed for uncertainty-guided discovery of KDM1A/LSD1 inhibitors. The benchmarking results illustrated that (i) Graphormer exhibited comparative predictive power to state-of-the-art models, and (ii) evidential regression enabled well-ranked uncertainty estimates and calibrated predictions. Subsequently, we leveraged time-splitting on the curated KDM1A/LSD1 dataset to simulate out-of-distribution predictions. The retrospective virtual screening showed that the evidential uncertainties helped reduce false positives among the top-acquired compounds and thus enabled higher experimental validation rates. The trained model was then used to virtually screen an independent in-house compound set. The top 50 compounds ranked by two different ranking strategies were experimentally validated, respectively. In general, our study highlighted the importance to understand the uncertainty in prediction, which can be recognized as an interpretable dimension to model predictions.
What problem does this paper attempt to address?