Conformal Selection for Efficient and Accurate Compound Screening in Drug Discovery

Tian Bai, Peng Tang, Yuting Xu, Vladimir Svetnik, Abbas Khalili, Xiang Yu, Archer Yang
DOI: https://doi.org/10.26434/chemrxiv-2024-pf3ph
2024-11-01
Abstract: In drug discovery, compound screening based on manual assessment is vulnerable to bias, while existing methods lack robust risk control. To address these challenges, we introduce conformal selection as an approach to compound screening that balances risks and benefits. Leveraging conformal inference, our approach constructs a p-value for each candidate molecule to quantify the statistical evidence for selecting it. The final selection is determined by comparing these p-values against thresholds derived from multiple testing principles. Our approach offers rigorous control of the false discovery rate, with validity that does not depend on dataset size and requires minimal assumptions. By avoiding the estimation of prediction errors required in previous approaches, our method achieves higher accuracy (power), improving the ability to identify promising candidates. Furthermore, our method demonstrates superior computational efficiency. We validate these advantages through numerical simulations on real-world datasets.
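The two steps named in the abstract, conformal p-values followed by a multiple-testing threshold, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a held-out calibration set of compounds with measured activities, a pre-trained model that supplies the predictions, a residual-style score V(x, y) = y - prediction, and the Benjamini-Hochberg procedure as the multiple-testing step (the abstract does not name a specific procedure). All function and parameter names are hypothetical.

```python
import numpy as np

def conformal_p_values(calib_scores, test_scores):
    """Conformal p-value per candidate: (1 + #{i : V_i < V_hat_j}) / (n + 1)."""
    calib_sorted = np.sort(np.asarray(calib_scores, dtype=float))
    test_scores = np.asarray(test_scores, dtype=float)
    n = calib_sorted.size
    # side="left" counts calibration scores strictly below each test score.
    below = np.searchsorted(calib_sorted, test_scores, side="left")
    return (1.0 + below) / (n + 1.0)

def benjamini_hochberg(p_values, alpha):
    """Indices selected by the Benjamini-Hochberg step-up rule at level alpha."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = np.nonzero(p[order] <= thresholds)[0]
    if passed.size == 0:
        return np.array([], dtype=int)
    k = passed[-1] + 1  # largest rank whose sorted p-value meets its threshold
    return np.sort(order[:k])

def conformal_select(calib_pred, calib_y, test_pred, cutoff, alpha=0.1):
    """Select candidates whose unobserved activity is claimed to exceed `cutoff`.

    Assumed score: V(x, y) = y - prediction, which is monotone in y.
    Calibration scores use the measured activity; test scores plug in the cutoff.
    """
    calib_scores = np.asarray(calib_y, dtype=float) - np.asarray(calib_pred, dtype=float)
    test_scores = cutoff - np.asarray(test_pred, dtype=float)
    p = conformal_p_values(calib_scores, test_scores)
    return benjamini_hochberg(p, alpha), p
```

A candidate with a high predicted activity gets a small test score, so few calibration scores fall below it and its p-value is small. Under this sketch's assumptions, the false discovery rate guarantee relies on the calibration and candidate compounds being exchangeable and on the model being trained without the calibration data.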
Chemistry
What problem does this paper attempt to address?