Optimization of a Multi-Feature AI Ensemble and Voting System for MAPKAPK2 Inhibitor Discovery

Hayden Chen
DOI: https://doi.org/10.1101/2024.11.26.625342
2024-12-02
Abstract: Identifying an effective inhibitor is an essential starting point in drug discovery. Unfortunately, conventional high-throughput screening methods suffer from many issues, so new strategies are needed to filter large compound screening libraries into smaller, target-focused libraries. Effective computational methods for this purpose have emerged over the past decade or so, machine learning among them. Herein, we explore an ensemble deep learning model trained on MAPKAPK2 bioactivity data. The ensemble consists of ten individual models, each trained on different features and optimized for MAPKAPK2 inhibitor identification. Voting systems were established alongside the model. Using these voting systems, the ensemble achieved an accuracy of 0.969 and a precision of 0.964 on a test set, and a false positive rate of 0.014 on a set of inactive compounds. These metrics indicate an effective initial step toward novel MAPKAPK2 inhibitor identification and subsequent drug development, with applicability to other kinase targets.
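The abstract does not say how the voting systems combine the ten models' outputs. As a minimal sketch, assuming a simple threshold vote over binary active/inactive calls, the voting step and the three reported metrics might look like the following. The function names, the `min_votes` parameter, and the toy data are hypothetical illustrations, not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def ensemble_vote(per_model_preds: Sequence[Sequence[int]], min_votes: int) -> List[int]:
    """Label a compound active (1) when at least `min_votes` models call it active.

    `per_model_preds[m][i]` is model m's 0/1 call for compound i.
    The threshold rule is an assumption; the source does not specify one.
    """
    return [1 if sum(calls) >= min_votes else 0 for calls in zip(*per_model_preds)]


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of compounds labelled correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predicted actives that are truly active."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def false_positive_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of true inactives that were labelled active."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred)
    return fp / (fp + tn) if (fp + tn) else 0.0


# Toy data (hypothetical): three models, four compounds.
toy_preds = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
]
toy_labels = [1, 0, 1, 0]

majority = ensemble_vote(toy_preds, min_votes=2)
print(majority, accuracy(toy_labels, majority), false_positive_rate(toy_labels, majority))
```

Raising `min_votes` makes the ensemble stricter: fewer compounds are called active, which tends to lower the false positive rate at the cost of missing some true actives.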
Bioinformatics