TOP: Towards Better Toxicity Prediction by Deep Molecular Representation Learning

Yuzhong Peng, Ziqiao Zhang, Qizhi Jiang, Jihong Guan, Shuigeng Zhou
DOI: https://doi.org/10.1109/bibm47256.2019.8983340
2019-01-01
Abstract: In the early stages of drug discovery, molecular toxicity prediction is crucial for excluding drug candidates that are likely to fail in clinical trials. In this paper, we present a novel molecular representation method and develop a corresponding deep learning-based framework called TOP (short for TOxicity Prediction). TOP integrates a series of special data processing methods, a bidirectional gated recurrent unit-based RNN (BiGRU), and a fully connected neural network for end-to-end molecular representation learning and chemical toxicity prediction. TOP automatically learns a mixed molecular representation from both the SMILES contextual information that describes the molecular structure and the physicochemical properties of the molecule. TOP can therefore overcome the drawbacks of existing methods that use only one of these two sources, and thus greatly improves toxicity prediction. We conducted extensive experiments on 14 classic toxicity prediction tasks from three different benchmark datasets, including both balanced and imbalanced ones. The results show that, with the help of the novel molecular representation method, TOP significantly outperforms not only three baseline machine learning methods but also five state-of-the-art methods.
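The abstract names the components but gives no implementation details, so the following is only a minimal PyTorch sketch of the general idea: a BiGRU encodes the SMILES string, its final states are concatenated with a vector of physicochemical properties, and a fully connected network produces a toxicity logit. The class name `MixedRepresentationNet`, the helper `encode_smiles`, the character-level tokenization, the layer sizes, and the random placeholder property values are all assumptions made for illustration, not details taken from the paper.

```python
import torch
import torch.nn as nn


class MixedRepresentationNet(nn.Module):
    """Hypothetical sketch: BiGRU over SMILES tokens + property vector -> FC classifier."""

    def __init__(self, vocab_size, n_props, embed_dim=32, hidden_dim=64):
        super().__init__()
        # Index 0 is reserved for padding.
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.bigru = nn.GRU(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.classifier = nn.Sequential(
            nn.Linear(2 * hidden_dim + n_props, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, token_ids, props):
        embedded = self.embed(token_ids)                   # (batch, seq_len, embed_dim)
        _, h_n = self.bigru(embedded)                      # h_n: (2, batch, hidden_dim)
        smiles_repr = torch.cat([h_n[0], h_n[1]], dim=1)   # forward + backward final states
        mixed = torch.cat([smiles_repr, props], dim=1)     # mixed molecular representation
        return self.classifier(mixed).squeeze(-1)          # one logit per molecule


def encode_smiles(smiles_list, vocab):
    """Character-level encoding with right-padding (an assumed, simplified tokenizer)."""
    max_len = max(len(s) for s in smiles_list)
    ids = [[vocab[ch] for ch in s] + [0] * (max_len - len(s)) for s in smiles_list]
    return torch.tensor(ids, dtype=torch.long)


smiles = ["CCO", "c1ccccc1"]  # ethanol and benzene, used only as toy inputs
vocab = {ch: i + 1 for i, ch in enumerate(sorted(set("".join(smiles))))}
token_ids = encode_smiles(smiles, vocab)
props = torch.randn(len(smiles), 4)  # random placeholders standing in for real descriptors

model = MixedRepresentationNet(vocab_size=len(vocab) + 1, n_props=4)
logits = model(token_ids, props)
probs = torch.sigmoid(logits)  # untrained, so these probabilities are meaningless
```

For brevity the sketch runs the GRU over padded positions as well; a fuller version would likely pack the sequences with `torch.nn.utils.rnn.pack_padded_sequence` so that padding does not affect the final forward state. A multi-task setting such as the 14 tasks mentioned above would need a wider output layer or one model per task.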