Conformer-based Multiple-Instance Learning for Predicting Biodegradability Classification

Qi Yao Yim
DOI: https://doi.org/10.26434/chemrxiv-2024-wbdbn-v2
2024-08-27
Abstract: In silico methods are becoming increasingly reliable tools for replicating and extending experimental findings on chemical biodegradability. Rules extracted from quantitative structure-activity relationships (QSARs) have the potential to aid the understanding of biodegradation. Using semi-empirical quantum chemical calculations, this work studies a conformer-based augmentation approach, together with dimensionality reduction methods, as a means of improving model accuracy and applicability. It highlights molecular features, ranging from graph-based features and 3-dimensional structural descriptors to direct graph-based learning methods, that can be used to distinguish readily biodegradable compounds, and examines the role of unsupervised pre-processing in refining the training set and the choice of features.
Chemistry