Graph signal processing based nonlinear QSAR/QSPR model learning for compounds

Xiaoying Song, Gaoya Wen, Li Chai
DOI: https://doi.org/10.1016/j.bspc.2024.106011
IF: 5.1
2024-05-01
Biomedical Signal Processing and Control
Abstract: Learning QSAR/QSPR models is crucial for predicting the physicochemical properties and biological activities of compounds. For compounds with the same or similar structures, the molecular descriptors that describe the molecular structure are likely to be degenerate, so that these compounds cannot be distinguished from one another, which in turn causes modeling to fail. In this paper, we construct a two-layer graph structure to design new feature descriptors with high discriminability, and propose a graph signal processing (GSP)-based method for learning nonlinear QSAR/QSPR models of compounds. Treating the molecular descriptors computed from the molecular graph as a high-dimensional signal, we construct a higher-level graph structure on top of the molecular graph, called a compound graph, that describes the similarity between different compounds. We use local complex network metrics to quantify the local topological information at each vertex (compound) of the compound graph; these metrics serve naturally as new discriminative feature descriptors, from which we learn two types of nonlinear QSAR/QSPR models. Three datasets of compounds with the same or similar molecular graphs are used to model biological activity, entropy, and the octanol–water partition coefficient, respectively, and the experimental results validate the effectiveness of the proposed method.
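The two-layer idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the compound graph is built or which local metrics are used, so the Gaussian-kernel k-nearest-neighbour graph, the choice of degree, strength, and clustering coefficient, the function names `compound_graph` and `local_metrics`, and the synthetic descriptor matrix are all assumptions made for illustration.

```python
import numpy as np


def compound_graph(X, k=3, sigma=1.0):
    """Build a symmetric k-NN similarity graph over compounds (rows of X).

    Assumption: Gaussian-kernel weights on Euclidean descriptor distance;
    the paper may use a different similarity measure.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances between descriptor vectors.
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    # Keep the k strongest edges per vertex, then symmetrize the edge set.
    nearest = np.argsort(-W, axis=1)[:, :k]
    mask = np.zeros_like(W, dtype=bool)
    mask[np.arange(n)[:, None], nearest] = True
    mask = mask | mask.T
    return W * mask


def local_metrics(W):
    """Per-vertex local metrics: degree, strength, clustering coefficient.

    Assumption: these three metrics stand in for the unspecified
    "local complex network metrics" of the abstract.
    """
    A = (W > 0).astype(float)                    # unweighted adjacency
    degree = A.sum(axis=1)
    strength = W.sum(axis=1)                     # sum of incident edge weights
    triangles = np.diag(A @ A @ A) / 2.0         # triangles through each vertex
    possible = degree * (degree - 1) / 2.0       # neighbour pairs per vertex
    clustering = np.divide(
        triangles, possible,
        out=np.zeros_like(triangles), where=possible > 0,
    )
    return np.column_stack([degree, strength, clustering])


# Synthetic stand-in for a descriptor matrix: 6 compounds, 4 descriptors.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(6, 4))
features = local_metrics(compound_graph(X_demo, k=3))
print(features.shape)
```

Each row of `features` describes one compound by its position in the compound graph rather than by its raw descriptors, and could then be passed to any nonlinear regressor.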
engineering, biomedical