Molecular Similarity in Predictive Toxicology with a Focus on the q-RASAR Technique

Arkaprava Banerjee, Kunal Roy
DOI: https://doi.org/10.1007/978-1-0716-4003-6_2
Abstract: The concept of similarity is central to many in silico prediction approaches. Most of these approaches follow the similarity property principle, which states that two or more compounds with a high level of similarity are expected to show similar biological activity or physicochemical properties. Although this principle fails to predict the activity or property of certain compounds, it holds for most of the compounds in a given dataset. With the emerging need to fill data gaps efficiently in the regulatory context, Read-Across (RA), a similarity-based approach, has gained popularity because, unlike statistical approaches such as QSAR, it does not require a sizeable number of data points to train a meaningful model. The basic idea behind Read-Across is to identify the close source neighbors of a query compound and to make predictions for that compound based on similarity considerations. Although RA was originally an unsupervised prediction method, recent efforts toward quantitative Read-Across (q-RA) have introduced supervised, similarity-based weighting for quantitative predictions. RA is a useful tool in predictive toxicology, but one of its important drawbacks is the lack of interpretability of the features used to generate the Read-Across-based predictions, especially for q-RA. To bridge this gap, a novel quantitative Read-Across Structure-Activity Relationship (q-RASAR) approach has recently been proposed. It combines the concepts of QSAR and Read-Across, generating statistically reliable and predictive models from similarity and error-based descriptors. The q-RASAR models are simple and interpretable, and can be used to identify not only the essential features but also the nature of the source and query compounds. In this chapter, we discuss the concepts of RA, q-RA, and q-RASAR and various studies on them, along with some of the tools available from different research groups.
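To make the read-across idea concrete, the following is a minimal sketch of a similarity-weighted prediction. It is not the authors' implementation or any published tool: the Tanimoto similarity on sets of fingerprint bits, the choice of k, the function names, and the toy data are all assumptions made for illustration. The second helper shows the general flavor of similarity and error-style summaries that could be computed from the neighbors; it is a hypothetical example, not the descriptor set defined in the chapter.

```python
from math import sqrt


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def read_across(query, sources, k=3):
    """Similarity-weighted mean response of the k closest source compounds.

    sources: list of (fingerprint_bits, observed_response) pairs.
    Returns (prediction, neighbours), neighbours being (similarity, response).
    """
    scored = sorted(
        ((tanimoto(query, fp), y) for fp, y in sources),
        key=lambda pair: pair[0],
        reverse=True,
    )
    neighbours = scored[:k]
    total = sum(s for s, _ in neighbours)
    if total == 0:
        raise ValueError("no source compound shares any feature with the query")
    prediction = sum(s * y for s, y in neighbours) / total
    return prediction, neighbours


def neighbour_summaries(prediction, neighbours):
    """Hypothetical summaries: mean similarity and similarity-weighted spread."""
    total = sum(s for s, _ in neighbours)
    mean_sim = total / len(neighbours)
    weighted_sd = sqrt(sum(s * (y - prediction) ** 2 for s, y in neighbours) / total)
    return mean_sim, weighted_sd


# Toy data (invented): fingerprint bit sets paired with a response value.
toy_sources = [({1, 2, 3, 4}, 2.0), ({1, 2}, 4.0), ({9}, 10.0)]
toy_query = {1, 2, 3, 4}

pred, nbrs = read_across(toy_query, toy_sources, k=2)
mean_sim, spread = neighbour_summaries(pred, nbrs)
print(pred, mean_sim, spread)
```

In this sketch the dissimilar third compound is excluded by the choice of k, and the closer neighbor dominates the weighted mean. Summaries of this kind, computed per query compound, are the sort of quantity that a conventional regression model could then take as input.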