Molecular Similarity: Theory, Applications, and Perspectives

Ramon Miranda-Quintana, Kenneth Lopez Perez, Lexin Chen, Juan Avellaneda Tamayo, Edgar Lopez Lopez, K. Euridice Juarez Mercado, Jose Luis Medina Franco
DOI: https://doi.org/10.26434/chemrxiv-2023-cs3wb
2023-11-24
Abstract: Molecular similarity pervades much of our understanding and rationalization of chemistry. This has become particularly evident in the current data-intensive era of chemical research, with similarity measures serving as the backbone of many Machine Learning (ML) supervised and unsupervised procedures. Here, we present a discussion on the role of molecular similarity in drug design, chemical space exploration, chemical “art” generation, molecular representations, and many more. We also discuss more recent topics in molecular similarity, like the ability to efficiently compare large molecular libraries.
Chemistry
What problem does this paper attempt to address?
The paper discusses the theory, applications, and prospects of molecular similarity across chemistry, drug design, chemical space exploration, and chemical art generation. The authors describe how similarity measures underpin many machine learning procedures, particularly in drug discovery tasks such as molecular screening and chemical space exploration. The paper also touches on newer topics, such as the efficient comparison of large molecular libraries. It points out that similarity is central to human cognition and essential for scientific classification and for understanding chemical phenomena.

Quantifying molecular similarity requires choosing both a molecular descriptor and a similarity measure, and the comparison can be based on different chemical properties such as functional groups, structure, or bioactivity. Molecular fingerprints, linear representations (such as SMILES and InChI), and other molecular attributes are tools for characterizing molecules and measuring similarity. The paper also emphasizes that similarity is relative and subjective: it depends on the objective of the comparison, the time, and the environment. It concludes by surveying practical applications of molecular similarity, including drug discovery, database mining, and biomedical research, which involve structure-property relationship analysis, similarity searching, and the characterization of compound libraries. Through these applications, scientists can predict the properties of compounds, optimize drug design, and gain new insights in chemistry and biology.
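The two choices described above, a descriptor and a similarity measure, can be illustrated with a minimal sketch. It assumes binary fingerprints stored as sets of on-bit indices and uses the Tanimoto (Jaccard) coefficient, a common measure for such fingerprints. This is an illustration only, not the paper's own method: the function names, the 0.5 threshold, and the toy fingerprints are hypothetical and do not correspond to real molecules.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(fp_a: Set[int], fp_b: Set[int]) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        # Convention assumed here: two empty fingerprints count as identical.
        return 1.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def similarity_search(
    query: Set[int], library: Dict[str, Set[int]], threshold: float
) -> List[Tuple[str, float]]:
    """Return library entries whose similarity to the query meets the threshold,
    sorted from most to least similar."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Toy fingerprints (hypothetical bit indices, not derived from real structures).
query_fp = {1, 4, 7, 9}
toy_library = {
    "mol_a": {1, 4, 7, 9},   # identical to the query
    "mol_b": {1, 4, 7, 12},  # shares three of five distinct bits
    "mol_c": {2, 3, 5},      # shares no bits
}

hits = similarity_search(query_fp, toy_library, threshold=0.5)
for name, score in hits:
    print(f"{name}: {score:.2f}")
```

Changing either the descriptor (which bits are set) or the measure (Tanimoto versus another coefficient) can change the ranking, which is one concrete sense in which similarity depends on the objective of the comparison.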