Advancements in Molecular Property Prediction: A Survey of Single and Multimodal Approaches

Tanya Liyaqat, Tanvir Ahmad, Chandni Saxena
2024-08-22
Abstract: Molecular Property Prediction (MPP) plays a pivotal role across diverse domains, spanning drug discovery, material science, and environmental chemistry. Fueled by the exponential growth of chemical data and the evolution of artificial intelligence, recent years have witnessed remarkable strides in MPP. However, the multifaceted nature of molecular data, such as molecular structures, SMILES notation, and molecular images, continues to pose a fundamental challenge in its effective representation. To address this, representation learning techniques are instrumental as they acquire informative and interpretable representations of molecular data. This article explores recent AI-based approaches in MPP, focusing on both single and multiple modality representation techniques. It provides an overview of various molecule representations and encoding schemes, categorizes MPP methods by their use of modalities, and outlines datasets and tools available for feature generation. The article also analyzes the performance of recent methods and suggests future research directions to advance the field of MPP.
Machine Learning, Materials Science, Chemical Physics, Biomolecules
What problem does this paper attempt to address?
The paper aims to address key challenges in Molecular Property Prediction (MPP) and to advance research in this field. Specifically, it tackles the following aspects:

1. **Effective Representation of Multimodal Molecular Data**: Molecular data comes in several forms, including molecular structures, SMILES notation, and molecular images. Representing these different forms effectively is a challenge. The paper explores how representation learning techniques can produce informative and interpretable representations of molecular data.
2. **Comprehensive Analysis of Unimodal and Multimodal Approaches**: The paper covers unimodal approaches (e.g., using only SMILES or only graph structures) as well as techniques that combine multiple modalities (e.g., SMILES + graph structure + images). By comparing the two families, it provides detailed insights into the various techniques.
3. **Application of Latest AI Technologies**: In recent years, artificial intelligence, especially deep learning, has made significant progress in molecular property prediction. The paper reviews the application of these technologies in drug discovery and related fields and analyzes their performance.
4. **Future Research Directions**: Finally, the paper points out challenges in current research and proposes future research directions to advance the MPP field.

Overall, through a comprehensive review of the molecular property prediction field, the paper aims to help researchers better understand existing techniques and potential areas for improvement, thereby accelerating new drug development.
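
To make the distinction between unimodal and multimodal inputs concrete, the following is a minimal illustrative sketch, not a method taken from the survey. It assumes ethanol (SMILES `CCO`) as the example molecule, a hand-written heavy-atom bond list as its graph view, and plain concatenation as a stand-in for fusion. All function names, the three-token vocabulary, and the tokenization regex are hypothetical choices made for illustration only.

```python
import re

# Assumed regex for splitting a SMILES string into tokens: bracket atoms,
# two-letter halogens, two-digit ring closures, single letters, digits,
# and common bond/branch symbols. An illustrative pattern, not a full grammar.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[=#\-\+\(\)/\\\.@:~\*\$]"
)


def tokenize_smiles(smiles):
    """Sequence view: split a SMILES string into a list of tokens."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


def token_counts(tokens, vocab):
    """Bag-of-tokens feature vector over a fixed (hypothetical) vocabulary."""
    return [tokens.count(symbol) for symbol in vocab]


def degree_histogram(n_atoms, bonds, max_degree=4):
    """Graph view: count how many atoms have degree 0, 1, ..., max_degree."""
    degree = [0] * n_atoms
    for a, b in bonds:
        degree[a] += 1
        degree[b] += 1
    hist = [0] * (max_degree + 1)
    for d in degree:
        hist[min(d, max_degree)] += 1
    return hist


# Ethanol: the same molecule described in two modalities.
smiles = "CCO"                      # sequence modality
bonds = [(0, 1), (1, 2)]            # graph modality (heavy atoms only, written by hand)

seq_features = token_counts(tokenize_smiles(smiles), vocab=["C", "N", "O"])
graph_features = degree_histogram(n_atoms=3, bonds=bonds)

# A unimodal model would consume only one of the two vectors; a very simple
# multimodal model could consume their concatenation.
fused = seq_features + graph_features
```

In practice, learned encoders would replace the hand-crafted count vectors used here, and a cheminformatics toolkit would derive the graph from the SMILES string rather than relying on a hand-written bond list.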