QUBO-inspired Molecular Fingerprint for Chemical Property Prediction

Koichiro Yawata, Yoshihiro Osakabe, Takuya Okuyama, Akinori Asahara
DOI: https://doi.org/10.1109/BigData55660.2022.10020236
2023-03-17
Abstract: Molecular fingerprints are widely used for predicting chemical properties, and selecting appropriate fingerprints is important. We generate new fingerprints based on the assumption that a more effective fingerprint yields better prediction performance. Specifically, we generate effective interaction fingerprints, which are products of multiple base fingerprints. Evaluating all combinations of interaction fingerprints is difficult because of computational limitations. To address this problem, we transform the search for more effective interaction fingerprints into a quadratic unconstrained binary optimization (QUBO) problem. In this study, we found effective interaction fingerprints using the QM9 dataset.
Machine Learning, Logic in Computer Science, Biomolecules
What problem does this paper attempt to address?
The main goal of this paper is to develop a new method for generating molecular fingerprints based on the Quadratic Unconstrained Binary Optimization (QUBO) problem, in order to improve the accuracy of chemical property predictions. Specifically, the authors attempt to address the following issues:

1. **Effective Generation of Molecular Fingerprints**: Existing molecular fingerprints may not adequately capture the complex determinants of chemical properties. By generating new interaction fingerprints (i.e., products of multiple base fingerprints), the authors aim to find fingerprints that describe the relationship between molecular structure and chemical properties more effectively.
2. **Overcoming Combinatorial Optimization Challenges**: Evaluating every possible interaction fingerprint is computationally difficult because the number of combinations of base fingerprints is very large. To tackle this, the authors transform the search for more effective interaction fingerprints into a QUBO problem, which can then be solved within a reasonable time frame using techniques such as annealing machines.
3. **Application of QUBO Decision Trees**: The paper introduces a method called "QUBO Decision Trees," which combines decision trees with a QUBO formulation to identify fingerprints that reduce prediction error. The method handles interactions between features and selects fingerprints by minimizing the Squared Weighted Mean Squared Error (SWMSE).
4. **Experimental Validation**: The authors ran experiments on the QM9 dataset to verify whether the proposed QUBO Decision Tree method can discover interaction fingerprints that reduce the error in predicting chemical properties compared to single fingerprints.
Through the above research, the paper aims to provide a more efficient and accurate method for predicting chemical properties, especially when dealing with complex molecular structures.
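To make the two central ideas concrete, here is a minimal Python sketch. It rests on assumptions that are not taken from the paper: the helper names (`interaction_fingerprint`, `qubo_energy`, `solve_qubo_brute_force`), the toy molecules, and the coefficients in `Q` are all invented for illustration. The paper derives its QUBO from a prediction-error objective and targets annealing machines; this sketch uses made-up coefficients and plain brute-force enumeration instead, which only works for a handful of variables.

```python
from itertools import product


def interaction_fingerprint(base_bits, selected):
    """Product (logical AND) of the selected base fingerprint bits of one molecule."""
    result = 1
    for i in selected:
        result *= base_bits[i]
    return result


def qubo_energy(Q, x):
    """Energy sum_{i<=j} Q[i, j] * x[i] * x[j] for a binary vector x."""
    return sum(w * x[i] * x[j] for (i, j), w in Q.items())


def solve_qubo_brute_force(Q, n):
    """Enumerate all 2**n binary vectors and return the lowest-energy one."""
    best = min(product((0, 1), repeat=n), key=lambda x: qubo_energy(Q, x))
    return best, qubo_energy(Q, best)


# Hypothetical QUBO over three base fingerprints. Variable x[i] = 1 means
# "base fingerprint i is part of the product". Diagonal entries act as
# per-fingerprint costs, off-diagonal entries as pairwise couplings.
# These numbers are illustrative only, not derived from any dataset.
Q = {
    (0, 0): -1, (1, 1): -1, (2, 2): 2,
    (0, 1): -2, (0, 2): 1, (1, 2): 1,
}

best, energy = solve_qubo_brute_force(Q, 3)
selected = [i for i, bit in enumerate(best) if bit == 1]

# Toy molecules, each described by three binary base fingerprint bits.
molecules = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
]
features = [interaction_fingerprint(m, selected) for m in molecules]

print(best, energy)   # lowest-energy selection and its energy
print(features)       # interaction fingerprint value per molecule
```

The interaction fingerprint is 1 only for molecules in which every selected base fingerprint is present, which is what allows a single feature to express a conjunction of substructures. Brute force scales as 2^n in the number of base fingerprints, which is the cost that motivates handing the QUBO to a dedicated solver.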