Machine Learning Enables Highly Accurate Predictions of Photophysical Properties of Organic Fluorescent Materials: Emission Wavelengths and Quantum Yields

Cheng-Wei Ju, Hanzhi Bai, Bo Li, Rizhang Liu
DOI: https://doi.org/10.1021/acs.jcim.0c01203
IF: 6.162
2021-02-23
Journal of Chemical Information and Modeling
Abstract: The development of functional organic fluorescent materials calls for fast and accurate predictions of photophysical parameters for processes such as high-throughput virtual screening, while the task is challenged by the limitations of quantum mechanical calculations. We establish a database covering >4300 solvated organic fluorescent dyes with 3000 distinct compounds and develop a new machine learning approach aimed at efficient and accurate predictions of emission wavelength and photoluminescence quantum yield (PLQY). Our feature engineering has given rise to a functionalized structure descriptor (FSD) and a comprehensive general solvent descriptor (CGSD), whereby a highly black-box computational framework is realized with consistently good accuracy across different dye families, ability of describing substitution effects and solvent effects, efficiency for large-scale predictions, and workability with on-the-fly learning. Evaluations with unseen molecules suggest a remarkable mean absolute error of 0.13 for PLQY and 0.080 eV for emission energy, the latter comparable to time-dependent density functional theory (TD-DFT) calculations. An online prediction platform was constructed based on the ensemble model to make predictions in various solvents. Our statistical learning methodology will complement quantum mechanical calculations as an efficient alternative approach for the prediction of these parameters. The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.0c01203. Supporting Information: detailed information about the method, ensemble learning model, solvent effect, and model performance (PDF).
chemistry, multidisciplinary, medicinal, computer science, interdisciplinary applications, information systems
What problem does this paper attempt to address?
The paper addresses the difficulty of predicting the photophysical properties of organic fluorescent materials, specifically emission wavelength and photoluminescence quantum yield (PLQY), quickly and accurately enough for tasks such as high-throughput virtual screening. These predictions have traditionally relied on quantum mechanical calculations, which are costly and complex, especially when solvent effects and molecular substitution effects must be taken into account. The authors therefore set out to build a black-box computational framework that takes simple input, requires no prior knowledge of the photophysical processes, gives consistently good accuracy across different dye families, describes substitution effects and the necessary solvent effects, and is suitable for large-scale predictions. To this end, they constructed a dataset of over 4,300 experimental samples (about 3,000 distinct compounds) and developed a new machine learning method built on two descriptors: a functionalized structure descriptor (FSD) and a comprehensive general solvent descriptor (CGSD). Evaluated on unseen molecules, the method reaches a mean absolute error of 0.13 for PLQY and 0.080 eV for emission energy, the latter comparable to time-dependent density functional theory (TD-DFT) calculations, while requiring considerably less time and computational resources than TD-DFT-based methods. Finally, the authors built an online prediction platform, based on the ensemble model, that lets users make predictions in various solvents. The statistical learning method is intended to complement quantum mechanical calculations as an efficient alternative for predicting these parameters.
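The emission accuracy is reported in energy units (eV), while dyes are usually discussed by emission wavelength (nm). As a minimal sketch, not the authors' code, the following shows the standard conversion E = hc/λ, a plain mean absolute error, and the wavelength window that a given energy error spans around a chosen wavelength. All function names are hypothetical, and the numbers in the usage lines are made-up illustrations rather than data from the paper.

```python
# Illustrative sketch only: helper names are hypothetical, values are not from the paper.

HC_EV_NM = 1239.841984  # Planck constant times speed of light, in eV*nm


def wavelength_nm_to_energy_ev(wavelength_nm: float) -> float:
    """Convert an emission wavelength in nm to a photon energy in eV."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return HC_EV_NM / wavelength_nm


def energy_ev_to_wavelength_nm(energy_ev: float) -> float:
    """Convert a photon energy in eV to a wavelength in nm."""
    if energy_ev <= 0:
        raise ValueError("energy must be positive")
    return HC_EV_NM / energy_ev


def mean_absolute_error(predicted: list[float], measured: list[float]) -> float:
    """Average of |predicted - measured| over paired values."""
    if not predicted or len(predicted) != len(measured):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - m) for p, m in zip(predicted, measured)) / len(predicted)


def wavelength_window_nm(center_nm: float, energy_error_ev: float) -> tuple[float, float]:
    """Wavelengths corresponding to the center energy plus and minus an energy error."""
    center_ev = wavelength_nm_to_energy_ev(center_nm)
    shortest = energy_ev_to_wavelength_nm(center_ev + energy_error_ev)
    longest = energy_ev_to_wavelength_nm(center_ev - energy_error_ev)
    return shortest, longest


# Usage with made-up emission energies (eV) for two hypothetical dyes.
example_mae = mean_absolute_error([2.5, 2.0], [2.4, 2.2])

# What an error of 0.080 eV means around 500 nm.
low_nm, high_nm = wavelength_window_nm(500.0, 0.080)
print(f"MAE = {example_mae:.2f} eV; window = {low_nm:.0f} to {high_nm:.0f} nm")
```

Because energy is inversely proportional to wavelength, the same energy error covers a wider wavelength range for red-emitting dyes than for blue-emitting ones; around 500 nm, 0.080 eV corresponds to roughly 16 nm on either side.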