Machine Learning Approaches for Determining Molecular Packing of Organic Semiconductors: Toward Accurate Crystal Structure Prediction

Go Watanabe, Takuya Seki, Yudai Shinozaki, Ryosuke Ito, Jun Takeya, Toshihiro Okamoto, Shunsuke Sato
DOI: https://doi.org/10.26434/chemrxiv-2024-k3652
2024-05-31
Abstract: A proposed machine learning model can determine with high accuracy which of two types of two-dimensional molecular packing, herringbone or brickwork, an organic semiconductor forms. The combination of molecular simulations with the machine learning model has the potential to predict the crystal structure of organic molecules.
Chemistry
What problem does this paper attempt to address?
This paper addresses the prediction of two-dimensional (2D) molecular packing in organic semiconductors (OSCs). High-performance p-type and n-type OSCs tend to adopt one of two 2D packing motifs: herringbone (HB) or brickwork (BW). Theory cannot yet predict reliably which motif a given molecule will form, and this limits the design of new OSC molecules with high carrier mobility.

The researchers developed a machine learning model that classifies whether an OSC molecule will form HB or BW packing. They built a database of 120 OSC molecules, ran molecular mechanics (MM) calculations and molecular dynamics (MD) simulations, and used MACCS keys and Mordred descriptors as molecular features to train and validate the model. In their evaluation, the model combining LightGBM with MACCS keys distinguished HB from BW packing with high accuracy.

The researchers then used feature importance and SHAP analysis to identify the molecular fragments that influence the 2D packing. They found that certain fragments are crucial for the formation of BW packing, while methyl bridge fragments contribute to HB packing. Knowing the likely packing motif in advance reduces the number of candidate crystal structures that must be considered, which is expected to improve the accuracy of crystal structure prediction. Overall, the paper combines machine learning with molecular simulation to predict the 2D packing of OSC molecules and offers a strategy for designing new, efficient organic semiconductor materials.
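The workflow summarized above (binary substructure fingerprints fed to a gradient-boosted classifier, followed by a feature-importance readout) can be illustrated with a small, self-contained sketch. This is not the authors' code and uses none of their data: the 4-bit "fingerprints" are synthetic, the rule that bit 2 decides the label is a hypothetical stand-in for a BW-promoting fragment, and a hand-written boosting loop over one-bit decision stumps replaces LightGBM so that the example runs with the standard library alone. The names `fit_boosted_stumps`, `predict`, and `KEY_BIT` are invented for illustration.

```python
import itertools
import math
from collections import Counter

KEY_BIT = 2  # hypothetical fingerprint bit standing in for a BW-promoting fragment


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_boosted_stumps(X, y, n_rounds=20, lr=1.0):
    """Gradient boosting on the log loss with one-bit decision stumps."""
    n, d = len(X), len(X[0])
    scores = [0.0] * n
    stumps = []  # each stump: (bit index, value if bit is 0, value if bit is 1)
    for _ in range(n_rounds):
        # Negative gradient of the log loss: label minus predicted probability.
        resid = [y[i] - sigmoid(scores[i]) for i in range(n)]
        best = None
        for j in range(d):
            on = [resid[i] for i in range(n) if X[i][j] == 1]
            off = [resid[i] for i in range(n) if X[i][j] == 0]
            if not on or not off:
                continue  # this bit cannot split the data
            m_on, m_off = sum(on) / len(on), sum(off) / len(off)
            # Squared-error reduction from fitting the residuals with this
            # split, relative to predicting zero everywhere.
            gain = len(on) * m_on ** 2 + len(off) * m_off ** 2
            if best is None or gain > best[0]:
                best = (gain, j, m_off, m_on)
        if best is None:
            break
        _, j, m_off, m_on = best
        stumps.append((j, lr * m_off, lr * m_on))
        for i in range(n):
            scores[i] += lr * (m_on if X[i][j] else m_off)
    return stumps


def predict(stumps, x):
    """Return 1 (BW) if the summed stump outputs are positive, else 0 (HB)."""
    score = sum(v_on if x[j] else v_off for j, v_off, v_on in stumps)
    return 1 if score > 0 else 0


# Synthetic data: every 4-bit fingerprint, labeled BW (1) when KEY_BIT is set.
X = list(itertools.product([0, 1], repeat=4))
y = [x[KEY_BIT] for x in X]

stumps = fit_boosted_stumps(X, y)
accuracy = sum(predict(stumps, x) == t for x, t in zip(X, y)) / len(X)

# Split-count importance: how often each bit was chosen for a stump.
importance = Counter(j for j, _, _ in stumps)

print("training accuracy:", accuracy)
print("split counts per bit:", dict(importance))
```

On real molecules, the fingerprint step would typically be RDKit's `MACCSkeys.GenMACCSKeys` and the classifier `lightgbm.LGBMClassifier`. Split-count importance as used here is only one of several importance measures and is not a substitute for the SHAP analysis described in the summary.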