Machine learning classification of local environments in molecular crystals

Daisuke Kuroshima, Michael Kilgour, Mark E. Tuckerman, Jutta Rogal
2024-03-30
Abstract: Identifying local structural motifs and packing patterns of molecular solids is a challenging task for both simulation and experiment. We demonstrate two novel approaches to characterize local environments in different polymorphs of molecular crystals using learning models that employ either flexibly learned or handcrafted molecular representations. In the first case, we follow our earlier work on graph learning in molecular crystals, deploying an atomistic graph convolutional network, combined with molecule-wise aggregation, to enable per-molecule environmental classification. For the second model, we develop a new set of descriptors based on symmetry functions combined with a point-vector representation of the molecules, encoding information about the positions as well as relative orientations of the molecule. We demonstrate very high classification accuracy for both approaches on urea and nicotinamide crystal polymorphs, and practical applications to the analysis of dynamical trajectory data for nanocrystals and solid-solid interfaces. Both architectures are applicable to a wide range of molecules and diverse topologies, providing an essential step in the exploration of complex condensed matter phenomena.
Materials Science, Chemical Physics, Computational Physics
What problem does this paper attempt to address?
This paper addresses how to identify the local structural environment of individual molecules in molecular crystals. The authors develop two methods that characterize local environments in different polymorphs and classify them with high accuracy. This helps in understanding the structure of molecular solids at the microscopic scale and provides a tool for studying polymorphic transitions, in particular nucleation and growth.

### Main problems:

1. **Identifying local structural motifs and packing patterns in molecular solids**: It is difficult to track structural changes in condensed-phase systems at atomic resolution experimentally, so computational methods are needed to characterize these features.
2. **Handling the additional challenges of molecular systems**: The description must account not only for the positions of molecules but also for their relative orientations and conformational changes.

### Solutions:

The authors propose two parallel machine-learning methods:

1. **Learned feature embeddings from a graph neural network (GNN)**: An atomistic graph convolutional network is combined with molecule-wise aggregation to classify the environment of each molecule. This method learns the relevant features bottom-up from basic atomic information.
2. **Classification based on handcrafted descriptors**: Extending previous work, symmetry functions are combined with a point-vector representation of the molecules to build descriptors, which are fed into a multi-layer perceptron (MLP) for classification.

Both methods distinguish the local environments of different polymorphs in complex molecular solids with high accuracy and apply to a wide range of molecules and topologies, including clusters and interfaces. The models can also provide time-resolved information about melting or solid-solid transitions.

### Application examples:

- **Urea and nicotinamide crystals**: The classification performance of the models was demonstrated at different temperatures, and their ability to generalize was verified on high-temperature simulation data.
- **Dynamic structural characterization of nanocrystals**: The structural evolution of nicotinamide nanocrystals was analyzed at different temperatures.
- **Time evolution of the solid-solid phase boundary**: The position of the interface between two polymorphs of urea was tracked over time.

In conclusion, this research provides a tool for the structural analysis of molecular solids that can identify and classify local environments in bulk crystals, nanocrystals, and interfaces.
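The two approaches summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the default parameter values, and the specific functional forms (a Gaussian radial term with a cosine cutoff in the style of Behler-Parrinello symmetry functions, a dot product of orientation vectors, and mean pooling of atom embeddings) are assumptions chosen for illustration, and periodic boundary conditions are ignored.

```python
import numpy as np


def cosine_cutoff(r, r_cut):
    """Smooth cutoff: 0.5 * (cos(pi * r / r_cut) + 1) inside r_cut, 0 outside."""
    return np.where(r < r_cut, 0.5 * (np.cos(np.pi * r / r_cut) + 1.0), 0.0)


def point_vector_descriptors(centers, vectors, r_cut=6.0, eta=1.0, r_s=4.0):
    """Per-molecule [radial, orientational] features (assumed, illustrative forms).

    centers: (N, 3) molecular reference positions, e.g. centers of mass.
    vectors: (N, 3) unit vectors describing each molecule's orientation.
    """
    diff = centers[:, None, :] - centers[None, :, :]
    r = np.linalg.norm(diff, axis=-1)            # pairwise center distances
    weight = np.exp(-eta * (r - r_s) ** 2) * cosine_cutoff(r, r_cut)
    np.fill_diagonal(weight, 0.0)                # exclude each molecule's self term
    cos_ij = vectors @ vectors.T                 # relative orientation of i and j
    radial = weight.sum(axis=1)                  # position information only
    orient = (weight * cos_ij).sum(axis=1)       # position and orientation
    return np.stack([radial, orient], axis=1)


def pool_atoms_to_molecules(atom_features, mol_index, n_molecules):
    """Mean-pool per-atom embeddings into one embedding per molecule."""
    sums = np.zeros((n_molecules, atom_features.shape[1]))
    np.add.at(sums, mol_index, atom_features)    # scatter-add atoms to molecules
    counts = np.bincount(mol_index, minlength=n_molecules)[:, None]
    return sums / np.maximum(counts, 1)
```

In the handcrafted route, feature vectors like those from `point_vector_descriptors` would be passed to an MLP classifier. In the learned route, `pool_atoms_to_molecules` stands in for the molecule-wise aggregation step applied to the atom embeddings produced by the graph network. In practice, several values of `eta` and `r_s` would be used to build a longer descriptor per molecule.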