From Local Atomic Environments to Molecular Information Entropy

Alexander Croy
2024-01-17
Abstract: The similarity of local atomic environments is an important concept in many machine-learning techniques used in computational chemistry and materials science. Here, we present and discuss a connection between the information entropy and the similarity matrix of a molecule. The resulting entropy can be used as a measure of the complexity of a molecule. As examples, we introduce and evaluate two specific choices for defining the similarity: one is based on a SMILES representation of local substructures and the other is based on the SOAP kernel. By tuning the sensitivity of the latter, we can achieve good agreement between the respective entropies. Finally, we consider the entropy of two molecules in a mixture. The gain of entropy due to the mixing can be used as a similarity measure of the molecules. We compare this measure to the average and the best-match kernel. The results indicate a connection between the different approaches and demonstrate the usefulness and broad applicability of the similarity-based entropy approach.
Chemical Physics, Materials Science
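The abstract does not state the entropy formula, so the following is only a minimal sketch of how an entropy could be computed from a similarity matrix. It assumes one common similarity-sensitive form, H = -(1/N) Σ_i log((1/N) Σ_j K_ij), with equally weighted environments and similarities in [0, 1]; the paper's actual definition may differ. The names `similarity_entropy` and `mixing_gain`, and the choice of a size-weighted average as the unmixed reference, are hypothetical and serve illustration only.

```python
import numpy as np


def similarity_entropy(K):
    """Entropy of N equally weighted environments (assumed form, see lead-in).

    K is an N x N similarity matrix with entries in [0, 1] and ones on the
    diagonal, so every row mean is at least 1/N and the logarithm is finite.
    """
    K = np.asarray(K, dtype=float)
    # Average similarity of each environment to all environments, itself included.
    mean_similarity = K.mean(axis=1)
    return float(-np.mean(np.log(mean_similarity)))


def mixing_gain(K_aa, K_bb, K_ab):
    """Entropy gained by pooling the environments of molecules A and B.

    Assumption: the unmixed reference is the size-weighted average of the
    two single-molecule entropies.
    """
    K_aa, K_bb, K_ab = (np.asarray(M, dtype=float) for M in (K_aa, K_bb, K_ab))
    n_a, n_b = K_aa.shape[0], K_bb.shape[0]
    # Joint similarity matrix of the mixture, built from the four blocks.
    K_mix = np.block([[K_aa, K_ab], [K_ab.T, K_bb]])
    h_separate = (
        n_a * similarity_entropy(K_aa) + n_b * similarity_entropy(K_bb)
    ) / (n_a + n_b)
    return similarity_entropy(K_mix) - h_separate


# Limiting cases: identical environments carry no entropy,
# while N mutually dissimilar environments give log(N).
h_identical = similarity_entropy(np.ones((3, 3)))
h_distinct = similarity_entropy(np.eye(3))
```

Under these assumptions the mixing gain vanishes when every environment of one molecule is fully similar to every environment of the other, and grows as the cross-similarities in `K_ab` shrink, which is the sense in which a gain of this kind can act as a (dis)similarity measure between molecules.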