MolSets: Molecular Graph Deep Sets Learning for Mixture Property Modeling

Hengrui Zhang, Jie Chen, James M. Rondinelli, Wei Chen
DOI: https://doi.org/10.1103/PRXEnergy.3.023006
2023-12-27
Abstract: Recent advances in machine learning (ML) have expedited materials discovery and design. One significant challenge faced in ML for materials is the expansive combinatorial space of potential materials formed by diverse constituents and their flexible configurations. This complexity is particularly evident in molecular mixtures, a frequently explored space for materials such as battery electrolytes. Owing to the complex structures of molecules and the sequence-independent nature of mixtures, conventional ML methods have difficulties in modeling such systems. Here we present MolSets, a specialized ML model for molecular mixtures. Representing individual molecules as graphs and their mixture as a set, MolSets leverages a graph neural network and the deep sets architecture to extract information at the molecule level and aggregate it at the mixture level, thus addressing local complexity while retaining global flexibility. We demonstrate the efficacy of MolSets in predicting the conductivity of lithium battery electrolytes and highlight its benefits in virtual screening of the combinatorial chemical space.
Machine Learning, Materials Science
What problem does this paper attempt to address?
The paper addresses the difficulty that conventional machine learning methods have in predicting the properties of molecular mixtures such as battery electrolytes. Two features of these systems cause the difficulty: each molecule has a complex structure, and a mixture is an unordered collection whose components can be listed in any sequence and combined in a very large number of ways. The paper therefore asks how to represent a mixture so that a model can predict its properties across this combinatorial chemical space, using the conductivity of lithium battery electrolytes as the test case. Its answer is MolSets, a model that represents each molecule as a graph and the mixture as a set. A graph neural network (GNN) extracts information at the molecule level, and a deep sets architecture aggregates it at the mixture level, so the prediction does not depend on the order in which the components are listed (permutation invariance).
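The two-level idea described above (embed each molecule from its graph, then pool the embeddings over the set) can be illustrated with a minimal pure-Python sketch. This is not the MolSets implementation: the neighbor-averaging step stands in for a trained GNN, the fraction-weighted sum is one assumed choice of permutation-invariant pooling, and the toy graphs, feature values, and readout weights are all hypothetical.

```python
import math


def embed_molecule(node_features, adjacency, rounds=2):
    """Toy stand-in for a GNN: average each atom with its neighbors,
    apply a nonlinearity, then mean-pool over atoms."""
    h = [list(f) for f in node_features]
    for _ in range(rounds):
        new_h = []
        for i, row in enumerate(adjacency):
            neighbors = [j for j, bonded in enumerate(row) if bonded]
            agg = list(h[i])
            for j in neighbors:
                agg = [x + y for x, y in zip(agg, h[j])]
            scale = 1 + len(neighbors)
            new_h.append([math.tanh(x / scale) for x in agg])
        h = new_h
    dim = len(h[0])
    return [sum(node[k] for node in h) / len(h) for k in range(dim)]


def predict_mixture(mixture, readout_weights):
    """Deep-sets style prediction: sum fraction-weighted molecule
    embeddings, then apply a readout to the pooled vector.

    mixture: list of (node_features, adjacency, fraction) tuples.
    """
    pooled = [0.0] * len(readout_weights)
    for node_features, adjacency, fraction in mixture:
        emb = embed_molecule(node_features, adjacency)
        pooled = [p + fraction * e for p, e in zip(pooled, emb)]
    # Hypothetical readout: a fixed linear map over tanh(pooled).
    return sum(w * math.tanh(p) for w, p in zip(readout_weights, pooled))


# Hypothetical molecules: a 3-atom chain and a 2-atom molecule,
# each atom described by two made-up features.
chain = ([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],
         [[0, 1, 0], [1, 0, 1], [0, 1, 0]])
dimer = ([[0.5, 0.5], [0.2, 0.8]],
         [[0, 1], [1, 0]])

weights = [0.7, -0.3]  # hypothetical readout weights
mix_forward = [(*chain, 0.6), (*dimer, 0.4)]
mix_reversed = [(*dimer, 0.4), (*chain, 0.6)]

y_forward = predict_mixture(mix_forward, weights)
y_reversed = predict_mixture(mix_reversed, weights)

# Listing the components in a different order should not change the output.
print(math.isclose(y_forward, y_reversed, rel_tol=1e-12))
```

Because the pooling step is a sum, reordering the components leaves the pooled vector unchanged up to floating-point rounding, which is the property the summary calls permutation invariance. A model that instead concatenated the component embeddings in list order would not have this property.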