Coherent Collections of Rules Describing Exceptional Materials Identified with a Multi-Objective Optimization of Subgroups

Lucas Foppa, Matthias Scheffler
2024-09-25
Abstract: Useful materials are often statistically exceptional, and they might be overlooked by AI models that attempt to describe all materials simultaneously. These global models perform well for the majority of (useless) materials, but they do not necessarily capture the useful ones. Subgroup discovery (SGD) identifies rules describing subsets of materials (subgroups, SGs) associated with exceptional (e.g., high) values of a materials property of interest. Thus, SGD can capture exceptional materials better than most widely used AI techniques. Previous work focused on the SG that maximizes an objective function establishing a single tradeoff between the size of the SG and the exceptionality of the distribution of property values in the SG. However, this optimization does not give a unique solution: many SGs typically have similar objective-function values. Here, we identify a Pareto region of SGs presenting a multitude of size-exceptionality tradeoffs. The approach is demonstrated by learning rules that describe perovskites with high bulk modulus. These rules are used to screen a large space of perovskites and to efficiently identify materials with bulk modulus up to 13% higher than the highest value in the training set.
Materials Science
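The size-exceptionality tradeoff described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `Subgroup` type, the rule names, the numeric scores, and the weighted-product objective `coverage**alpha * exceptionality` are all hypothetical, chosen only to show how a single objective picks one subgroup while a Pareto analysis keeps every non-dominated tradeoff.

```python
from typing import NamedTuple


class Subgroup(NamedTuple):
    rule: str              # hypothetical rule label
    coverage: float        # fraction of the data set selected by the rule (size)
    exceptionality: float  # hypothetical score of how unusual the property values are


def pareto_front(subgroups):
    """Return the subgroups not dominated in (coverage, exceptionality), both maximized."""
    front = []
    for s in subgroups:
        dominated = any(
            o.coverage >= s.coverage
            and o.exceptionality >= s.exceptionality
            and (o.coverage > s.coverage or o.exceptionality > s.exceptionality)
            for o in subgroups
        )
        if not dominated:
            front.append(s)
    return sorted(front, key=lambda s: s.coverage)


def best_for(subgroups, alpha):
    """Pick the single subgroup maximizing an assumed objective coverage**alpha * exceptionality."""
    return max(subgroups, key=lambda s: s.coverage ** alpha * s.exceptionality)


# Made-up candidates for illustration only.
candidates = [
    Subgroup("rule_a", 0.50, 0.20),
    Subgroup("rule_b", 0.30, 0.45),
    Subgroup("rule_c", 0.10, 0.80),
    Subgroup("rule_d", 0.25, 0.40),  # dominated by rule_b
    Subgroup("rule_e", 0.05, 0.70),  # dominated by rule_c
]

front_rules = [s.rule for s in pareto_front(candidates)]
print(front_rules)
for alpha in (0.2, 1.0, 3.0):
    print(alpha, best_for(candidates, alpha).rule)
```

In this toy data, each choice of `alpha` selects a different subgroup, and all of the selected subgroups lie on the Pareto front, which is why inspecting the whole front can be more informative than fixing one tradeoff in advance.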