i-SISSO: Mutual information-based improved sure independent screening and sparsifying operator algorithm

Yuqin Xu, Quan Qian
DOI: https://doi.org/10.1016/j.engappai.2022.105442
IF: 8
2022-11-01
Engineering Applications of Artificial Intelligence
Abstract: Symbolic regression is a method for extracting quantitative expressions from a dataset and has already been applied in different research fields. The sure independence screening and sparsifying operator (SISSO) is a powerful data-driven method that is widely applied in materials research and other fields; it can handle large candidate descriptor spaces and generate highly accurate prediction models in the form of equations. However, SISSO suffers from combinatorial explosion when it enumerates every candidate combination to determine the sparse representation under an L0-norm constraint. This study uses the Max-Relevance and Min-Redundancy (mRMR) algorithm to mitigate the combinatorial explosion in SISSO by reducing the candidate descriptor space before the sparsifying operator (SO) phase, which shrinks the search space without losing excessive information. Experimental results on six public datasets and 12 predictive targets show that, compared with SISSO, mutual information-based SISSO (i-SISSO) significantly reduces time consumption while keeping the model error close to that of the model generated by SISSO, demonstrating that the optimization strategy is effective. Hence, the time cost decreases to approximately 11000, and the predictive accuracy remains steady.
automation & control systems,computer science, artificial intelligence,engineering, electrical & electronic, multidisciplinary
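
The mRMR pre-screening idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the histogram-based mutual information estimator, the function names (discretize, mutual_information, mrmr_select), the bin count, and the "relevance minus mean redundancy" scoring rule are all assumptions chosen for illustration. The sketch greedily keeps descriptors that share a lot of information with the target but little with descriptors already kept, so a later exhaustive L0 search would have fewer combinations to enumerate.

```python
import math
from collections import Counter


def discretize(values, bins=4):
    """Map continuous values to equal-width bin indices 0..bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Histogram estimate of I(X;Y) in nats (an assumed, simple estimator)."""
    xb, yb = discretize(x, bins), discretize(y, bins)
    n = len(xb)
    px, py, pxy = Counter(xb), Counter(yb), Counter(zip(xb, yb))
    # Sum p(a,b) * log( p(a,b) / (p(a) p(b)) ) over observed cells only.
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def mrmr_select(features, target, k, bins=4):
    """Greedy mRMR over a dict {name: column}; returns k selected names."""
    relevance = {
        name: mutual_information(col, target, bins)
        for name, col in features.items()
    }
    selected = []
    remaining = sorted(features)  # sorted for deterministic tie-breaking
    while remaining and len(selected) < k:
        best_name, best_score = None, -math.inf
        for name in remaining:
            if selected:
                # Mean mutual information with already selected descriptors.
                redundancy = sum(
                    mutual_information(features[name], features[s], bins)
                    for s in selected
                ) / len(selected)
            else:
                redundancy = 0.0
            score = relevance[name] - redundancy
            if score > best_score:
                best_name, best_score = name, score
        selected.append(best_name)
        remaining.remove(best_name)
    return selected


# Toy data (hypothetical): f_b2 is a rescaled copy of f_b, f_a is independent of f_b.
a = [i % 4 for i in range(16)]
b = [i // 4 for i in range(16)]
features = {"f_a": a, "f_b": b, "f_b2": [2 * v for v in b]}
target = [2 * bv + av for av, bv in zip(a, b)]
print(mrmr_select(features, target, k=2))
```

On this toy data the redundant copy f_b2 is skipped in favor of the less relevant but non-redundant f_a. The payoff for the later search is combinatorial: for example, math.comb(1000, 3) is 166167000 candidate triples, while math.comb(100, 3) is 161700.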