Domain-knowledge-oriented data pre-processing and machine learning of corrosion-resistant γ-U alloys with a small database

Junhao Yuan, Qing Wang, Zhen Li, Chuang Dong, Pengcheng Zhang, Xiangdong Ding
DOI: https://doi.org/10.1016/j.commatsci.2021.110472
IF: 3.572
2021-06-01
Computational Materials Science
Abstract:<p>This work proposes a characteristic-parameter-embedded machine learning (ML) model to predict and design body-centered-cubic (BCC) γ-U alloys in U-Mo-Nb-Ti-Zr systems with long corrosion-resistant lifetimes in 343 °C boiling water. Two characteristic parameters, the cluster formula approach and the Mo equivalence (<em>Mo<sub>eq</sub></em>), were embedded in the ML model to improve prediction accuracy: the former reflects the interactions among elements and determines their added amounts, and the latter represents the BCC-γ structural stability. Before ML, the samples in the current small-sample database of U alloys were pre-processed under the guidance of domain knowledge, through data screening and data weighting. Auxiliary gradient-boosting regression tree (XGBR) and genetic algorithm (GA) methods were both adopted to handle the optimization problem during ML. The optimal compositions predicted by the ML model with the screened &amp; weighted database in existing alloy systems agree well with the experimental results. In particular, a new quinary U-7.17Mo-0.96Nb-0.31Ti-0.28Zr (wt. %) alloy with a maximum corrosion-resistant lifetime of <em>D</em> = 190.4 days was obtained. Without the cluster-formula constraint on compositions, the ML model would return 158 alloys when setting <em>D</em> ≥ 182 days, which would complicate experimental verification. This cluster-formula-embedded ML model with domain-knowledge-oriented data pre-processing can optimize alloy compositions with desired properties in multi-component systems efficiently and precisely.</p>
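The abstract reports the new alloy in weight percent, while a cluster formula describes a composition through atom counts. As a minimal sketch, the conversion of the reported U-7.17Mo-0.96Nb-0.31Ti-0.28Zr (wt. %) composition to atomic percent could look like the following. The helper name `wt_to_at_percent` is hypothetical, the atomic masses are rounded standard values, and treating U as the balance element is an assumption based on the usual alloy naming convention; none of this is taken from the paper's own code.

```python
# Rounded standard atomic masses in g/mol (assumed values, not from the paper).
ATOMIC_MASS = {"U": 238.029, "Mo": 95.95, "Nb": 92.906, "Ti": 47.867, "Zr": 91.224}


def wt_to_at_percent(solutes_wt, base="U"):
    """Convert solute contents in wt.% to at.% for all elements.

    The base element is assumed to make up the balance to 100 wt.%.
    """
    wt = dict(solutes_wt)
    wt[base] = 100.0 - sum(solutes_wt.values())
    # Moles of each element per 100 g of alloy.
    moles = {el: w / ATOMIC_MASS[el] for el, w in wt.items()}
    total = sum(moles.values())
    return {el: 100.0 * n / total for el, n in moles.items()}


# Composition of the quinary alloy quoted in the abstract (wt.%).
alloy_wt = {"Mo": 7.17, "Nb": 0.96, "Ti": 0.31, "Zr": 0.28}
alloy_at = wt_to_at_percent(alloy_wt)

for element, fraction in alloy_at.items():
    print(f"{element}: {fraction:.2f} at.%")
```

Because Mo, Nb, Ti and Zr are much lighter than U, their atomic fractions come out noticeably larger than their weight fractions.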
materials science, multidisciplinary