Evolutionary Automated Feature Engineering

Guanghui Zhu, Shen Jiang, Xu Guo, Chunfeng Yuan, Yihua Huang
DOI: https://doi.org/10.1007/978-3-031-20862-1_42
2022-01-01
Abstract: Effective feature engineering is a prerequisite for many machine learning tasks. Feature engineering, which usually applies a series of mathematical functions to transform the features, aims to find valuable new features that reveal insights into the data. Traditional feature engineering is labor-intensive and time-consuming: it depends on expert domain knowledge and requires iterative trial and error. In recent years, many automated feature engineering (AutoFE) methods have been proposed. These methods automatically transform the original features into a set of new features to improve the performance of the machine learning model. However, existing methods either suffer from computational bottlenecks or do not support high-order transformations and various feature types. In this paper, we propose EAAFE, which is, to the best of our knowledge, the first evolutionary algorithm-based automated feature engineering method. We first formalize the AutoFE problem as a search for the optimal feature transformation sequence. Then, we leverage roulette wheel selection, subsequence-exchange-based DNA crossover, and ε-greedy-based DNA mutation to achieve evolution. Despite its simplicity, EAAFE is flexible and effective: it supports feature transformations for both numerical and categorical features, as well as high-order feature transformations. Extensive experimental results on public datasets demonstrate that EAAFE outperforms existing AutoFE methods in both effectiveness and efficiency.
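To make the three evolutionary operators named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. Everything in it is an assumption made for illustration: the operator set `OPS`, the per-operator scoring used for the exploit step of mutation, the function names, and the toy fitness, which stands in for the downstream model evaluation an AutoFE system would actually use.

```python
import math
import random

# Hypothetical unary transformations; the paper's actual operator set is not given here.
OPS = {
    "identity": lambda x: x,
    "square": lambda x: x * x,
    "sqrt_abs": lambda x: math.sqrt(abs(x)),
    "log1p_abs": lambda x: math.log1p(abs(x)),
}
OP_NAMES = list(OPS)


def apply_sequence(seq, value):
    """Apply a transformation sequence left to right (a high-order transformation)."""
    for name in seq:
        value = OPS[name](value)
    return value


def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its shifted fitness."""
    low = min(fitnesses)
    weights = [f - low + 1e-9 for f in fitnesses]  # shift so every weight is positive
    return rng.choices(population, weights=weights, k=1)[0]


def crossover(a, b, rng):
    """Exchange one contiguous subsequence between two equal-length sequences."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]


def mutate(seq, op_scores, epsilon, rng):
    """Epsilon-greedy mutation of one position: random operator with
    probability epsilon, otherwise the operator with the highest score."""
    pos = rng.randrange(len(seq))
    if rng.random() < epsilon:
        new_op = rng.choice(OP_NAMES)
    else:
        new_op = max(OP_NAMES, key=lambda name: op_scores[name])
    return seq[:pos] + [new_op] + seq[pos + 1:]


def evolve(fitness, seq_len=3, pop_size=20, generations=30, epsilon=0.2, seed=0):
    """Search for a high-fitness transformation sequence of fixed length."""
    rng = random.Random(seed)
    population = [[rng.choice(OP_NAMES) for _ in range(seq_len)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        fitnesses = [fitness(ind) for ind in population]
        # Assumed heuristic: score an operator by the mean fitness of the
        # individuals that currently contain it.
        op_scores = {}
        for name in OP_NAMES:
            vals = [f for ind, f in zip(population, fitnesses) if name in ind]
            op_scores[name] = sum(vals) / len(vals) if vals else float("-inf")
        children = []
        while len(children) < pop_size:
            p1 = roulette_select(population, fitnesses, rng)
            p2 = roulette_select(population, fitnesses, rng)
            c1, c2 = crossover(p1, p2, rng)
            children.append(mutate(c1, op_scores, epsilon, rng))
            children.append(mutate(c2, op_scores, epsilon, rng))
        population = children[:pop_size]
        generation_best = max(population, key=fitness)
        if fitness(generation_best) > fitness(best):
            best = generation_best
    return best


def toy_fitness(seq):
    """Toy stand-in for model evaluation: negative squared error against x**4."""
    xs = [1.0, 2.0, 3.0, 4.0]
    return -sum((apply_sequence(seq, x) - x ** 4) ** 2 for x in xs)


if __name__ == "__main__":
    print(evolve(toy_fitness))
```

In this sketch each individual is a fixed-length list of operator names for a single feature. A fuller system would keep one such sequence per original feature and would score candidates by cross-validating a downstream model on the transformed data, which is the expensive step that the choice of search strategy is meant to economize.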