Feature Engineering and Selection: A Practical Approach for Predictive Models

Brandon Butcher, Brian J. Smith
DOI: https://doi.org/10.1080/00031305.2020.1790217
2020-07-02
The American Statistician
Abstract: In recent years, the terms machine learning, data science, and predictive modeling have become ubiquitous in nearly every discipline in which data analysis plays a central role. When first diving into the machine learning world, one will invariably encounter the terms <i>feature engineering</i> and <i>feature selection</i>. <i>Feature Engineering and Selection: A Practical Approach for Predictive Models</i> (FES) endeavors to provide a much-needed resource for elucidating these enigmatic terms. In the preface, Kuhn and Johnson express their frustration that models can have unsatisfactory predictive performance even when one follows the predictive modeling best practices detailed in their previous work, <i>Applied Predictive Modeling</i> (APM; Kuhn and Johnson 2013). Often, unsatisfactory predictive performance of a model has a simple but difficult-to-remedy explanation: predictors, or independent variables, are represented in the dataset in a manner that makes it difficult for models to separate the signal from the noise.