Lifting Interpretability-Performance Trade-off via Automated Feature Engineering

Alicja Gosiewska, Przemyslaw Biecek
DOI: https://doi.org/10.48550/arXiv.2002.04267
2020-02-11
Abstract: Complex black-box predictive models may achieve high performance, but their lack of interpretability causes problems such as lack of trust, lack of stability, and sensitivity to concept drift. On the other hand, achieving satisfactory accuracy with interpretable models requires more time-consuming feature engineering. Can we train interpretable and accurate models without laborious feature engineering? We propose a method that uses elastic black-boxes as surrogate models to create simpler, less opaque, yet still accurate and interpretable glass-box models. The new models are trained on newly engineered features extracted with the help of a surrogate model. We support the analysis with a large-scale benchmark on several tabular data sets from the OpenML database. There are two results: 1) extracting information from complex models may improve the performance of linear models, and 2) the benchmark calls into question the common myth that complex machine learning models outperform linear models.
Machine Learning