Structured Data Encoder for Neural Networks Based on Gradient Boosting Decision Tree.

Wenhui Hu, Xueyang Liu, Yu Huang, Yu Wang, Minghui Zhang, Hui Zhao
DOI: https://doi.org/10.1007/978-3-030-60239-0_41
2020-01-01
Abstract: Features are central to machine learning tasks, so feature engineering has been widely adopted to obtain effective handcrafted features; it is, however, labor-intensive and requires expert knowledge. Feature learning with neural networks has therefore been used to obviate manual feature engineering, and it has achieved great success in image and sequential data processing. Its performance on structured data, however, is usually unsatisfactory. To tackle this problem and learn good feature representations for structured data, we propose a structured data encoder (SDE) based on Gradient Boosting Decision Tree (GBDT) that learns feature representations from structured data both effectively and efficiently. PCA is then employed to extract the most useful information and to reduce the dimensionality for the subsequent classification or regression tasks. Extensive experiments have been conducted to show the superior performance of the proposed SDE in learning representations of structured data.
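The two-stage pipeline described in the abstract (a GBDT-based encoder followed by PCA) might be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the GBDT output becomes a feature vector, so the sketch assumes the common approach of one-hot encoding the leaf index each sample reaches in every tree. The function name `encode_with_gbdt`, the hyperparameters, the synthetic data, and the use of scikit-learn and NumPy are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.preprocessing import OneHotEncoder


def encode_with_gbdt(X_train, y_train, X, n_trees=20, n_components=8, seed=0):
    """Hypothetical helper: GBDT leaf indices -> one-hot -> PCA."""
    # Stage 1: fit a GBDT on the labelled structured data.
    gbdt = GradientBoostingClassifier(
        n_estimators=n_trees, max_depth=3, random_state=seed
    )
    gbdt.fit(X_train, y_train)

    # apply() gives the leaf index reached in every tree; flatten to
    # one column per tree (binary classification has one tree per stage).
    leaves_train = gbdt.apply(X_train).reshape(len(X_train), -1)
    leaves = gbdt.apply(X).reshape(len(X), -1)

    # Assumption: one-hot encode the leaf indices to get a sparse code.
    onehot = OneHotEncoder(handle_unknown="ignore")
    onehot.fit(leaves_train)

    # Stage 2: PCA compresses the one-hot code into a dense representation.
    pca = PCA(n_components=n_components, random_state=seed)
    pca.fit(onehot.transform(leaves_train).toarray())
    return pca.transform(onehot.transform(leaves).toarray())


# Synthetic tabular data stands in for a real structured dataset.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
Z = encode_with_gbdt(X[:200], y[:200], X[200:])
print(Z.shape)
```

The resulting matrix `Z` has one row per input sample and `n_components` columns, and could be passed to a downstream classifier or regressor, such as a neural network.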