SDTR: Soft Decision Tree Regressor for Tabular Data

Haoran Luo, Fan Cheng, Heng Yu, Yuqi Yi
DOI: https://doi.org/10.1109/access.2021.3070575
IF: 3.9
2021-01-01
IEEE Access
Abstract: Deep neural networks have proven successful in many fields. However, when processing heterogeneous tabular data, researchers still favor traditional approaches such as Bayesian methods and decision trees in order to obtain more interpretable models. Such models are hard to differentiate and are therefore inconvenient to integrate into end-to-end settings. On the other hand, traditional neural networks are differentiable but perform poorly on tabular data. We propose a hierarchical differentiable neural regression model, the Soft Decision Tree Regressor (SDTR). SDTR imitates a binary decision tree with a differentiable neural network and is suitable for ensemble schemes such as bagging and boosting. The SDTR method was evaluated on multiple tabular regression tasks (YearPredictionMSD, MSLR-Web10K, Yahoo LETOR, SARCOS and Wine quality). Its performance is comparable to that of non-differentiable models (gradient boosting decision trees) and better than that of uninterpretable models (regular fully connected neural networks, FCNN). In addition, it produces reasonable results with a restricted number of parameters, using only a small forest or even a single tree. We also propose an “average entropy” metric to evaluate the level of interpretability of a trained soft decision tree neural network. This metric also helps to select a proper structure and hyperparameters for such networks.
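The idea of imitating a binary decision tree with a differentiable network can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the sigmoid gates, the heap-style node layout, the scalar leaf values, and the names `soft_tree_predict` and `mean_gate_entropy` are all assumptions made for illustration. In particular, the abstract does not define the "average entropy" metric, so `mean_gate_entropy` is only a stand-in that averages the binary entropy of the routing probabilities.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def soft_tree_predict(X, W, b, leaf_values):
    """Forward pass of a complete soft binary tree (illustrative layout).

    X: (n, d) inputs.
    W: (n_inner, d), b: (n_inner,) gate parameters, one gate per inner node.
    leaf_values: (n_inner + 1,) scalar regression output per leaf.
    Nodes are stored in heap order: node i has children 2i+1 and 2i+2.
    """
    n_inner = W.shape[0]
    # Probability of routing to the right child at each inner node.
    gates = sigmoid(X @ W.T + b)
    # reach[:, k] is the probability that a sample arrives at node k.
    reach = np.ones((X.shape[0], 2 * n_inner + 1))
    for i in range(n_inner):
        reach[:, 2 * i + 1] = reach[:, i] * (1.0 - gates[:, i])
        reach[:, 2 * i + 2] = reach[:, i] * gates[:, i]
    leaf_prob = reach[:, n_inner:]
    # Prediction is the probability-weighted average of the leaf values.
    return leaf_prob @ leaf_values, gates, leaf_prob


def mean_gate_entropy(gates, eps=1e-12):
    """Assumed stand-in metric: mean binary entropy (bits) of the gates.

    Values near 0 mean almost-hard routing; 1 means a coin flip at every node.
    """
    p = np.clip(gates, eps, 1.0 - eps)
    h = -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))
    return float(h.mean())
```

Because every operation here is smooth in `W`, `b`, and `leaf_values`, the same forward pass written in an automatic differentiation framework could be trained by gradient descent. When the gates saturate toward 0 or 1, each sample follows essentially one root-to-leaf path, which is the regime in which such a tree reads like an ordinary decision tree.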