Abstract: We study the problem of balancing effectiveness and efficiency in automated feature selection. Feature selection aims to find an optimal feature subset in a large feature space. After exploring many feature selection methods, we observe a computational dilemma: 1) traditional feature selection (e.g., mRMR) is mostly efficient, but has difficulty identifying the best subset; 2) the emerging reinforced feature selection automatically navigates the feature space to search for the best subset, but is usually inefficient. Are automation and efficiency always at odds? Can we bridge the gap between effectiveness and efficiency under automation? Motivated by this dilemma, we aim to develop a novel feature space navigation method. In our preliminary work, we leveraged interactive reinforcement learning to accelerate feature selection through external trainer-agent interaction. That work can be significantly improved by modeling the structured knowledge of its downstream task (e.g., decision tree) as learning feedback. In this journal version, we propose a novel interactive and closed-loop architecture to simultaneously model interactive reinforcement learning (IRL) and decision tree feedback (DTF). Specifically, IRL creates an interactive feature selection loop, and DTF feeds structured feature knowledge back into the loop. DTF improves IRL in two respects. First, the tree-structured feature hierarchy generated by the decision tree is leveraged to improve state representation. In particular, we represent the selected feature subset as an undirected graph of feature-feature correlations and a directed tree of decision features. We propose a new embedding method that enables a Graph Convolutional Network (GCN) to jointly learn the state representation from both the graph and the tree. Second, the tree-structured feature hierarchy is exploited to develop a new reward scheme. In particular, we personalize the reward assignment of agents based on decision tree feature importance. In addition, observing that agents’ actions can also serve as feedback, we devise another new reward scheme that weighs and assigns reward based on the selection frequency ratio of each agent in historical action records. Finally, we present extensive experiments on real-world datasets to demonstrate the improved performance of our method.
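
The two reward schemes named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the proportional splitting rules below are assumptions, and the function names `importance_weighted_rewards` and `frequency_weighted_rewards`, the dictionary-based inputs, and the example feature names are all hypothetical. Each feature is assumed to be controlled by one agent that decides whether to select it.

```python
from typing import Dict, List


def importance_weighted_rewards(
    total_reward: float,
    selected: Dict[str, bool],
    importances: Dict[str, float],
) -> Dict[str, float]:
    """Split a shared reward among agents that selected their feature,
    in proportion to decision tree feature importance (assumed rule)."""
    rewards = {feature: 0.0 for feature in selected}
    active = [feature for feature, chosen in selected.items() if chosen]
    if not active:
        return rewards
    mass = sum(importances.get(feature, 0.0) for feature in active)
    for feature in active:
        if mass > 0:
            rewards[feature] = total_reward * importances.get(feature, 0.0) / mass
        else:
            # No importance information: fall back to an equal split.
            rewards[feature] = total_reward / len(active)
    return rewards


def frequency_weighted_rewards(
    total_reward: float,
    action_history: List[Dict[str, bool]],
) -> Dict[str, float]:
    """Weight the reward by how often each agent selected its feature
    in the historical action records (assumed rule)."""
    counts: Dict[str, int] = {}
    for step in action_history:
        for feature, chosen in step.items():
            counts[feature] = counts.get(feature, 0) + int(chosen)
    total = sum(counts.values())
    if total == 0:
        return {feature: 0.0 for feature in counts}
    return {feature: total_reward * c / total for feature, c in counts.items()}


# Hypothetical usage: three feature agents, two of which selected their feature.
selected_now = {"age": True, "income": True, "zip": False}
tree_importances = {"age": 0.6, "income": 0.2, "zip": 0.2}
by_importance = importance_weighted_rewards(1.0, selected_now, tree_importances)

history = [
    {"age": True, "income": False},
    {"age": True, "income": True},
    {"age": True, "income": False},
]
by_frequency = frequency_weighted_rewards(1.0, history)
```

In practice the importance values could come from a fitted decision tree (for example, the `feature_importances_` attribute of a scikit-learn tree), but that pairing is an assumption rather than something the abstract specifies.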

Interactive Reinforcement Learning for Feature Selection with Decision Tree in the Loop