Optimal Feature Subset with Positive Region Constraints

Jun-Xia Mu, Hong Zhao, William Zhu
DOI: https://doi.org/10.1109/icmlc.2015.7340901
2015-01-01
Abstract: Cost-sensitive feature selection is an active research topic in data mining and machine learning. In many applications, the test cost of collecting features must be taken into account, which makes optimal test-cost feature selection with positive region constraints an important problem in cost-sensitive learning. A λ-weighted information gain algorithm has been proposed for this problem. However, that algorithm is computationally expensive and, in most cases, does not produce an optimal solution. To overcome these shortcomings, we design a β-weighted heuristic algorithm for the same problem. Two aspects of the β-weighted heuristic are addressed. First, it takes advantage of both the test cost information and the information gain. Second, it depends on a positive number β, which is the only parameter selected by the user; the weights are determined by the test costs and β. The proposed algorithm is compared with the λ-weighted algorithm on six UCI datasets under a representative test cost distribution. Experimental results show that the proposed algorithm is more effective and more efficient.
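The abstract does not state the weighting formula, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes a greedy search that scores each candidate feature by its information gain multiplied by cost^(-β), and it stops once the selected subset has the same positive region as the full feature set. The function names `positive_region`, `conditional_entropy`, and `select_features`, the cost^(-β) weight, and the small tie-breaking constant are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def positive_region(rows, decisions, features):
    """Row indices whose equivalence class under `features` has a single decision."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[f] for f in features)].append(i)
    pos = set()
    for members in blocks.values():
        if len({decisions[i] for i in members}) == 1:
            pos.update(members)
    return pos


def conditional_entropy(rows, decisions, features):
    """H(decision | features) in bits."""
    blocks = defaultdict(list)
    for row, d in zip(rows, decisions):
        blocks[tuple(row[f] for f in features)].append(d)
    n = len(rows)
    h = 0.0
    for ds in blocks.values():
        for count in Counter(ds).values():
            h -= (count / n) * math.log2(count / len(ds))
    return h


def select_features(rows, decisions, costs, beta):
    """Greedy selection under the positive region constraint.

    Assumed score: (information gain) * cost ** (-beta). Costs must be positive.
    """
    all_features = list(range(len(costs)))
    target = positive_region(rows, decisions, all_features)
    selected = []
    while positive_region(rows, decisions, selected) != target:
        base = conditional_entropy(rows, decisions, selected)
        remaining = [f for f in all_features if f not in selected]

        def score(f):
            gain = base - conditional_entropy(rows, decisions, selected + [f])
            # The small constant lets zero-gain features still be ranked by cost.
            return (gain + 1e-9) * costs[f] ** (-beta)

        selected.append(max(remaining, key=score))
    return selected


# Toy decision table: features 0 and 1 both determine the decision,
# but feature 0 is ten times more expensive; feature 2 is noise.
rows = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
decisions = [0, 0, 1, 1]
costs = [10, 1, 1]
cheap_subset = select_features(rows, decisions, costs, beta=1)
```

Under this assumed weighting, β = 0 ignores test costs entirely, while larger β values penalize expensive features more strongly. In the toy table, β = 1 steers the search toward the cheaper of the two equally informative features.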