Embedded feature selection for multi-label learning

Ge Lei, Li Guo-Zheng, You Ming-Yu
DOI: https://doi.org/10.3321/j.issn:0469-5097.2009.05.014
2009-01-01
Abstract: Dimensionality reduction is a key technique in data mining. Conventional dimensionality reduction methods are designed only for single-label learning. Although supervised dimensionality reduction has been studied widely, little work has been done on multi-label dimensionality reduction because of the complexity of multi-label learning. Applying existing unsupervised dimensionality reduction methods directly to multi-label tasks ignores the label information. Single-label dimensionality reduction methods can be applied to multi-label problems by decomposing them into single-label ones, but this approach does not consider the correlations between the different labels of each instance. The only existing method for multi-label dimensionality reduction is MDDM (multi-label dimensionality reduction via dependence maximization), which is a feature extraction method. Multi-label feature selection remains almost untouched because it is difficult to evaluate the goodness of features in multi-label learning. Therefore, an embedded feature selection method named MEFS (multi-label embedded feature selection) is proposed, which uses the prediction risk criterion as the evaluation metric for features. We perform two experiments on the Yahoo web page categorization data sets, which are widely used for benchmark evaluation: (1) analyzing the effectiveness of multi-label evaluation metrics for evaluating features; (2) comparing MEFS with MDDM, PCA (principal component analysis), and LPP (locality preserving projections). The experiments show that the performance of MEFS is superior to that of some state-of-the-art multi-label dimensionality reduction methods.
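To make the idea of an embedded, prediction-risk-driven selector more concrete, the following is a minimal sketch and not the authors' MEFS implementation. It assumes one common formulation of prediction risk (the increase in loss when a feature is replaced by its mean while the trained model is held fixed), uses Hamming loss as the multi-label metric, a per-label ridge regressor as a stand-in learner, and one-feature-at-a-time backward elimination. All function names and these choices are hypothetical; the abstract does not specify them.

```python
import numpy as np

def fit_ridge(X, Y, lam=1e-2):
    # Stand-in multi-label learner (assumption): one ridge regressor per label,
    # fitted jointly on the 0/1 label matrix Y, with a bias column appended.
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ Y)

def predict(W, X):
    # Threshold the regression outputs at 0.5 to obtain 0/1 label predictions.
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return (Xb @ W >= 0.5).astype(int)

def hamming_loss(Y_true, Y_pred):
    # Fraction of instance-label pairs that are predicted incorrectly.
    return float(np.mean(Y_true != Y_pred))

def prediction_risk(X, Y, W):
    # Assumed definition: risk_i = loss(feature i replaced by its mean) - loss(original),
    # evaluated with the already trained model W (no retraining per feature).
    base = hamming_loss(Y, predict(W, X))
    risks = np.empty(X.shape[1])
    for i in range(X.shape[1]):
        X_mean = X.copy()
        X_mean[:, i] = X[:, i].mean()
        risks[i] = hamming_loss(Y, predict(W, X_mean)) - base
    return risks

def backward_eliminate(X, Y, n_keep):
    # Retrain on the surviving features, then drop the one with the smallest risk,
    # until only n_keep features remain. Returns the kept column indices.
    kept = list(range(X.shape[1]))
    while len(kept) > n_keep:
        W = fit_ridge(X[:, kept], Y)
        risks = prediction_risk(X[:, kept], Y, W)
        kept.pop(int(np.argmin(risks)))
    return kept

# Synthetic example: three labels that depend only on features 0 and 1;
# features 2..5 are pure noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 6))
Y_demo = np.column_stack([
    X_demo[:, 0] > 0,
    X_demo[:, 1] > 0,
    X_demo[:, 0] + X_demo[:, 1] > 0,
]).astype(int)
print(backward_eliminate(X_demo, Y_demo, n_keep=2))
```

Because the risk is computed from a multi-label loss over all labels at once, a feature is scored by its joint contribution rather than label by label, which is the property that decomposition into single-label problems lacks. Any other multi-label learner or evaluation metric could be substituted for the stand-ins used here.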