Quality of Information-Based Source Assessment and Selection.

Yaojin Lin, Xuegang Hu, Xindong Wu
DOI: https://doi.org/10.1016/j.neucom.2013.11.027
IF: 6
2014-01-01
Neurocomputing
Abstract: Multiple information sources for the same set of objects can provide different representations, and combining their advantages may improve predictive power for a given task. However, some sources may be irrelevant or redundant, so it is worthwhile to select a set of good information sources that improves learning performance; very little work has been reported on this topic. In this paper, we first identify two aspects of quality of information: source significance and source redundancy. Significance represents the degree to which an information source contributes to classification, and redundancy reflects the information overlap among different information sources. We then propose a metric that combines neighborhood mutual information with a Max-Significance–Min-Redundancy algorithm, allowing us to select a compact set of superior information sources for classification learning. Extensive experiments show that the metric is very helpful in finding good information sources, and that the proposed method outperforms many other methods.
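The abstract does not give the scoring formula, so the following is only a minimal sketch of what a greedy max-significance, min-redundancy selection could look like. It assumes the significance of each source and the pairwise redundancy between sources have already been computed (for example, from neighborhood mutual information), and it borrows the "significance minus mean redundancy" criterion from mRMR-style feature selection. The function name `select_sources`, its parameters, and the example numbers are hypothetical and do not come from the paper.

```python
def select_sources(significance, redundancy, k):
    """Greedily pick k source indices.

    significance: list of floats, one score per source (assumed precomputed).
    redundancy:   square list of lists; redundancy[i][j] is the overlap
                  between sources i and j (assumed precomputed).
    Criterion (an assumption, borrowed from mRMR): significance of the
    candidate minus its mean redundancy with already-selected sources.
    """
    n = len(significance)
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and the number of sources")

    selected = []
    remaining = set(range(n))

    def score(i):
        # The first pick is driven by significance alone.
        if not selected:
            return significance[i]
        mean_red = sum(redundancy[i][j] for j in selected) / len(selected)
        return significance[i] - mean_red

    while remaining and len(selected) < k:
        # Iterate in index order so ties resolve to the lowest index.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Illustrative, made-up scores for three sources.
significance = [0.9, 0.85, 0.4]
redundancy = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
chosen = select_sources(significance, redundancy, k=2)
print(chosen)
```

In this made-up example, source 1 is nearly as significant as source 0 but overlaps heavily with it, so the sketch prefers the less significant but complementary source 2 as the second pick.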