How “Small” Reflects “Large”?—Representative Information Measurement and Extraction

Guoqing Chen, Cong Wang, Mingyue Zhang, Qiang Wei, Baojun Ma
DOI: https://doi.org/10.1016/j.ins.2017.08.096
IF: 8.1
2017-01-01
Information Sciences
Abstract: As web services drive rapid growth in the volume of available data, identifying helpful information is of great value, especially when users are faced with an unwanted glut of information. It is therefore relevant and meaningful to provide users with a representative subset (i.e., a small set) that reflects the original information corpus (i.e., the large set) well. In this large–small context, this paper addresses representativeness in terms of measurement and extraction by reviewing our previous efforts. Specifically, we first discuss various metrics from different perspectives of representativeness, then present a series of related representativeness extraction methods. Finally, as a supplement and extension, a recent effort is introduced that takes information quality into account in deriving a ranked subset. The proposed extraction method is validated by extensive experiments on real-world data, which show that it outperforms other methods in both effectiveness and efficiency.