Sampling for Approximate Reduct in very Large Datasets

Keyun Hu, Lili Diao, Yuchang Lu, Chunyi Shi
2000-01-01
Abstract: Rough set theory provides a formal framework for data mining. The reduct is the most important concept in applying rough sets to data mining. A reduct is a minimal attribute set that preserves the classification power of the original dataset. Finding a reduct is similar to the feature selection problem. In this paper, we propose two reduct algorithms. The first is based on attribute frequency in the discernibility matrix. The second uses a similar idea combined with sampling techniques for large datasets. Empirical analysis shows that both algorithms are efficient.
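To make the attribute-frequency idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' algorithm: the abstract does not give the procedure, so the greedy selection rule, the tie-breaking, and the names `discernibility_entries` and `greedy_reduct` are all hypothetical. Each entry of the discernibility matrix is taken to be the set of condition attributes on which two objects with different decisions differ. The sketch repeatedly picks the attribute that appears in the most entries not yet covered. The result discerns every such pair but is not guaranteed to be minimal.

```python
from collections import Counter
from itertools import combinations


def discernibility_entries(rows, decisions):
    """For each pair of objects with different decisions, collect the
    set of condition-attribute indices on which the two objects differ."""
    entries = []
    for (i, a), (j, b) in combinations(enumerate(rows), 2):
        if decisions[i] != decisions[j]:
            diff = frozenset(k for k in range(len(a)) if a[k] != b[k])
            if diff:  # skip inconsistent pairs that no attribute can separate
                entries.append(diff)
    return entries


def greedy_reduct(rows, decisions):
    """Assumed heuristic: pick the most frequent attribute among the
    uncovered entries until every entry contains a chosen attribute."""
    remaining = discernibility_entries(rows, decisions)
    chosen = []
    while remaining:
        freq = Counter(attr for entry in remaining for attr in entry)
        # Ties are broken by the lowest attribute index for determinism.
        best = min(freq, key=lambda attr: (-freq[attr], attr))
        chosen.append(best)
        remaining = [e for e in remaining if best not in e]
    return sorted(chosen)


# Toy decision table: three condition attributes, one decision per row.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
decisions = [0, 0, 1, 1]
print(greedy_reduct(rows, decisions))  # → [0]
```

In this toy table, attribute 0 alone separates every pair of objects with different decisions, so the sketch returns a single attribute. Attribute 2 would serve equally well; the choice between them comes only from the tie-breaking rule assumed here.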