Research on Discernibility Matrix Knowledge Reduction Algorithm in Cloud Computing

QIAN Jin, MIAO Duo-qian, ZHANG Ze-hua
DOI: https://doi.org/10.3969/j.issn.1002-137x.2011.08.045
2011-01-01
Computer Science
Abstract: Knowledge reduction is one of the important research issues in rough set theory. Classical knowledge reduction algorithms can only deal with small datasets, while existing parallel knowledge reduction algorithms assume that the entire dataset can be loaded into main memory and only run reduction tasks concurrently, which makes them infeasible for large-scale data. Massive, high-dimensional data makes attribute reduction a challenging task. To address this problem, the characteristics of discernibility matrix cells were analyzed, and a discernibility matrix suited to data-parallel computation was designed based on the indiscernibility of attributes and the MapReduce programming model. On this basis, a discernibility matrix knowledge reduction algorithm for large-scale data in cloud computing was proposed. The experimental results demonstrate that the proposed algorithm scales well and processes large-scale datasets efficiently on commodity computers.
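The abstract does not spell out the algorithm, so the following is only a minimal, single-machine sketch of the general idea, not the authors' method. It assumes a decision table whose rows are tuples of condition attribute values followed by a decision value. The function names (`map_phase`, `reduce_phase`, `discernibility_cells`, `greedy_reduct`) are hypothetical, and the greedy attribute selection is a standard covering heuristic swapped in for illustration. The map and reduce steps collapse indiscernible objects into equivalence classes, so that discernibility cells are computed between classes rather than between every pair of objects.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(rows):
    """Emit one (condition_values, decision) pair per object."""
    for *cond, dec in rows:
        yield tuple(cond), dec


def reduce_phase(pairs):
    """Group indiscernible objects; each class keeps its set of decisions."""
    classes = defaultdict(set)
    for cond, dec in pairs:
        classes[cond].add(dec)
    return classes


def discernibility_cells(classes):
    """Attribute sets that distinguish classes with different decisions."""
    cells = set()
    for (c1, d1), (c2, d2) in combinations(classes.items(), 2):
        if d1 != d2:
            # Indices of the condition attributes on which the classes differ.
            cells.add(frozenset(i for i, (a, b) in enumerate(zip(c1, c2)) if a != b))
    return cells


def greedy_reduct(cells):
    """Illustrative heuristic: repeatedly pick the attribute covering most cells."""
    remaining = set(cells)
    reduct = []
    while remaining:
        counts = defaultdict(int)
        for cell in remaining:
            for attr in cell:
                counts[attr] += 1
        # Ties are broken in favour of the smallest attribute index.
        best = max(sorted(counts), key=counts.get)
        reduct.append(best)
        remaining = {c for c in remaining if best not in c}
    return sorted(reduct)


# Toy decision table: three condition attributes, then the decision.
rows = [
    (0, 0, 1, "no"),
    (0, 1, 1, "yes"),
    (1, 0, 0, "no"),
    (1, 1, 0, "yes"),
    (0, 0, 1, "no"),  # duplicate object, merged by the reduce step
]
cells = discernibility_cells(reduce_phase(map_phase(rows)))
reduct = greedy_reduct(cells)  # attribute 1 alone separates the decisions
```

In a real MapReduce deployment the grouping step would be performed by the framework's shuffle on the emitted key, and the greedy result is an attribute subset that preserves discernibility, not necessarily a minimal reduct.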