Knowledge Reduction and Discovery based on Demarcation Information
Yuguo He
DOI: https://doi.org/10.48550/arXiv.cs/0405104
2004-05-27
Abstract: Knowledge reduction, which includes attribute reduction and value reduction, is an important topic in the rough set literature. It is also closely related to other fields, such as machine learning and data mining. In this paper, an algorithm called TWI-SQUEEZE is proposed. It can find a reduct, that is, an irreducible attribute subset, after two scans of the table. Its soundness and computational complexity are given, which show that it is the fastest such algorithm to date. A measure of variety is put forward, of which algorithm TWI-SQUEEZE can be regarded as an application. The author also argues for the validity of this measure as a measure of information, which can make it a unified measure for "differentiation", a concept that appears in the cognitive psychology literature. Value reduction is another important aspect of knowledge reduction. Interestingly, the same algorithm can be used to perform a complete value reduction efficiently. The complete knowledge reduction, which results in an irreducible table, can therefore be accomplished after four scans of the table. The byproducts of reduction are two classifiers of different styles. In this paper, various cases and models are discussed to demonstrate the efficiency and effectiveness of the algorithm. Other topics, such as how to integrate user preferences to find a locally optimal attribute subset, are also discussed.
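To make the notion of a reduct concrete, the following is a minimal sketch of attribute reduction on a small decision table. It is not TWI-SQUEEZE, whose details are not given in the abstract. It is a naive backward-elimination baseline that rescans the table once per attribute, and it assumes a consistent table (no two rows agree on all condition attributes but differ in the decision). The names `is_consistent`, `naive_reduct`, and the toy `table` are hypothetical and chosen only for illustration.

```python
from typing import Hashable, List, Sequence

Row = Sequence[Hashable]


def is_consistent(rows: Sequence[Row], attrs: Sequence[int], decision: int) -> bool:
    """True if rows that agree on `attrs` always agree on the decision column."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in attrs)
        # Record the first decision seen for this key; any later mismatch is a conflict.
        if seen.setdefault(key, row[decision]) != row[decision]:
            return False
    return True


def naive_reduct(rows: Sequence[Row], attrs: Sequence[int], decision: int) -> List[int]:
    """Illustrative baseline (assumption: NOT the paper's TWI-SQUEEZE).

    Drops each attribute in turn if the table stays consistent without it.
    Because consistency is monotone (any superset of a consistent attribute
    set is consistent), one pass yields an irreducible subset.
    """
    if not is_consistent(rows, attrs, decision):
        raise ValueError("this sketch assumes a consistent decision table")
    kept = list(attrs)
    for a in list(attrs):
        trial = [b for b in kept if b != a]
        if is_consistent(rows, trial, decision):
            kept = trial
    return kept


# Toy decision table: columns 0-2 are condition attributes, column 3 is the decision.
table = [
    (0, 0, 0, 0),
    (0, 1, 0, 1),
    (1, 0, 1, 0),
    (1, 1, 1, 1),
]

reduct = naive_reduct(table, [0, 1, 2], 3)
print(reduct)  # [1]
```

In this toy table the decision is fully determined by attribute 1, so the other two attributes are redundant. The baseline scans the table once per candidate attribute, which is the kind of cost that an algorithm needing only two scans would avoid.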
Machine Learning, Databases, Information Theory