Study on Entire-Granulation Rough Sets and Concept Drifting in a Knowledge System
Da-Yong DENG, Ke-Wen LU, Duo-Qian MIAO, Hou-Kuan HUANG
DOI: https://doi.org/10.11897/SP.J.1016.2019.00085
2019-01-01
Chinese Journal of Computers
Abstract: Concept drifting detection is one of the hot topics in data stream mining, and the analysis of uncertainty is central to rough set theory. Both changes in uncertainty and concept drifting occur in big data and data streams. However, except for F-rough sets, almost all rough set models are static or semi-dynamic models that study vagueness and uncertainty; they have difficulty handling changes in uncertainty and detecting concept drifting. Combining ideas from quantum computing, data streams, concept drifting, rough sets and F-rough sets, this paper presents a rough set model for entire granulations (called entire-granulation rough sets) and defines a number of concepts, such as concept drifting of the upper approximation, concept drifting of the lower approximation, coupling of the upper approximation and coupling of the lower approximation. The properties of entire-granulation rough sets are investigated, and the change of uncertainty for a concept in a knowledge system is analyzed with these definitions. Entire-granulation rough sets inherit the basic ideas of Pawlak rough sets and F-rough sets, and describe all of the changes of uncertainty for a concept with a family of upper approximations and lower approximations. An embedded Hasse diagram is employed to express the identity and diversity of a concept in different cases: there is no concept drifting among concept expressions at the same level, but there is concept drifting among concept expressions at different levels. Based on the positive region, the positive region for entire granulations is defined, and concept drifting and concept coupling are defined in a decision system. The properties of the entire-granulation positive region are discussed, and the change of concept uncertainty is analyzed and measured. The entire-granulation positive region expresses all of the positive regions in the various cases in a decision system. An embedded Hasse diagram is also employed to express the identity and
diversity of the family of positive regions: there is no concept drifting relative to the positive region among concepts at the same level, but there is concept drifting relative to the positive region among concepts at different levels. In entire-granulation rough sets, entire-granulation absolute reducts, entire-granulation value reducts and entire-granulation Pawlak reducts are defined, and their properties are investigated. Unlike most types of attribute reducts (such as parallel reducts and multi-granulation conditional attribute reducts), entire-granulation conditional attribute reducts require that no concept drifting occur in any concept expression. The advantages and drawbacks of conditional attribute reduction are further investigated: conditional attribute reduction makes concept expressions unique, while redundant conditional attributes make concept expressions more diverse. From the viewpoint of epistemology, the wholeness and locality of human thinking are further analyzed with granular computing and rough sets. To some extent, entire-granulation rough sets can express the complexity, uncertainty, diversity, hierarchy and dynamics of human cognition. With the help of quantum computing, the model of entire-granulation rough sets can transform one type of granulation into another fluently. The study of entire-granulation rough sets and of concept drifting detection within them can provide heuristic information for various kinds of concept drifting detection and for the simulation of human intelligence.
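
Illustrative sketch (not taken from the paper): the abstract speaks of describing a concept by a family of lower and upper approximations. The Python below computes the standard Pawlak lower and upper approximations of a concept under every non-empty subset of attributes in a toy information table. The function names, the toy table, and the reading of "entire granulation" as "all non-empty attribute subsets" are assumptions made only for illustration; the paper's formal definitions of concept drifting and coupling are not reproduced here.

```python
from itertools import combinations


def partition(universe, table, attrs):
    """Group objects into equivalence classes by their values on attrs."""
    blocks = {}
    for x in universe:
        key = tuple(table[x][a] for a in attrs)
        blocks.setdefault(key, set()).add(x)
    return list(blocks.values())


def approximations(universe, table, attrs, concept):
    """Return the Pawlak (lower, upper) approximations of concept under attrs."""
    lower, upper = set(), set()
    for block in partition(universe, table, attrs):
        if block <= concept:   # block lies entirely inside the concept
            lower |= block
        if block & concept:    # block overlaps the concept
            upper |= block
    return lower, upper


def approximation_family(universe, table, all_attrs, concept):
    """Hypothetical helper: approximations under every non-empty attribute subset."""
    family = {}
    for r in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, r):
            family[attrs] = approximations(universe, table, attrs, concept)
    return family


# Toy information table (invented for illustration): object -> attribute values.
table = {
    1: {"a": 0, "b": 0},
    2: {"a": 0, "b": 1},
    3: {"a": 1, "b": 0},
    4: {"a": 1, "b": 1},
}
universe = set(table)
concept = {1, 2, 3}

family = approximation_family(universe, table, ("a", "b"), concept)
for attrs, (lower, upper) in family.items():
    print(attrs, "lower =", sorted(lower), "upper =", sorted(upper))
```

In this toy table the lower approximation is {1, 2} under attribute a alone, {1, 3} under b alone, and {1, 2, 3} under both, so the description of the same concept changes with the granulation. Comparing members of such a family is the kind of variation the abstract associates with concept drifting across levels, though the precise criterion is the paper's own.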