General Assembly Framework for Online Streaming Feature Selection Via Rough Set Models

Peng Zhou, Yunyun Zhang, Peipei Li, Xindong Wu
DOI: https://doi.org/10.1016/j.eswa.2022.117520
IF: 8.5
2022-01-01
Expert Systems with Applications
Abstract: In real-world applications, the entire feature space may not be known in advance, and features may arrive over time as a stream; such features are called streaming features. Online streaming feature selection aims to select an optimal feature subset on the fly and comprises three main components: discarding irrelevant features, selecting relevant features, and removing redundant features. The core issue of a streaming feature selection framework is therefore how to calculate the relationships between features. This paper applies Rough Set models to discover these relationships because of two crucial advantages: they do not require any domain knowledge, and they can evaluate the selected features as a whole. After formally defining feature relevance, irrelevance, and redundancy from the Rough Set perspective, we analyze and abstract the feature relationship calculation at three levels: Rough Set model, positive region, and consistency calculation. We then design a novel general assembly Rough Set based Streaming Feature Selection Framework, named RS-SFSF, from which new algorithms for different problems can be assembled step by step. Researchers in different areas can quickly build the algorithms they need on top of this framework. To demonstrate the effectiveness of RS-SFSF, we derive four new algorithms from it using the classical Rough Set model, the neighborhood Rough Set model, and the fuzzy Rough Set model. Extensive experiments on twelve real-world datasets indicate the efficiency of the new framework.
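To make the three components concrete, the following is a minimal sketch and not the RS-SFSF framework itself. It assumes the classical Rough Set model only, with discrete feature values, and uses the standard dependency degree (size of the positive region divided by the number of samples) as the feature relationship measure. The function names `dependency` and `stream_select`, the toy data, and the simplification of folding the irrelevance and relevance checks into a single dependency-gain test are all illustrative assumptions, not details taken from the paper.

```python
from collections import defaultdict


def dependency(rows, labels, features):
    """Classical rough-set dependency degree: |POS_B(D)| / |U|.

    `features` is a list of column indices (the subset B).
    """
    if not rows:
        return 0.0
    # Group samples into equivalence classes by their values on `features`.
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        classes[tuple(row[f] for f in features)].append(i)
    # A class belongs to the positive region when all its samples share one label.
    pos = sum(
        len(idx)
        for idx in classes.values()
        if len({labels[i] for i in idx}) == 1
    )
    return pos / len(rows)


def stream_select(rows, labels, feature_stream):
    """Hypothetical streaming loop: process features one at a time."""
    selected = []
    for f in feature_stream:
        base = dependency(rows, labels, selected)
        # Simplified irrelevance/relevance test (an assumption of this sketch):
        # discard the new feature unless it raises the dependency degree.
        if dependency(rows, labels, selected + [f]) <= base:
            continue
        selected.append(f)
        # Redundancy removal: drop any earlier feature whose removal
        # leaves the dependency degree of the selected set unchanged.
        full = dependency(rows, labels, selected)
        for g in list(selected[:-1]):
            rest = [h for h in selected if h != g]
            if dependency(rows, labels, rest) >= full:
                selected = rest
    return selected


# Toy data: column 0 is noise, column 1 is weakly informative,
# column 2 determines the label, column 3 duplicates column 2.
rows = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 1],
]
labels = [0, 0, 1, 1]

print(stream_select(rows, labels, [0, 1, 2, 3]))  # [2]
```

In this toy run, column 0 is discarded on arrival, column 1 is selected and later removed as redundant once column 2 arrives, and column 3 is discarded because it adds nothing to the dependency degree. A neighborhood or fuzzy Rough Set variant would, under the same assumptions, replace only the equivalence-class grouping inside `dependency`.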