Classification with Streaming Features: an Emerging-Pattern Mining Approach

Kui Yu, Wei Ding, Dan A. Simovici, Hao Wang, Jian Pei, Xindong Wu
DOI: https://doi.org/10.1145/2700409
IF: 4.157
2015-01-01
ACM Transactions on Knowledge Discovery from Data
Abstract: Many datasets from real-world applications have very high-dimensional or growing feature spaces. Learning and maintaining a classifier that can deal with very high dimensionality or streaming features is a new research problem. In this article, we adapt the well-known emerging-pattern-based classification models and propose a semi-streaming approach. For streaming features, it is computationally expensive or even prohibitive to mine long emerging patterns, and it is nontrivial to integrate emerging-pattern mining with feature selection. We present an online feature selection step, which is capable of selecting and maintaining a pool of effective features from a feature stream. Then, in an offline step that is separated from the online step, we periodically compute and update emerging patterns from the pool of features selected by the online step. We evaluate the effectiveness and efficiency of the proposed method using a series of benchmark datasets and a real-world case study on Mars crater detection. Our proposed method yields classification performance comparable to state-of-the-art static classification methods. Most importantly, the proposed method is significantly faster and can efficiently handle datasets with streaming features.
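The two-step structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the online step below uses a plain mutual-information threshold as a stand-in for the paper's feature selection, and the offline step is a brute-force search for short patterns whose support growth rate between classes meets a threshold. All names and parameters (semi_streaming, min_relevance, min_growth, max_len, mine_every) are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import log


def mutual_information(column, labels):
    """Mutual information (in nats) between a discrete feature and the labels."""
    n = len(labels)
    joint, px, py = {}, {}, {}
    for x, y in zip(column, labels):
        joint[(x, y)] = joint.get((x, y), 0) + 1
        px[x] = px.get(x, 0) + 1
        py[y] = py.get(y, 0) + 1
    return sum((c / n) * log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def support(pattern, rows):
    """Fraction of rows that contain every (feature, value) item of the pattern."""
    if not rows:
        return 0.0
    return sum(all(row[f] == v for f, v in pattern) for row in rows) / len(rows)


def mine_emerging_patterns(pool, labels, target, min_growth=2.0, max_len=2):
    """Offline step (assumed form): keep short patterns whose support in the
    target class is at least min_growth times their support elsewhere."""
    names = sorted(pool)
    rows = [{f: pool[f][i] for f in names} for i in range(len(labels))]
    pos = [r for r, y in zip(rows, labels) if y == target]
    neg = [r for r, y in zip(rows, labels) if y != target]
    found = []
    for k in range(1, max_len + 1):
        for feats in combinations(names, k):
            # Candidate patterns are the value combinations seen in the target class.
            for pattern in {tuple((f, r[f]) for f in feats) for r in pos}:
                s_pos, s_neg = support(pattern, pos), support(pattern, neg)
                growth = float("inf") if s_neg == 0 else s_pos / s_neg
                if growth >= min_growth:
                    found.append((pattern, growth))
    return found


def semi_streaming(feature_stream, labels, target, min_relevance=0.1, mine_every=3):
    """Online step: admit each arriving feature to the pool if it looks relevant.
    Offline step: every mine_every arrivals, re-mine patterns from the pool."""
    pool, patterns = {}, []
    for i, (name, column) in enumerate(feature_stream, start=1):
        if mutual_information(column, labels) >= min_relevance:
            pool[name] = column
        if i % mine_every == 0 and pool:
            patterns = mine_emerging_patterns(pool, labels, target)
    return pool, patterns
```

Capping pattern length with max_len reflects the abstract's remark that mining long emerging patterns is expensive. Restricting the search to the selected pool is what keeps the offline step tractable in this sketch.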