Robust Prototype-Based Learning on Data Streams

Junming Shao, Feng Huang, Qinli Yang, Guangchun Luo
DOI: https://doi.org/10.1109/tkde.2017.2772239
IF: 9.235
2018-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: In this paper, we propose SyncStream, a prototype-based classification model for evolving data streams that dynamically models time-changing concepts and makes predictions in a local fashion. Instead of learning a single model on a fixed or adaptive sliding window of historical data, or learning an ensemble of weighted base classifiers, SyncStream captures evolving concepts by dynamically maintaining a set of prototypes in a proposed P-Tree. The prototypes are obtained through error-driven representativeness learning and synchronization-inspired constrained clustering. To identify abrupt concept drifts in data streams, heuristic approaches based on PCA and statistical analysis are introduced. To further learn the associations among distributed data streams, an extended P-Tree structure and a KNN-style strategy are introduced. We demonstrate that our new data stream classification approach has several attractive benefits: (a) SyncStream is capable of dynamically modeling evolving concepts from even a small set of prototypes. (b) Owing to synchronization-based constrained clustering and the P-Tree, SyncStream supports efficient and effective data representation and maintenance. (c) SyncStream is also tolerant of inappropriate or noisy examples via error-driven representativeness learning. (d) SyncStream allows learning relationships among distributed data streams at the instance level. The experimental results indicate its efficiency and effectiveness.
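To make the idea of error-driven prototype maintenance concrete, here is a minimal Python sketch. It is not the SyncStream algorithm: it omits the P-Tree, the synchronization-based constrained clustering, and the PCA/statistical drift detection, and it assumes a flat, bounded prototype list with a simple +1/-1 score update. The names PrototypeStreamClassifier, Prototype, rep, and max_prototypes are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Prototype:
    x: tuple          # feature vector of the stored example
    label: str        # class label of the stored example
    rep: float = 0.0  # representativeness score (hypothetical name and update rule)


class PrototypeStreamClassifier:
    """Simplified nearest-prototype classifier for a labeled stream (illustrative only)."""

    def __init__(self, max_prototypes=50):
        self.max_prototypes = max_prototypes  # assumed fixed memory budget
        self.prototypes = []

    def _nearest(self, x):
        # Return the stored prototype closest to x in Euclidean distance.
        if not self.prototypes:
            return None
        return min(self.prototypes, key=lambda p: math.dist(p.x, x))

    def predict(self, x):
        # Local prediction: the label of the single nearest prototype.
        p = self._nearest(x)
        return None if p is None else p.label

    def learn(self, x, label):
        # Error-driven update: reward the nearest prototype if it would have
        # predicted the true label, penalize it otherwise.
        p = self._nearest(x)
        if p is not None:
            p.rep += 1.0 if p.label == label else -1.0
        # Store the new example as a prototype.
        self.prototypes.append(Prototype(tuple(x), label))
        # When over budget, drop the prototype with the lowest score.
        if len(self.prototypes) > self.max_prototypes:
            worst = min(self.prototypes, key=lambda q: q.rep)
            self.prototypes = [q for q in self.prototypes if q is not worst]


clf = PrototypeStreamClassifier(max_prototypes=4)
for x, y in [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((5, 6), "b")]:
    clf.learn(x, y)
before = (clf.predict((0.2, 0.1)), clf.predict((5.1, 5.2)))

# A conflicting example near (0, 1) penalizes and evicts the outdated prototype.
clf.learn((0, 0.9), "b")
after = clf.predict((0, 0.95))
```

In this sketch, prototypes that repeatedly mislead predictions accumulate negative scores and are evicted first, which loosely mirrors the tolerance to noisy or outdated examples that the abstract attributes to error-driven representativeness learning.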