Abstract: Despite their simplicity, low computational cost, and modest data requirements, parametric machine learning algorithms such as linear discriminant analysis, quadratic discriminant analysis, and logistic regression suffer from serious drawbacks, including linearity, poor fit of the features to the usually imposed normal distribution, and difficulties with high-dimensional data. The batch kernel-based nonparametric classifier, which overcomes the linearity and normality constraints on the features, represents an interesting alternative for supervised classification problems. However, it suffers from the "curse of dimensionality". This problem can be alleviated by the very large sample sizes available in the era of big data, but large-scale data pose challenges for storing the data and computing the classifier. These challenges make the classical batch nonparametric classifier no longer applicable. This motivates us to develop a fast algorithm adapted to the real-time calculation of the nonparametric classifier in massive as well as streaming data frameworks. The online classifier consists of two steps. First, we use an online principal component analysis to reduce the dimension of the features at a very low computational cost. Then, a stochastic approximation algorithm is deployed to obtain a real-time calculation of the nonparametric classifier. The proposed methods are evaluated and compared to some commonly used machine learning algorithms for real-time fetal well-being monitoring. The study shows that, in terms of accuracy, both the offline (batch) and the online classifiers are competitive with the random forest algorithm. Moreover, we show that the online classifier offers a better trade-off between accuracy and computational cost than the offline classifier.
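
To make the two-step procedure concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes Oja's rule with QR re-orthonormalization for the online principal component analysis step, and a Robbins-Monro-style recursion with a Gaussian kernel for the nonparametric classifier, evaluated at a fixed set of query points. The class names `OjaPCA` and `RecursiveKernelClassifier`, the step sizes, and the bandwidth schedule are all illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


class OjaPCA:
    """Streaming PCA via Oja's rule (an assumed choice, not from the paper)."""

    def __init__(self, dim, n_components, seed=0):
        rng = np.random.default_rng(seed)
        self.n = 0
        self.mean = np.zeros(dim)
        # Start from a random orthonormal basis of shape (dim, n_components).
        self.W, _ = np.linalg.qr(rng.standard_normal((dim, n_components)))

    def update(self, x):
        self.n += 1
        self.mean += (x - self.mean) / self.n       # running mean
        xc = x - self.mean                          # centered observation
        lr = 1.0 / (self.n + 10)                    # assumed step-size schedule
        self.W += lr * np.outer(xc, xc @ self.W)    # Oja gradient step
        self.W, _ = np.linalg.qr(self.W)            # keep columns orthonormal

    def transform(self, x):
        return (x - self.mean) @ self.W


class RecursiveKernelClassifier:
    """Recursive kernel scores s_n = (1 - g_n) s_{n-1} + g_n K_h(q - z_n) 1{y_n = k}."""

    def __init__(self, query_points, n_classes, h0=1.0):
        self.q = np.atleast_2d(np.asarray(query_points, dtype=float))  # (m, d)
        self.scores = np.zeros((self.q.shape[0], n_classes))
        self.h0 = h0
        self.n = 0

    def update(self, z, y):
        self.n += 1
        d = self.q.shape[1]
        h = self.h0 * self.n ** (-1.0 / (d + 4))    # assumed bandwidth schedule
        sq_dist = np.sum((self.q - z) ** 2, axis=1)
        k = np.exp(-0.5 * sq_dist / h**2) / h**d    # Gaussian kernel weights
        gamma = 1.0 / self.n                        # assumed step size
        self.scores *= 1.0 - gamma
        self.scores[:, y] += gamma * k              # only the observed class gains mass

    def predict(self):
        # Plug-in rule: pick the class with the largest estimated score.
        return np.argmax(self.scores, axis=1)
```

In a streaming setting, each incoming pair `(x, y)` would be passed once to `pca.update(x)` and then to `clf.update(pca.transform(x), y)`, so neither step needs to store past observations. Because the projection keeps changing while the basis is being learned, a practical version would have to decide how to handle scores accumulated under earlier projections; this sketch ignores that issue.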

Online learning for streaming data classification in nonstationary environments

New algorithm for online classification over data streams based on max-frequency patterns

Clustering-based Active Learning Classification towards Data Stream

A Framework for On-Demand Classification of Evolving Data Streams

Discovering an Evolutionary Classifier over a High-speed Nonstatic Stream

SACCOS: A Semi-Supervised Framework for Emerging Class Detection and Concept Drift Adaption Over Data Streams

Online Active Learning for Drifting Data Streams

Online Boosting Adaptive Learning under Concept Drift for Multistream Classification

Online updating mode learning for streaming datasets

On Demand Classification of Data Streams

Streaming Classification with Emerging New Class by Class Matrix Sketching

Non-stationary data sequence classification using online class priors estimation

A Framework of Sparse Online Learning and Its Applications

A Framework of Online Learning with Imbalanced Streaming Data

Imbalanced Data Stream Classification using Dynamic Ensemble Selection

Online User Modeling for Interactive Streaming Image Classification

An Automatic Construction and Organization Strategy for Ensemble Learning on Data Streams

Online ensemble learning algorithm for imbalanced data stream

Online Learning from Incomplete Data Streams for Multi-classification

Online Nonparametric Supervised Learning for Massive Data

A grid density based framework for classifying streaming data in the presence of concept drift