Abstract: Novelty detection in high-dimensional data is a challenging task due to the masking effect of irrelevant attributes. A common solution is to discover feature subspaces whose attributes are relevant to novelties. Because novelties are highly uncertain in practical applications, ensemble models that combine results from multiple subspaces have proven more effective than single models. In terms of the bias–variance tradeoff, existing ensembles are mostly built on variance reduction. However, combining poor detectors can degrade the performance of an ensemble. To this end, this paper proposes an ensemble detector that addresses variance and bias reduction simultaneously. Our ensemble is referred to as Selective Feature Bagging (SFB), since it is built on Feature Bagging (FB). To improve accuracy without reducing the diversity of the base detectors in FB, we resort to the notion of dynamic classifier selection, which has proven effective in classification. During the ensemble generation phase, base detectors are produced and categorized into groups distinguished by the dimensionality of the subspace used for training. The purpose of this design is to maintain diversity. Then, at test time, the most competent base detector from each group is dynamically selected and used to make a decision on the test pattern. The purpose of this design is to enhance accuracy. We verify the effectiveness of SFB on 15 data sets from the KEEL repository. Experimental results show that SFB outperforms FB by a statistically significant margin. In addition, SFB also outperforms several state-of-the-art methods.
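The two-phase idea described in the abstract (grouping base detectors by subspace dimensionality, then selecting one detector per group for each test pattern) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy `CentroidDetector`, the competence measure (lowest mean novelty score on the test pattern's k nearest training neighbours), averaging as the combination rule, and all function and parameter names.

```python
import math
import random


class CentroidDetector:
    """Toy one-class detector (assumption): distance to the training centroid
    inside a random feature subspace."""

    def __init__(self, features, train):
        self.features = features
        n = len(train)
        self.centroid = [sum(row[f] for row in train) / n for f in features]
        # Normalize by the largest training distance so that scores from
        # subspaces of different dimensionality are comparable.
        self.radius = max(self._dist(row) for row in train) or 1.0

    def _dist(self, row):
        return math.dist([row[f] for f in self.features], self.centroid)

    def score(self, row):
        """Normalized novelty score; above 1 means outside the training radius."""
        return self._dist(row) / self.radius


def build_groups(train, dims, per_group, rng):
    """Generation phase: one group of detectors per subspace dimensionality."""
    n_features = len(train[0])
    return {
        d: [CentroidDetector(rng.sample(range(n_features), d), train)
            for _ in range(per_group)]
        for d in dims
    }


def predict(groups, train, x, k=5):
    """Test time: pick one detector per group, then average their scores."""
    # Local region of x: its k nearest training patterns in the full space.
    region = sorted(train, key=lambda row: math.dist(row, x))[:k]
    selected = []
    for detectors in groups.values():
        # Hypothetical competence measure: the detector that scores the
        # local (normal) region as least novel is treated as most competent.
        best = min(detectors,
                   key=lambda det: sum(det.score(r) for r in region) / k)
        selected.append(best.score(x))
    return sum(selected) / len(selected)


rng = random.Random(0)
train = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(200)]
groups = build_groups(train, dims=[3, 4, 5], per_group=4, rng=rng)

normal_score = predict(groups, train, [0.0] * 6)  # near the training data
novel_score = predict(groups, train, [8.0] * 6)   # far from the training data
```

In this sketch the groups preserve diversity (one vote per subspace dimensionality), while the per-group selection step tries to discard detectors that model the neighbourhood of the test pattern poorly. A realistic implementation would replace the toy detector with a proper one-class classifier and use whatever competence measure the paper defines.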

Two approaches for novelty detection using random forest.

Privacy preserving and fast decision for novelty detection using support vector data description

A New Rotation Forest Ensemble Algorithm

Selective Feature Bagging of one-class classifiers for novelty detection in high-dimensional data

Probabilistic Modeling for Novelty Detection with Applications to Fraud Identification

The empirical impact of the nature of novelty detection

A New Random Forest Ensemble of Intuitionistic Fuzzy Decision Trees

Random Similarity Forests

Text Classification with Novelty Detection

Variation of the glass transition temperature with rigidity and chemical composition

The nature of novelty detection

Random Forests for Adaptive Nearest Neighbor Estimation of Information-Theoretic Quantities

The active leaning-based nearest neighbor mean distance novelty detection for large data set

Active Learning Based Support Vector Data Description Method for Robust Novelty Detection

Self-Supervised Random Forest on Transformed Distribution for Anomaly Detection

Isolation Forest Based Anomaly Detection Framework on Non-IID Data

Open Set Recognition for Random Forest

Novelty Detection and Online Learning for Chunk Data Streams

A Novel Consistent Random Forest Framework: Bernoulli Random Forests

Nonparametric feature selection by random forests and deep neural networks

A Tsetlin Machine Framework for Universal Outlier and Novelty Detection