Abstract: Annotating large amounts of data is expensive and time-consuming, and active learning is a suitable way to reduce the annotation effort. This paper proposes a novel active learning approach, coupled K nearest neighbor pseudo pruning (CKNNPP), which queries examples using the KNNPP method. For each unlabeled sample, KNNPP uses the k-nearest-neighbor technique to find its k nearest neighbors among the labeled samples. When these k labeled samples do not all belong to the same class, the corresponding unlabeled sample is queried, given its correct label by the supervisor, and added to the labeled training set. Otherwise, the unlabeled sample is not selected and is pruned; this is the pseudo pruning. The definition is inspired by K nearest neighbor pruning preprocessing. The samples selected by KNNPP are considered to lie near or on the optimal classification hyperplane, which is crucial for active learning. To avoid a shift of the optimal classification hyperplane after a queried sample is added, CKNNPP queries two samples with different class labels (a couple, annotated by the supervisor) by KNNPP and adds them to the training set simultaneously in each iteration. CKNNPP is simple, effective, and robust, and, compared with existing methods, it can handle classification problems with unbalanced datasets. The computational complexity of CKNNPP is also analyzed. In addition, a new stopping criterion is applied in the proposed method, and the classifier in each active learning iteration is a Lagrangian Support Vector Machine. Finally, twelve UCI datasets, image datasets of aircraft, and a dataset of radar high-resolution range profiles are used to validate the feasibility and effectiveness of the proposed method. The results show that CKNNPP achieves superior performance compared with seven other state-of-the-art active learning approaches.
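
The KNNPP selection rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `knnpp_candidates`, the use of Euclidean distance, the default `k=3`, and the toy data are all assumptions made for illustration. The sketch only marks which unlabeled samples would be queried (their k nearest labeled neighbors disagree on class); the coupling step, the stopping criterion, and the Lagrangian SVM classifier are not shown.

```python
import numpy as np

def knnpp_candidates(X_labeled, y_labeled, X_unlabeled, k=3):
    """Return indices of unlabeled samples whose k nearest labeled
    neighbors do not all share one class (hypothetical helper)."""
    # Squared Euclidean distances, shape (n_unlabeled, n_labeled).
    # Euclidean distance is an assumption; the abstract names no metric.
    diff = X_unlabeled[:, None, :] - X_labeled[None, :, :]
    dist2 = np.sum(diff ** 2, axis=2)
    # Indices of the k nearest labeled samples for each unlabeled sample.
    nn_idx = np.argsort(dist2, axis=1)[:, :k]
    nn_labels = y_labeled[nn_idx]
    # A sample is a query candidate when its neighbors' labels disagree;
    # otherwise it is "pseudo pruned" (left out of this round).
    mixed = np.any(nn_labels != nn_labels[:, [0]], axis=1)
    return np.flatnonzero(mixed)

# Toy 1-D data: class 0 clusters near 0, class 1 clusters near 1.
X_labeled = np.array([[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]])
y_labeled = np.array([0, 0, 0, 1, 1, 1])
X_unlabeled = np.array([[0.05], [0.6], [1.15]])

picked = knnpp_candidates(X_labeled, y_labeled, X_unlabeled, k=3)
print(picked)  # only the sample at 0.6 lies between the two classes
```

In this toy case only the middle sample has neighbors from both classes, so it is the only one that would be sent to the supervisor; the other two are pruned for this round.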

Adaptive active learning through k-nearest neighbor optimized local density clustering

Active Learning Through Two-Stage Clustering.

Active learning based on K-nearest neighbor density of local sparse

Active Learning Through Density Clustering

Distributed Active Learning.

Active and Passive Nearest Neighbor Algorithm: A Newly-Developed Supervised Classifier

A Novel Density Peaks Clustering Algorithm Based on K Nearest Neighbors with Adaptive Merging Strategy

Improved KNN Algorithm based on Probability and Adaptive K Value.

Active learning based on coupled KNN pseudo pruning

Three-way Active Learning Through Clustering Selection.

An Adaptive Large Margin Nearest Neighbor Classification Algorithm

Tri-partition Cost-Sensitive Active Learning Through Knn

Clustering Based on Adaptive Local Density with Evidential Assigning Strategy

A Locally Adaptive Multi-Label k-Nearest Neighbor Algorithm.

New Balanced Active Learning Model and Optimization Algorithm.

Active Learning Through Label Error Statistical Methods

Robust Dimension Reduction for Clustering with Local Adaptive Learning.

Multiple Kernel Active Learning for Image Classification

A New Locally Adaptive K-Nearest Neighbor Algorithm Based on Discrimination Class

A novel density-based clustering algorithm using nearest neighbor graph

Clustering Environment Aware Learning for Active Domain Adaptation