Adaptive active learning through k-nearest neighbor optimized local density clustering

Xia Ji,WanLi Ye,XueJun Li,Peng Zhao,Sheng Yao
DOI: https://doi.org/10.1007/s10489-022-04169-w
IF: 5.3
2022-01-01
Applied Intelligence
Abstract:Active learning iteratively constructs a refined training set to train an effective classifier with as few labeled instances as possible. In areas where labeling is expensive, active learning plays an important and irreplaceable role. The main challenge of active learning is to correctly identify critical samples. One of the current mainstream methods is to mine the potential data structure based on clustering and then identify key instances. However, the existing methods all adopt deterministic strategies, and the number of key samples is only related to the number of samples to be classified. The internal structure information of the sample clusters to be classified is not used. After analysis and verification, this deterministic key sample selection strategy has serious label waste. This is a serious problem that urgently needs to be solved in active learning. To this end, we propose an adaptive active learning algorithm based on density clustering (AAKC). Firstly, we introduce k-nearest neighbor information to redefine the local density of the instance. The new sample density can clearly express the local structural information of the sample. Secondly, we developed an adaptive key instance selection strategy based on the k-nearest neighbor sample density, which can adaptively select the necessary number of instance queries according to the structural information of the instance clusters to be classified, avoiding label waste. The experimental results of comparison with other algorithms show that our algorithm uses fewer labels to achieve better classification accuracy and has excellent stability.
What problem does this paper attempt to address?