Learning adaptive criteria weights for active semi-supervised learning

Hao Li, Yongli Wang, Yanchao Li, Gang Xiao, Peng Hu, Ruxin Zhao, Bo Li
DOI: https://doi.org/10.1016/j.ins.2021.01.045
IF: 8.1
2021-06-01
Information Sciences
Abstract: Batch mode active learning (BMAL) aims to train trustworthy models from scarce labeled samples by efficiently querying an expert for the ground-truth annotations of the most beneficial unlabeled points. BMAL algorithms typically sample points according to well-designed criteria such as (un)certainty and representativeness. However, existing BMAL approaches share one limitation: they integrate the sampling criteria with fixed weights when selecting instances for supervised training. This may yield suboptimal batch acquisition, because the criteria values of the many candidate unlabeled samples fluctuate after the classifier is retrained on the newly augmented training set. Instead, the weights of the sampling criteria should be allocated appropriately. To overcome this problem, this work proposes a novel Adaptive Criteria Weights batch selection algorithm, abbreviated ACW, which dynamically adjusts the importance of (un)certainty and representativeness to choose critical instances for semi-supervised learning. A submodular function is employed to identify a diverse mini-batch from the selected batch of samples. We apply the proposed ACW batch sampling algorithm to two essential semi-supervised tasks: semi-supervised classification and semi-supervised clustering. To the best of our knowledge, this work is the first attempt to explore an adaptive mechanism for criteria weights in the context of active learning. Experimental results demonstrate the superiority and effectiveness of ACW over state-of-the-art BMAL approaches.
computer science, information systems
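
The abstract describes a two-stage selection: score unlabeled points by a weighted combination of (un)certainty and representativeness, then pick a diverse mini-batch with a submodular function. The sketch below illustrates that general shape only and is not the authors' ACW implementation. Every function name is hypothetical, entropy and mean pool similarity are assumed stand-ins for the two criteria, and a facility-location objective is an assumed choice of submodular function. The weight `w` is passed in by the caller; the abstract does not say how ACW adapts it, so no adaptation rule is attempted here.

```python
import numpy as np


def entropy_uncertainty(probs):
    """Shannon entropy of each row of class probabilities (higher = less certain)."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)


def representativeness(sim):
    """Mean similarity of each unlabeled point to the whole unlabeled pool."""
    return sim.mean(axis=1)


def min_max(x):
    """Rescale scores to [0, 1] so the two criteria are comparable."""
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span


def combined_scores(probs, sim, w):
    """Weighted sum of both criteria; w (assumed in [0, 1]) weights uncertainty."""
    unc = min_max(entropy_uncertainty(probs))
    rep = min_max(representativeness(sim))
    return w * unc + (1.0 - w) * rep


def greedy_facility_location(sim, candidates, k):
    """Greedily pick k candidates maximizing sum_i max_{j in S} sim[i, j]."""
    sub = sim[np.ix_(candidates, candidates)].astype(float)
    covered = np.zeros(len(candidates))  # best similarity to the chosen set so far
    chosen = []
    for _ in range(min(k, len(candidates))):
        # Marginal gain of adding each candidate j to the chosen set.
        gains = np.maximum(sub, covered[:, None]).sum(axis=0) - covered.sum()
        if chosen:
            gains[chosen] = -np.inf  # never pick the same point twice
        best = int(np.argmax(gains))
        chosen.append(best)
        covered = np.maximum(covered, sub[:, best])
    return [int(candidates[i]) for i in chosen]


def select_batch(probs, sim, w, batch_size, mini_batch_size):
    """Stage 1: top-scoring batch. Stage 2: diverse mini-batch from that batch."""
    scores = combined_scores(probs, sim, w)
    batch = np.argsort(-scores)[:batch_size]
    return greedy_facility_location(sim, batch, mini_batch_size)
```

Here `probs` would be the classifier's predicted class probabilities for the unlabeled pool and `sim` a pairwise similarity matrix over the same pool. In an active learning loop, `select_batch` would be called again after each retraining, since both inputs change once newly labeled points are added.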