Improving Active Learning by Data Balance to Reduce Annotation Efforts

Han Lei, Shuai Wang, Dezhi Zheng, Xiaolei Qu, Shangchun Fan, Chongyang Cui
DOI: https://doi.org/10.1049/joe.2018.9076
2019-01-01
Journal of Engineering
Abstract: Image classification is a fundamental task in image analysis. Recent advances in deep learning have achieved promising results on many image classification benchmarks. However, in some tasks, especially in biomedical image analysis, preparing a large number of labelled images for model training is costly and impractical. In this study, the authors address two questions: with limited labelling effort (e.g. time, cost and manpower), which instances should be chosen for annotation, and how should the model be trained on the limited annotated data? To that end, they present an active learning algorithm combined with data balancing, in which the model (e.g. a convolutional neural network) is fine-tuned continuously and incrementally. This reduces the labelling effort and makes training more robust and efficient in both binary and multi-class classification while maintaining high performance. They evaluated their method on both a binary natural image dataset and a three-class biomedical dataset, demonstrating that active learning with data balancing makes model training more robust and extends active learning to multi-class classification and further application scenarios. More significantly, their experiments showed that the method could save at least half of the labelling effort while still reaching satisfactory performance.
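The abstract does not say how instances are scored or how balance is enforced, so the following is only an illustrative sketch of one way the two ideas could be combined, not the authors' algorithm. It assumes entropy-based uncertainty sampling and a round-robin pick over the model's predicted classes; the function names `entropy` and `select_balanced_batch` and the `budget` parameter are hypothetical.

```python
import math
from collections import defaultdict


def entropy(probs):
    """Shannon entropy of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_balanced_batch(pred_probs, budget):
    """Choose up to `budget` unlabelled indices to send for annotation.

    Assumption (not from the paper): candidates are ranked by entropy and
    drawn round-robin across predicted classes so that no single class
    dominates the newly labelled batch.
    """
    # Group candidates by the class the current model predicts for them.
    by_class = defaultdict(list)
    for idx, probs in enumerate(pred_probs):
        predicted = max(range(len(probs)), key=probs.__getitem__)
        by_class[predicted].append((entropy(probs), idx))

    # Within each predicted class, put the most uncertain candidates first.
    queues = [sorted(items, reverse=True) for _, items in sorted(by_class.items())]

    # Take one candidate per class in turn until the budget is spent
    # or every queue is empty.
    chosen = []
    while len(chosen) < budget and any(queues):
        for queue in queues:
            if queue and len(chosen) < budget:
                chosen.append(queue.pop(0)[1])
    return chosen
```

In an incremental loop, the selected indices would be labelled, added to the training set, and the model fine-tuned before the remaining pool is scored again. Balancing on predicted classes is a stand-in here, since true labels are unknown before annotation.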