Semi-supervised deep learning based on label propagation in a 2D embedded space

Barbara Caroline Benato, Jancarlo Ferreira Gomes, Alexandru Cristian Telea, Alexandre Xavier Falcão
DOI: https://doi.org/10.48550/arXiv.2008.00558
2021-01-15
Abstract: While convolutional neural networks need large labeled sets for training images, expert human supervision of such datasets can be very laborious. Proposed solutions propagate labels from a small set of supervised images to a large set of unsupervised ones to obtain sufficient truly-and-artificially labeled samples to train a deep neural network model. Yet, such solutions need many supervised images for validation. We present a loop in which a deep neural network (VGG-16) is trained from a set with more correctly labeled samples along iterations, created by using t-SNE to project the features of its last max-pooling layer into a 2D embedded space in which labels are propagated using the Optimum-Path Forest semi-supervised classifier. As the labeled set improves along iterations, it improves the features of the neural network. We show that this can significantly improve classification results on test data (using only 1% to 5% of supervised samples) of three private challenging datasets and two public ones.
Computer Vision and Pattern Recognition, Machine Learning
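
The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `pseudo_label_loop` and `propagate_nearest`, the `train`, `extract`, and `project` callables, and the default of three iterations are all hypothetical. In particular, `propagate_nearest` is a plain nearest-labeled-neighbor rule used as a stand-in for the Optimum-Path Forest semi-supervised classifier, and the toy callables at the end stand in for VGG-16 training, feature extraction from the last max-pooling layer, and t-SNE.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Label = Optional[int]  # None marks an unsupervised sample


def propagate_nearest(points: Sequence[Point], seeds: Sequence[Label]) -> List[int]:
    """Stand-in for OPF-based propagation (an assumption, not the paper's method):
    each unlabeled 2D point takes the label of its nearest supervised point."""
    labeled = [(p, s) for p, s in zip(points, seeds) if s is not None]
    if not labeled:
        raise ValueError("at least one supervised sample is required")
    result = []
    for p, s in zip(points, seeds):
        if s is not None:
            result.append(s)  # supervised labels are kept unchanged
            continue
        nearest = min(
            labeled,
            key=lambda q: (q[0][0] - p[0]) ** 2 + (q[0][1] - p[1]) ** 2,
        )
        result.append(nearest[1])
    return result


def pseudo_label_loop(
    samples: Sequence[Any],
    supervised: Sequence[Label],
    train: Callable[[List[Any], List[int]], Any],
    extract: Callable[[Any, Sequence[Any]], List[Any]],
    project: Callable[[List[Any]], List[Point]],
    iterations: int = 3,  # hypothetical default
) -> Tuple[Any, List[Label]]:
    """Alternate between training on the current labeled set and relabeling
    the unsupervised samples in a 2D projection of the learned features."""
    labels: List[Label] = list(supervised)
    model = None
    for _ in range(iterations):
        idx = [i for i, lab in enumerate(labels) if lab is not None]
        model = train([samples[i] for i in idx], [labels[i] for i in idx])
        features = extract(model, samples)   # e.g. deep features of all samples
        points = project(features)           # e.g. a 2D t-SNE embedding
        labels = propagate_nearest(points, supervised)
    return model, labels


# Toy usage: the samples are already 2D, so the callables are trivial stand-ins.
samples = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0)]
supervised = [0, None, None, 1]
model, labels = pseudo_label_loop(
    samples,
    supervised,
    train=lambda xs, ys: None,
    extract=lambda model, xs: list(xs),
    project=lambda feats: list(feats),
)
print(labels)
```

In this sketch the first iteration trains only on the supervised samples, and later iterations train on supervised plus pseudo-labeled samples; propagation always restarts from the original supervised labels so that early pseudo-label errors are not locked in. Both choices are assumptions made for illustration.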