Unsupervised Object Discovery Via Object-Centric Representation

Bingfei Fu, Xiangyang Xue
DOI: https://doi.org/10.1109/icme57554.2024.10687770
2024-01-01
Abstract: Unsupervised object discovery localizes potential objects without any supervision and has broad application prospects, such as detecting skin disease and monitoring water waste. However, existing research focuses on category-agnostic object discovery, which makes it hard to discover a specific set of categories without an explicit supervisory signal. To solve this problem, this paper proposes a novel Object-Centric Learning (OCL) framework, built upon a pretrained Vision Transformer (ViT) model, to learn a set of latent representations of specified objects. These representations are gradually refined by a slot-attention mechanism, which allows the model to further differentiate the representations of different object categories. A background-completion self-supervised training task is further proposed to improve the generalization ability of the model in real-world scenarios. Experimental results demonstrate that OCL achieves state-of-the-art performance (77.29%, 29.88% and 57.96%) on the skin cancer database, the PH2 database and the UAV-BD dataset, respectively.
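The iterative slot refinement mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the generic slot-attention idea (slots compete for input features through a softmax over slots, then each slot is updated to the weighted mean of the features it claimed) and omits the learned projections, normalization layers and recurrent update used in full slot-attention modules. The function names, the toy feature vectors standing in for ViT patch features, and all parameters are hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def slot_attention_step(slots, features, eps=1e-8):
    """One simplified refinement step (assumption: no learned weights)."""
    dim = len(features[0])
    scale = 1.0 / math.sqrt(dim)
    # attn[i][k]: share of feature i claimed by slot k.
    # The softmax runs over slots, so slots compete for each feature.
    attn = []
    for f in features:
        logits = [scale * sum(a * b for a, b in zip(f, s)) for s in slots]
        attn.append(softmax(logits))
    # Each slot becomes the attention-weighted mean of the features.
    new_slots = []
    for k in range(len(slots)):
        weights = [row[k] for row in attn]
        total = sum(weights) + eps
        new_slots.append([
            sum(w * f[d] for w, f in zip(weights, features)) / total
            for d in range(dim)
        ])
    return new_slots, attn


def refine_slots(features, num_slots, num_iters=3, seed=0):
    """Randomly initialise slots, then refine them for num_iters steps."""
    rng = random.Random(seed)
    dim = len(features[0])
    slots = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
             for _ in range(num_slots)]
    attn = []
    for _ in range(num_iters):
        slots, attn = slot_attention_step(slots, features)
    return slots, attn


if __name__ == "__main__":
    # Toy "patch features": two loose groups in a 2-D feature space.
    features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    slots, attn = refine_slots(features, num_slots=2)
    for i, row in enumerate(attn):
        print(i, [round(a, 3) for a in row])
```

Each row of `attn` sums to one, so every feature is divided among the slots. In a full model this competition is what encourages different slots to specialise on different objects.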