OCFormer: One-Class Transformer Network for Image Classification

Prerana Mukherjee, Chandan Kumar Roy, Swalpa Kumar Roy
DOI: https://doi.org/10.48550/arXiv.2204.11449
2022-04-25
Abstract: We propose a novel deep learning framework based on Vision Transformers (ViT) for one-class classification. The core idea is to use zero-centered Gaussian noise as a pseudo-negative class in the latent space representation and then train the network using the optimal loss function. Prior work has devoted considerable effort to learning good representations with a variety of loss functions that ensure both discriminative and compact properties. The proposed one-class Vision Transformer (OCFormer) is evaluated extensively on the CIFAR-10, CIFAR-100, Fashion-MNIST and CelebA eyeglasses datasets. Our method shows significant improvements over competing CNN-based one-class classifier approaches.
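The pseudo-negative idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the random stand-in features (in place of ViT embeddings), the linear head, the binary cross-entropy objective, the noise scale `sigma`, and all function names are assumptions made only for illustration, since the abstract does not specify them.

```python
import numpy as np

def make_pseudo_negative_batch(features, sigma=0.1, rng=None):
    """Pair latent features of the known class (label 1) with
    zero-centered Gaussian noise of the same shape (label 0)."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(loc=0.0, scale=sigma, size=features.shape)
    x = np.concatenate([features, noise], axis=0)
    y = np.concatenate([np.ones(len(features)), np.zeros(len(noise))])
    return x, y

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_linear_head(x, y, lr=0.5, steps=500):
    """Fit a logistic-regression head with plain gradient descent on
    binary cross-entropy (an assumed stand-in for the paper's loss)."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(x @ w + b)
        grad = p - y                      # d(BCE)/d(logit) per sample
        w -= lr * (x.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def score(x, w, b):
    """Probability that each row of x belongs to the known class."""
    return sigmoid(x @ w + b)

# Hypothetical usage: random vectors stand in for encoder embeddings.
rng = np.random.default_rng(0)
class_features = rng.normal(loc=1.0, scale=0.1, size=(64, 16))
x, y = make_pseudo_negative_batch(class_features, sigma=0.1, rng=rng)
w, b = train_linear_head(x, y)
in_class_scores = score(class_features, w, b)
```

In this sketch only samples of the known class are needed at training time; the noise batch supplies the contrasting label, so a test sample can be thresholded on its score.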
Computer Vision and Pattern Recognition