Self-Supervised Masked Convolutional Transformer Block for Anomaly Detection

Neelu Madan, Nicolae-Catalin Ristea, Radu Tudor Ionescu, Kamal Nasrollahi, Fahad Shahbaz Khan, Thomas B. Moeslund, Mubarak Shah
DOI: https://doi.org/10.1109/TPAMI.2023.3322604
2023-10-05
Abstract: Anomaly detection has recently gained increasing attention in the field of computer vision, likely due to its broad set of applications ranging from product fault detection on industrial production lines and impending event detection in video surveillance to finding lesions in medical scans. Regardless of the domain, anomaly detection is typically framed as a one-class classification task, where the learning is conducted on normal examples only. An entire family of successful anomaly detection methods is based on learning to reconstruct masked normal inputs (e.g. patches, future frames, etc.) and exerting the magnitude of the reconstruction error as an indicator for the abnormality level. Unlike other reconstruction-based methods, we present a novel self-supervised masked convolutional transformer block (SSMCTB) that comprises the reconstruction-based functionality at a core architectural level. The proposed self-supervised block is extremely flexible, enabling information masking at any layer of a neural network and being compatible with a wide range of neural architectures. In this work, we extend our previous self-supervised predictive convolutional attentive block (SSPCAB) with a 3D masked convolutional layer, a transformer for channel-wise attention, as well as a novel self-supervised objective based on Huber loss. Furthermore, we show that our block is applicable to a wider variety of tasks, adding anomaly detection in medical images and thermal videos to the previously considered tasks based on RGB images and surveillance videos. We exhibit the generality and flexibility of SSMCTB by integrating it into multiple state-of-the-art neural models for anomaly detection, bringing forth empirical results that confirm considerable performance improvements on five benchmarks. We release our code and data as open source at: https://github.com/ristea/ssmctb.
Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
The paper addresses anomaly detection in computer vision. The authors propose the Self-Supervised Masked Convolutional Transformer Block (SSMCTB), which learns through self-supervision to reconstruct masked information, so that the magnitude of the reconstruction error can serve as an indicator of abnormality. The main contributions of SSMCTB are:

1. **Masked convolution**: A mask is applied to the central region of the convolutional kernel, so the network must rely on the surrounding visible information to reconstruct the masked part. This trains the network to recover missing or occluded content from context.
2. **Integration into neural networks**: SSMCTB can be embedded as an independent module into various existing architectures, both convolution-based and transformer-based, to improve their performance on anomaly detection tasks.
3. **Extension to 3D convolution**: In addition to the standard 2D masked convolution, the paper introduces a 3D masked convolution, which allows SSMCTB to be applied to 3D inputs such as video data.
4. **Multi-head self-attention**: Compared with the authors' previous block (SSPCAB), SSMCTB replaces a simple channel attention module with a more powerful multi-head self-attention module for channel-wise attention.
5. **Improved loss function**: The self-supervised objective uses Huber loss instead of Mean Squared Error (MSE) loss, which improves robustness to outliers.

The experiments evaluate SSMCTB on five benchmarks covering image and video data from domains such as industrial inspection, surveillance, and medicine. The results show that integrating SSMCTB into existing state-of-the-art models yields considerable improvements in anomaly detection performance.
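To make the masked convolution idea concrete, here is a minimal sketch in plain Python. It is a simplification under stated assumptions, not the authors' implementation: the function name `masked_conv2d`, the 3x3 kernel, the hand-set averaging weights, and the toy patch are all hypothetical, and the kernel layout used in the paper may differ. The sketch only shows the principle that the center of the receptive field is hidden, so the value at that position is predicted from its neighbors and the gap between prediction and actual value acts as a reconstruction error.

```python
def masked_conv2d(image, kernel):
    """Valid 2D cross-correlation with the kernel's center weight zeroed."""
    k = len(kernel)
    center = k // 2
    # Copy the kernel so the caller's weights are not modified, then mask the center.
    masked = [row[:] for row in kernel]
    masked[center][center] = 0.0

    rows, cols = len(image), len(image[0])
    out = []
    for i in range(rows - k + 1):
        out_row = []
        for j in range(cols - k + 1):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    acc += masked[di][dj] * image[i + di][j + dj]
            out_row.append(acc)
        out.append(out_row)
    return out


# Hypothetical 3x3 patch: uniform background with an unusual center value.
patch = [[1.0, 1.0, 1.0],
         [1.0, 100.0, 1.0],
         [1.0, 1.0, 1.0]]

# Assumption: fixed averaging weights stand in for weights that would be learned.
kernel = [[0.125] * 3 for _ in range(3)]

predicted = masked_conv2d(patch, kernel)[0][0]  # depends only on the 8 neighbors
error = abs(patch[1][1] - predicted)            # large error suggests an anomaly
print(predicted, error)
```

Because the center weight is zero, changing the center value of `patch` leaves `predicted` unchanged. Only the error changes, which is what makes it usable as an abnormality signal.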
In summary, the paper proposes SSMCTB, a flexible self-supervised block that builds masked reconstruction into the architecture itself, and shows that adding it to existing anomaly detection models improves their performance across image, video, and medical data.
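The robustness argument for replacing MSE with Huber loss can be illustrated with a short sketch. This uses the standard textbook definition of Huber loss; the threshold `delta=1.0` and the sample residuals are assumptions chosen for illustration, not values taken from the paper. The comparison function `half_squared` is a squared error scaled by one half so that both losses agree for small residuals.

```python
def huber(residual, delta=1.0):
    """Standard Huber loss: quadratic near zero, linear beyond delta."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def half_squared(residual):
    """Squared error scaled by 1/2, matching Huber inside the quadratic region."""
    return 0.5 * residual * residual


# Hypothetical residuals: one small, two increasingly outlier-like.
for r in (0.5, 3.0, 10.0):
    print(r, huber(r), half_squared(r))
```

For the small residual the two losses coincide, while for the large ones the Huber loss grows linearly rather than quadratically, so a few extreme reconstruction errors contribute less to the training objective.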