Abstract: Anomaly detection has recently gained increasing attention in the field of computer vision, likely due to its broad set of applications ranging from product fault detection on industrial production lines and impending event detection in video surveillance to finding lesions in medical scans. Regardless of the domain, anomaly detection is typically framed as a one-class classification task, where the learning is conducted on normal examples only. An entire family of successful anomaly detection methods is based on learning to reconstruct masked normal inputs (e.g. patches, future frames, etc.) and using the magnitude of the reconstruction error as an indicator of the abnormality level. Unlike other reconstruction-based methods, we present a novel self-supervised masked convolutional transformer block (SSMCTB) that incorporates the reconstruction-based functionality at a core architectural level. The proposed self-supervised block is extremely flexible, enabling information masking at any layer of a neural network and being compatible with a wide range of neural architectures. In this work, we extend our previous self-supervised predictive convolutional attentive block (SSPCAB) with a 3D masked convolutional layer, a transformer for channel-wise attention, as well as a novel self-supervised objective based on Huber loss. Furthermore, we show that our block is applicable to a wider variety of tasks, adding anomaly detection in medical images and thermal videos to the previously considered tasks based on RGB images and surveillance videos. We demonstrate the generality and flexibility of SSMCTB by integrating it into multiple state-of-the-art neural models for anomaly detection, bringing forth empirical results that confirm considerable performance improvements on five benchmarks. We release our code and data as open source at: https://github.com/ristea/ssmctb.
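
The reconstruction-based idea mentioned in the abstract (mask part of the input, predict it from its context, and treat a large error as a sign of abnormality) can be illustrated with a toy sketch. This is not the SSMCTB implementation: the function names `huber` and `masked_anomaly_scores`, the 1D signal, and the neighbor-mean predictor are all assumptions made for illustration, with the neighbor mean standing in for a learned network. Only the use of a Huber-type penalty on the reconstruction error is taken from the abstract.

```python
def huber(residual, delta=1.0):
    """Huber penalty: quadratic for |residual| <= delta, linear beyond it."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def masked_anomaly_scores(signal, radius=2, delta=1.0):
    """Score each position by how poorly its masked value is predicted
    from the surrounding context (hypothetical stand-in for a learned model)."""
    n = len(signal)
    scores = []
    for i in range(n):
        # Mask position i: only its neighbors within `radius` are visible.
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        context = [signal[j] for j in range(lo, hi) if j != i]
        # Toy "reconstruction": the mean of the visible context.
        prediction = sum(context) / len(context)
        # Anomaly score: Huber penalty on the reconstruction error.
        scores.append(huber(signal[i] - prediction, delta))
    return scores


signal = [1.0, 1.0, 1.0, 1.0, 9.0, 1.0, 1.0, 1.0, 1.0]
scores = masked_anomaly_scores(signal)
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
```

In this toy signal the spike at index 4 receives the highest score because its masked value is far from what the context predicts. Its neighbors also receive smaller nonzero scores, since the spike contaminates their context.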

Mask2Anomaly: Mask Transformer for Universal Open-set Segmentation

Unmasking Anomalies in Road-Scene Segmentation

Maskomaly: Zero-Shot Mask Anomaly Segmentation

When Masked Image Modeling Meets Source-free Unsupervised Domain Adaptation: Dual-Level Masked Network for Semantic Segmentation

Masked-attention Mask Transformer for Universal Image Segmentation

Masked Transformer for image Anomaly Localization

Anomaly-Aware Semantic Segmentation by Leveraging Synthetic-Unknown Data

Self-Supervised Masked Convolutional Transformer Block for Anomaly Detection

IterMask2: Iterative Unsupervised Anomaly Segmentation via Spatial and Frequency Masking for Brain Lesions in MRI

OpenMask3D: Open-Vocabulary 3D Instance Segmentation

A Novel MAE-Based Self-Supervised Anomaly Detection and Localization Method

XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation

Maskformer with Improved Encoder-Decoder Module for Semantic Segmentation of Fine-Resolution Remote Sensing Images

Mask DINO: Towards a Unified Transformer-Based Framework for Object Detection and Segmentation

Mask2Map: Vectorized HD Map Construction Using Bird's Eye View Segmentation Masks

Mean Shift Mask Transformer for Unseen Object Instance Segmentation

Zero-Shot Dual-Path Integration Framework for Open-Vocabulary 3D Instance Segmentation

Where are the Masks: Instance Segmentation with Image-level Supervision

Self-Supervised Anomaly Detection from Anomalous Training Data via Iterative Latent Token Masking