Diffusion based Semantic Outlier Generation via Nuisance Awareness for Out-of-Distribution Detection

Suhee Yoon, Sanghyu Yoon, Hankook Lee, Ye Seul Sim, Sungik Choi, Kyungeun Lee, Hye-Seung Cho, Woohyung Lim
2024-08-27
Abstract: Out-of-distribution (OOD) detection, which determines whether a given sample is part of the in-distribution (ID), has recently shown promising results through training with synthetic OOD datasets. Nonetheless, existing methods often produce outliers that are considerably distant from the ID, showing limited efficacy for capturing subtle distinctions between ID and OOD. To address these issues, we propose a novel framework, Semantic Outlier generation via Nuisance Awareness (SONA), which notably produces challenging outliers by directly leveraging pixel-space ID samples through diffusion models. Our approach incorporates SONA guidance, providing separate control over semantic and nuisance regions of ID samples. Thereby, the generated outliers achieve two crucial properties: (i) they present explicit semantic-discrepant information, while (ii) maintaining various levels of nuisance resemblance with ID. Furthermore, the improved OOD detector training with SONA outliers facilitates learning with a focus on semantic distinctions. Extensive experiments demonstrate the effectiveness of our framework, achieving an impressive AUROC of 88% on near-OOD datasets, which surpasses the performance of baseline methods by a significant margin of approximately 6%.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses a weakness in near-out-of-distribution (near-OOD) detection: the outliers synthesized by existing methods lie too far from the in-distribution (ID) samples to capture subtle semantic differences, so detectors trained on them perform poorly on near-OOD inputs. Specifically, existing outlier-synthesis methods have the following problems:

1. **Generated outliers are too far from the ID**: The synthesized outliers often differ greatly from ID samples, so they carry little information about subtle semantic differences.
2. **High dependence on the generation target**: Existing methods are highly sensitive to the generation target (such as text prompts or blurry images), which makes their performance unstable.
3. **Difficulty capturing the subtle semantic differences of near-OOD samples**: Existing models are easily confused, especially by near-OOD samples whose backgrounds or structures resemble those of ID samples.

To address these problems, the paper proposes a new framework, **Semantic Outlier generation via Nuisance Awareness (SONA)**, which generates more challenging outliers by directly using ID samples in pixel space through diffusion models. These outliers contain explicit semantic differences while retaining nuisance characteristics similar to those of the original ID samples, which improves the model's performance on near-OOD detection.

### Main contributions of SONA:

1. **Generate more challenging outliers**: By directly using ID samples in pixel space, SONA generates outliers that contain semantic differences yet retain the nuisance characteristics of the ID samples.
2. **Fine-grained control of semantic and nuisance regions**: SONA uses the diffusion model to control the semantic and nuisance regions separately, so the generated outliers reflect semantic differences while keeping their nuisance characteristics similar to the ID.
3. **Significantly improve near-OOD detection performance**: Experimental results show that SONA performs significantly better than the baseline methods on near-OOD detection. On the ImageNet dataset in particular, the AUROC reaches 88.5%.

Through these improvements, SONA helps the model capture the subtle semantic differences in near-OOD samples more effectively, thereby improving its detection performance.
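The AUROC figures quoted above measure how reliably a detector ranks OOD samples above ID samples. The following is a minimal sketch of that metric, not code from the paper: the function name `auroc`, the convention that a higher score means "more likely OOD", and all score values are assumptions made up for illustration. It shows why near-OOD samples, whose scores overlap with ID scores, yield a lower AUROC than far-OOD samples.

```python
def auroc(id_scores, ood_scores):
    """Pairwise AUROC: the probability that a randomly chosen OOD sample
    receives a higher score than a randomly chosen ID sample.
    Ties count as half a win. Assumes higher score = more likely OOD."""
    wins = 0.0
    for ood in ood_scores:
        for ind in id_scores:
            if ood > ind:
                wins += 1.0
            elif ood == ind:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


# Hypothetical detector scores, invented for this example only.
id_scores = [0.10, 0.20, 0.35, 0.40]
far_ood_scores = [0.80, 0.90, 0.95, 0.99]   # clearly separated from ID
near_ood_scores = [0.30, 0.38, 0.50, 0.60]  # partly overlapping with ID

far_auroc = auroc(id_scores, far_ood_scores)
near_auroc = auroc(id_scores, near_ood_scores)

print(f"far-OOD AUROC:  {far_auroc:.4f}")
print(f"near-OOD AUROC: {near_auroc:.4f}")
```

With these made-up scores, the far-OOD set is perfectly separable while the near-OOD set is not, which mirrors the difficulty the summary describes. The quadratic pairwise loop is chosen for readability; a rank-based computation would be preferable for large evaluation sets.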