Abstract: Unsupervised Anomalous Sound Detection (ASD) aims to design a generalizable method that can detect anomalies when only normal sounds are given. In this paper, Anomalous Sound Detection based on Diffusion Models (ASD-Diffusion) is proposed for ASD in real-world factories. First, the anomalies in acoustic features are reconstructed from their noise-corrupted features into an approximation of the normal pattern. Second, a post-processing anomaly filter algorithm is proposed to detect anomalies that deviate significantly from the original input after reconstruction. Furthermore, the denoising diffusion implicit model is introduced to accelerate inference by using a longer sampling interval in the denoising process. The proposed method is innovative in its application of diffusion models as a new scheme for ASD. On the development set of DCASE 2023 Challenge Task 2, the proposed method outperforms the baseline by 7.75%, demonstrating its effectiveness.

What problem does this paper attempt to address?

The paper addresses how to detect anomalous sounds (Anomalous Sound Detection, ASD) in industrial scenarios when only normal sound data are available. Specifically, the objective is a general method that detects anomalous sounds during machine operation without tuning the model's hyperparameters for each machine type. This setting is known as "first-shot" anomalous sound detection (first-shot ASD): only normal sound data are used for training, and the model must detect unseen anomalous sound patterns at test time.

### Main Problems and Challenges

1. **Lack of Labeled Data**: In practice, operating conditions are diverse and abnormal situations are atypical, so it is very difficult to collect sound data that fully covers abnormal patterns.
2. **First-shot Detection**: Anomalous sounds must be detected when only normal sound data are available, without adjusting the model's hyperparameters for each type of machine.
3. **High-Dimensional Time-Frequency Information Processing**: Audio signals contain complex, high-dimensional time-frequency information, and representing and processing it effectively is a challenge.

### Solutions

The paper proposes an anomalous sound detection method based on the diffusion model (ASD-Diffusion). The main innovations are:

1. **Application of the Diffusion Model**: The diffusion model is applied to anomalous sound detection for the first time. It learns the distribution of normal sounds by gradually adding noise to sound features and then reconstructing the clean features.
2. **Post-processing Anomaly Filtering Algorithm**: A post-processing anomaly filtering (AF) algorithm detects significant deviations between reconstructed samples and original samples in order to locate abnormal regions.
3. **Accelerating Inference Speed**: The denoising diffusion implicit model (DDIM) is introduced to accelerate inference by increasing the sampling interval while maintaining good detection performance.

### Method Overview

- **Forward Diffusion Process**: Gradually add noise to normal sound features to generate noisy features.
- **Reverse Denoising Process**: Train a neural network to predict the noise in the noisy features, then remove that noise step by step to reconstruct sound features close to normal.
- **Anomaly Detection**: Detect anomalous sounds by comparing the original sample with the reconstructed sample, for example by mean squared error or absolute error.
- **Post-processing**: Use the AF algorithm to further filter out abnormal regions and improve detection accuracy.

### Experimental Results

ASD-Diffusion is evaluated on Task 2 of the DCASE 2023 Challenge, where it improves performance by 7.75% over the baseline method. The gain is especially pronounced in the target domain, even though only a small amount of normal target-domain audio is provided.

### Conclusion

The paper demonstrates the effectiveness and potential of the diffusion model in anomalous sound detection, especially in the first-shot scenario. Future work will further explore unsupervised methods and aim for better anomaly localization.
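
The forward diffusion, strided sampling, and reconstruction-error steps in the method overview can be illustrated with a small NumPy sketch. This is not the paper's implementation: the linear noise schedule, the function names (`forward_noise`, `strided_timesteps`, `anomaly_score`), and the simple thresholding used in place of the AF algorithm are all assumptions made for illustration, and the trained denoising network is omitted entirely.

```python
import numpy as np


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Assumed linear noise schedule; the paper's schedule is not given here."""
    return np.linspace(beta_start, beta_end, num_steps)


def forward_noise(x0, t, alpha_bar, rng):
    """Forward diffusion in closed form:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, I).
    """
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps


def strided_timesteps(start_t, stride):
    """Descending timesteps visited by a DDIM-style sampler.

    A larger stride means fewer denoising steps and faster inference.
    """
    return list(range(start_t, 0, -stride)) + [0]


def anomaly_score(x, x_rec, threshold):
    """Hypothetical stand-in for the post-processing filter.

    Keeps only time-frequency cells whose squared reconstruction error
    exceeds `threshold` and averages the error over those cells.
    Returns (score, boolean mask of flagged cells).
    """
    err = (x - x_rec) ** 2
    mask = err > threshold
    if not mask.any():
        return 0.0, mask
    return float(err[mask].mean()), mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    betas = linear_beta_schedule(1000)
    alpha_bar = np.cumprod(1.0 - betas)

    # Toy "spectrogram" feature: 64 frequency bins x 32 frames.
    x0 = rng.standard_normal((64, 32))
    x_t, _ = forward_noise(x0, t=200, alpha_bar=alpha_bar, rng=rng)

    # A trained denoiser would walk these timesteps to reconstruct x0.
    print(strided_timesteps(200, 50))

    # With no denoiser available, score the noisy input as a placeholder.
    score, mask = anomaly_score(x0, x_t, threshold=1.0)
    print(score, int(mask.sum()))
```

In a full pipeline, the reconstruction passed to `anomaly_score` would come from running the trained noise-prediction network over the strided timesteps rather than from the noisy input itself; the threshold would also need to be chosen from normal data only, in keeping with the first-shot setting.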

ASD-Diffusion: Anomalous Sound Detection with Diffusion Models

Unveiling the Spatial-Temporal Dynamics: Diffusion-based Learning of Conditional Distribution for Range-Dependent Ocean Sound Speed Field Forecasting

A Diffusion-Based Framework for Multi-Class Anomaly Detection

DiffusionAD: Norm-guided One-step Denoising Diffusion for Anomaly Detection

First-Shot Unsupervised Anomalous Sound Detection With Unknown Anomalies Estimated by Metadata-Assisted Audio Generation

Anomaly detection using Diffusion-based methods

Unsupervised Anomalous Sound Detection for Machine Condition Monitoring Using Classification-Based Methods

DiffusionAD: Denoising Diffusion for Anomaly Detection

Anomaly sound detection of industrial devices by using teacher-student incremental continual learning

AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model

Self-supervised enhanced denoising diffusion for anomaly detection

Dynamic Addition of Noise in a Diffusion Model for Anomaly Detection

Noisy-ArcMix: Additive Noisy Angular Margin Loss Combined With Mixup Anomalous Sound Detection

GLAD: Towards Better Reconstruction with Global and Local Adaptive Diffusion Models for Unsupervised Anomaly Detection

Unsupervised industrial anomaly detection with diffusion models

Improving Anomalous Sound Detection via Low-Rank Adaptation Fine-Tuning of Pre-Trained Audio Models

Description and Discussion on DCASE 2023 Challenge Task 2: First-Shot Unsupervised Anomalous Sound Detection for Machine Condition Monitoring

Stream-based Active Learning for Anomalous Sound Detection in Machine Condition Monitoring

ImDiffusion: Imputed Diffusion Models for Multivariate Time Series Anomaly Detection

On Diffusion Modeling for Anomaly Detection

Diffusion Model for DAS-VSP Data Denoising