Denoising Diffusion Probabilistic Models for Generation of Realistic Fully-Annotated Microscopy Image Data Sets

Dennis Eschweiler, Rüveyda Yilmaz, Matisse Baumann, Ina Laube, Rijo Roy, Abin Jose, Daniel Brückner, Johannes Stegmaier
2023-08-08
Abstract: Recent advances in computer vision have led to significant progress in the generation of realistic image data, with denoising diffusion probabilistic models proving to be a particularly effective method. In this study, we demonstrate that diffusion models can effectively generate fully-annotated microscopy image data sets through an unsupervised and intuitive approach, using rough sketches of desired structures as the starting point. The proposed pipeline helps to reduce the reliance on manual annotations when training deep learning-based segmentation approaches and enables the segmentation of diverse datasets without the need for human annotations. This approach holds great promise in streamlining the data generation process and enabling a more efficient and scalable training of segmentation models, as we show in the example of different practical experiments involving various organisms and cell types.
Image and Video Processing, Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
The paper attempts to address the problem of generating realistic, fully annotated microscopy image datasets to reduce the reliance on manually labeled data when training deep learning-based segmentation models. Specifically, the authors propose a method to automatically generate these datasets using Denoising Diffusion Probabilistic Models (DDPM). This approach allows for the creation of diverse datasets without extensive manual annotation work, thereby improving the efficiency and scalability of the data generation process.

### Main Issues and Solutions:

1. **Issue**: Existing segmentation methods typically require a large amount of manually labeled data, which is both time-consuming and expensive, limiting the widespread adoption of deep learning methods in practical applications.
2. **Solution**: The authors propose a DDPM-based pipeline that automatically generates realistic, fully annotated microscopy image datasets using rough structural sketches as a starting point. This method not only reduces the dependence on manual annotation but also generates high-quality image data for training segmentation models.

### Specific Method:

- **Forward Process**: Gradually add noise to real microscopy images until reaching a pure noise state.
- **Backward Process**: Train a neural network to progressively reverse the forward process, generating realistic image data from pure noise.
- **Optimization**: Adjust parameters \( t_{\text{start}} \) and Gaussian smoothing standard deviation \( \sigma \) to ensure the generated data maintains high realism and structural relevance.

### Experimental Validation:

- **Dataset**: Experiments were conducted using publicly available 3D microscopy image datasets to evaluate the realism and segmentation performance of the generated data.
- **Evaluation Metrics**: Peak Signal-to-Noise Ratio (PSNR) and Zero-Mean Normalized Cross-Correlation (ZNCC) were used to assess the quality of the generated data.
- **Segmentation Performance**: The Cellpose model was trained using the generated data for segmentation and compared with models trained using a large amount of manually labeled data. The results showed that the generated data could achieve similar segmentation performance.

### Conclusion:

The proposed method excels in generating realistic, fully annotated microscopy image datasets, significantly reducing the need for manually labeled data and improving the efficiency and scalability of data generation. This opens up new possibilities for applying deep learning methods in biomedical image analysis.
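
To make the forward process and the two evaluation metrics described above more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the standard closed-form DDPM noising step \( x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon \) with a linear noise schedule over 1000 steps (values taken from the original DDPM formulation, not from this summary) and images scaled to \([0, 1]\). The function names `make_alpha_bar`, `forward_noise`, `psnr`, and `zncc` are hypothetical. The trained denoising network of the backward process and the Gaussian smoothing with \( \sigma \) are deliberately left out.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def forward_noise(x0, t, alpha_bar, rng):
    """Closed-form forward step: sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps


def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    mse = np.mean((reference - test) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))


def zncc(a, b):
    """Zero-mean normalized cross-correlation, in [-1, 1]."""
    a0 = a - a.mean()
    b0 = b - b.mean()
    denom = np.sqrt(np.sum(a0 ** 2) * np.sum(b0 ** 2))
    return float(np.sum(a0 * b0) / denom) if denom > 0 else 0.0


# Toy "sketch": a bright square on a dark background (illustrative only).
rng = np.random.default_rng(0)
sketch = np.zeros((64, 64))
sketch[16:48, 16:48] = 1.0

alpha_bar = make_alpha_bar()
scores = {}
for t_start in (100, 400, 900):
    noisy = forward_noise(sketch, t_start, alpha_bar, rng)
    scores[t_start] = (psnr(sketch, noisy), zncc(sketch, noisy))
    print(t_start, scores[t_start])
```

In this toy setting, a small \( t_{\text{start}} \) leaves the noised sketch strongly correlated with the original structure, while a large \( t_{\text{start}} \) approaches pure noise. This illustrates the trade-off between structural relevance and freedom for the backward process that the choice of \( t_{\text{start}} \) controls.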