Test-Time Generative Augmentation for Medical Image Segmentation

Xiao Ma, Yuhui Tao, Yuhan Zhang, Zexuan Ji, Yizhe Zhang, Qiang Chen
2024-06-25
Abstract: In this paper, we propose a novel approach to enhance medical image segmentation during test time. Instead of employing hand-crafted transforms or functions on the input test image to create multiple views for test-time augmentation, we advocate for the utilization of an advanced domain-fine-tuned generative model (GM), e.g., Stable Diffusion (SD), for test-time augmentation. Given that the GM has been trained to comprehend and encapsulate comprehensive domain data knowledge, it is superior to segmentation models in terms of representing the data characteristics and distribution. Hence, by integrating the GM into test-time augmentation, we can effectively generate multiple views of a given test sample, aligning with the content and appearance characteristics of the sample and the related local data distribution. This approach renders the augmentation process more adaptable and resilient compared to conventional hand-crafted transforms. Comprehensive experiments conducted across three medical image segmentation tasks (nine datasets) demonstrate the efficacy and versatility of the proposed test-time generative augmentation (TTGA) in enhancing segmentation outcomes. Moreover, TTGA significantly improves pixel-wise error estimation, thereby facilitating the deployment of a more reliable segmentation system. Code will be released at: https://github.com/maxiao0234/TTGA.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The problem this paper addresses is how to improve both segmentation accuracy and pixel-level error estimation in medical image segmentation through test-time augmentation (TTA). Specifically, the authors propose test-time generative augmentation (TTGA), which uses an advanced generative model (such as Stable Diffusion) to generate, at test time, multiple new samples that are related to the original test sample but differ in appearance, thereby improving the segmentation result and providing more reliable error estimates.

### Main problems and solutions in the paper

1. **Problem background**:
   - Deep learning has achieved remarkable success in medical image segmentation, but most research focuses on model architecture design, data utilization, and training methods, with less attention paid to improvements at test time.
   - Test-time augmentation (TTA) and test-time model adaptation are two common test-time improvement methods, but they rely on predefined transformations or task-specific functions and lack flexibility and adaptability.
2. **Proposed methods**:
   - **TTGA**: Use a generative model (such as Stable Diffusion) to generate multiple new samples during testing. These samples have the same content as the original test sample but different appearances. Combining the segmentation results of these generated samples yields better segmentation quality and more accurate error estimates.
   - **Masked Null-Text Inversion**: A new diffusion inversion method that restricts modifications to randomly generated masked regions during the denoising process, so that the non-edited regions stay consistent with the original image.
3. **Experimental verification**:
   - The authors conducted extensive experiments on three medical image segmentation tasks (optic disc and cup segmentation, polyp segmentation, and skin lesion segmentation) to verify the effectiveness of TTGA.
   - The experimental results show that TTGA not only improves segmentation accuracy but also significantly improves pixel-level error estimation, making the segmentation system more reliable.

### Formula summary

- **Diffusion Probabilistic Model (DPM)**:
  \[ q(x_t \mid x_{t-1}) := \mathcal{N}(x_t; \sqrt{\alpha_t}\, x_{t-1}, (1 - \alpha_t) I) \]
  \[ p_\theta(x_{t-1} \mid x_t) := \mathcal{N}(x_{t-1}; \mu_\theta(x_t, t), \Sigma_\theta) \]
- **Classifier-free Guidance**, where \(c\) is the condition, \(\emptyset\) is the null condition, and \(\omega\) is the guidance scale:
  \[ \tilde{\epsilon}_t = \epsilon_\theta(x_t, t, \emptyset) + \omega \cdot (\epsilon_\theta(x_t, t, c) - \epsilon_\theta(x_t, t, \emptyset)) \]
- **Uncertainty Estimation (Entropy)**, where \(p_k\) is the predicted probability of class \(k\) among \(K\) classes:
  \[ H(p) = -\sum_{k=1}^{K} p_k \cdot \log_2(p_k) \]

### Conclusion

By introducing a generative model for test-time augmentation, TTGA can significantly improve the accuracy and reliability of medical image segmentation without changing the segmentation model's parameters, and it performs especially well in error estimation. This provides a new, flexible, and effective test-time augmentation strategy for medical image segmentation.
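
### Example: entropy over combined views

As a rough illustration of how the entropy-based uncertainty estimate can be applied to predictions from several views, the sketch below averages the per-view class probabilities for a single pixel and then computes the Shannon entropy in bits, \(H(p) = -\sum_k p_k \log_2 p_k\). The plain averaging step and the function names `average_views` and `entropy_bits` are assumptions made for illustration; this summary does not state how TTGA actually combines the views, so this should not be read as the authors' implementation.

```python
import math
from typing import List, Sequence


def average_views(view_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average class probabilities over views for one pixel.

    Assumption: views are combined by a plain mean; the paper's
    actual combination rule may differ.
    """
    n_views = len(view_probs)
    n_classes = len(view_probs[0])
    return [sum(view[k] for view in view_probs) / n_views for k in range(n_classes)]


def entropy_bits(p: Sequence[float]) -> float:
    """Shannon entropy in bits, treating 0 * log2(0) as 0."""
    return -sum(p_k * math.log2(p_k) for p_k in p if p_k > 0.0)


if __name__ == "__main__":
    # Three hypothetical views of one pixel, two classes (background, lesion).
    views = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]]
    p_mean = average_views(views)
    print("mean probabilities:", p_mean)
    print("entropy (bits):", entropy_bits(p_mean))
```

A pixel on which the views disagree ends up with a flatter averaged distribution and therefore a higher entropy, which is what makes the entropy map usable as a per-pixel error indicator.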