Image Generation Diversity Issues and How to Tame Them

Mischa Dombrowski, Weitong Zhang, Sarah Cechnicka, Hadrien Reynaud, Bernhard Kainz
2024-11-25
Abstract: Generative methods now produce outputs nearly indistinguishable from real data but often fail to fully capture the data distribution. Unlike quality issues, diversity limitations in generative models are hard to detect visually, requiring specific metrics for assessment. In this paper, we draw attention to the current lack of diversity in generative models and the inability of common metrics to measure this. We achieve this by framing diversity as an image retrieval problem, where we measure how many real images can be retrieved using synthetic data as queries. This yields the Image Retrieval Score (IRS), an interpretable, hyperparameter-free metric that quantifies the diversity of a generative model's output. IRS requires only a subset of synthetic samples and provides a statistical measure of confidence. Our experiments indicate that current feature extractors commonly used in generative model assessment are inadequate for evaluating diversity effectively. Consequently, we perform an extensive search for the best feature extractors to assess diversity. Evaluation reveals that current diffusion models converge to limited subsets of the real distribution, with no current state-of-the-art models surpassing 77% of the diversity of the training data. To address this limitation, we introduce Diversity-Aware Diffusion Models (DiADM), a novel approach that improves diversity of unconditional diffusion models without loss of image quality. We do this by disentangling diversity from image quality, using a diversity-aware module that takes pseudo-unconditional features as input. We provide a Python package offering unified feature extraction and metric computation to further facilitate the evaluation of generative models: https://github.com/MischaD/beyondfid
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The problem that this paper attempts to solve is the lack of diversity in generative models. Although current generative methods can produce outputs that are almost indistinguishable from real data, they often fail to capture the full diversity of the data distribution. Unlike quality-related problems, diversity limitations in generative models are difficult to detect visually and require specific metrics for evaluation. The paper therefore focuses on the lack of diversity in current generative models and on the inability of existing metrics to measure it, and proposes a new metric, the Image Retrieval Score (IRS), to quantify the diversity of generative model outputs.

### Main contributions:

1. **Reveal the limitations of existing feature extractors**: The paper points out that the feature extractors currently used to evaluate generative models perform poorly on real datasets and cannot effectively measure diversity.
2. **Propose the IRS metric**: IRS frames diversity as an image retrieval problem and evaluates how well synthetic data can retrieve real images, providing a hyperparameter-free metric with a statistical measure of confidence.
3. **Introduce the Diversity-Aware Diffusion Model (DiADM)**: To improve the diversity of unconditional diffusion models without sacrificing image quality, the paper proposes a new method, DiADM. It separates diversity from image quality by feeding pseudo-unconditional features into a diversity-aware module.

### Method overview:

- **Image retrieval problem**: Transform the problem of evaluating the diversity of generative models into an image retrieval problem, and measure diversity by counting the real images that can be retrieved with synthetic images as queries.
- **IRS calculation**: Define IRS as a metric for the diversity of generative models, and evaluate diversity by calculating how many training images the generated synthetic images correspond to.
- **Adjust the measurement gap**: Feature extractors perform imperfectly even on real data, which creates a measurement gap. The paper proposes an adjustment step that corrects the IRS of synthetic data by normalizing it with the IRS of real data.

### Experimental results:

- **Feature extractor selection**: Experiments show that existing feature extractors have significant measurement gaps when measuring diversity, and the paper addresses this problem through the adjustment step.
- **Effectiveness of DiADM**: The experimental results show that DiADM can significantly improve the diversity of generative models without sacrificing image quality.

### Conclusion:

By proposing IRS and DiADM, this paper provides new ideas and tools for evaluating and improving the diversity of generative models, which helps generative models balance diversity and quality.
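
The retrieval idea described in the method overview can be illustrated with a small sketch. This is not the paper's IRS estimator and not the API of the linked package. It assumes that features have already been extracted, that retrieval means nearest neighbor under Euclidean distance, and that the adjustment is a plain ratio against held-out real queries. The function names `retrieved_fraction` and `adjusted_fraction` are hypothetical.

```python
import numpy as np


def retrieved_fraction(real_feats: np.ndarray, query_feats: np.ndarray) -> float:
    """Fraction of distinct real images that are the nearest neighbor of at least one query."""
    # Pairwise squared Euclidean distances, shape (n_queries, n_real).
    # Broadcasting keeps the sketch short; it is only suitable for small arrays.
    diffs = query_feats[:, None, :] - real_feats[None, :, :]
    dists = (diffs ** 2).sum(axis=-1)
    # Index of the closest real image for every query.
    nearest = dists.argmin(axis=1)
    return np.unique(nearest).size / real_feats.shape[0]


def adjusted_fraction(
    real_feats: np.ndarray,
    real_query_feats: np.ndarray,
    synth_query_feats: np.ndarray,
) -> float:
    """Synthetic score divided by the score of held-out real queries.

    The ratio is an assumed form of the normalization, chosen for illustration.
    """
    reference = retrieved_fraction(real_feats, real_query_feats)
    return retrieved_fraction(real_feats, synth_query_feats) / reference


# Toy 1-D "features": four real images and three queries of each kind.
real = np.array([[0.0], [1.0], [2.0], [3.0]])
synthetic = np.array([[0.1], [0.2], [2.9]])   # two queries collapse onto the same real image
held_out = np.array([[0.05], [1.1], [2.05]])  # stand-in for real test images

print(retrieved_fraction(real, synthetic))
print(adjusted_fraction(real, held_out, synthetic))
```

On the toy data, the synthetic queries retrieve 2 of the 4 real images and the held-out real queries retrieve 3 of 4, so the adjusted value is their ratio. A generator that keeps returning to the same few training images scores low no matter how realistic each individual sample looks, which is the failure mode the summary describes as hard to detect visually.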