Abstract: Reference-free evaluation metrics for generative models have recently been studied in the machine learning community. As a reference-free metric, the VENDI score quantifies the diversity of generative models using matrix-based entropy from information theory. The VENDI score is usually computed through the eigendecomposition of an $n \times n$ kernel matrix for $n$ generated samples. However, due to the high computational cost of eigendecomposition for large $n$, the score is often computed on sample sizes limited to a few tens of thousands. In this paper, we explore the statistical convergence of the VENDI score and demonstrate that for kernel functions with an infinite feature map dimension, the evaluated score for a limited sample size may not converge to the matrix-based entropy statistic. We introduce an alternative statistic called the $t$-truncated VENDI statistic. We show that the existing Nyström method and the FKEA approximation method for the VENDI score will both converge to the defined truncated VENDI statistic given a moderate sample size. We perform several numerical experiments to illustrate the concentration of the empirical VENDI score around the truncated VENDI statistic and discuss how this statistic correlates with the visual diversity of image data.
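
As a rough illustration of the computation the abstract describes, the sketch below evaluates a VENDI-style score as the exponential of the entropy of the eigenvalues of the normalized kernel matrix $K/n$. The function names `gaussian_kernel` and `vendi_score`, the bandwidth `sigma`, and the eigenvalue cutoff `1e-12` are illustrative assumptions, not taken from the paper. The sketch also assumes a kernel normalized so that $k(x, x) = 1$, which makes the eigenvalues of $K/n$ sum to one.

```python
import numpy as np


def gaussian_kernel(x, sigma=1.0):
    """Gaussian kernel matrix for the rows of x (sigma is an assumed bandwidth)."""
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative distances from round-off
    return np.exp(-d2 / (2.0 * sigma ** 2))


def vendi_score(k):
    """exp(entropy) of the eigenvalues of K/n, assuming k(x, x) = 1 on the diagonal."""
    n = k.shape[0]
    eig = np.linalg.eigvalsh(k / n)  # full eigendecomposition: the costly step for large n
    eig = eig[eig > 1e-12]           # drop numerically zero eigenvalues (0 * log 0 = 0)
    return float(np.exp(-np.sum(eig * np.log(eig))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    samples = rng.normal(size=(200, 8))  # stand-in for embedded generated samples
    print(vendi_score(gaussian_kernel(samples, sigma=2.0)))
```

Under these assumptions, $n$ identical samples give a score of 1 and $n$ mutually orthogonal samples give a score of $n$. The call to `eigvalsh` on the full $n \times n$ matrix is the step whose cost limits the usable sample size.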

What problem does this paper attempt to address?

This paper studies the statistical convergence of the **VENDI score**, in particular the computational cost of evaluating it on large sample sets drawn from generative models. Specifically, it addresses the following problems:

1. **Statistical convergence with a finite sample size**:
   - The VENDI score is commonly used to evaluate the diversity of generative models, but computing it requires the eigendecomposition of an $n \times n$ kernel matrix, which becomes very expensive once the sample size exceeds a few tens of thousands.
   - The paper studies whether a finite sample can accurately estimate the population value of the VENDI score (its limit as the sample size goes to infinity) for kernels with finite-dimensional and infinite-dimensional feature maps.

2. **Introduction of the truncated VENDI statistic**:
   - For kernels with an infinite-dimensional feature map (such as the Gaussian kernel), the paper shows that the score computed from a limited sample may not converge to the population value. The authors therefore introduce a new statistic, the **$t$-truncated VENDI statistic**, which is computed from only the $t$ largest eigenvalues of the kernel covariance matrix.
   - The authors prove that the $t$-truncated VENDI statistic can be estimated from a moderate sample size, and that existing approximation methods (the Nyström method and the FKEA method) converge to this truncated statistic.

3. **Effectiveness of existing approximation methods**:
   - The paper examines how the Nyström and FKEA methods behave when estimating the VENDI score and checks whether they effectively estimate the $t$-truncated VENDI statistic.
   - Numerical experiments show that, at finite sample sizes, these approximation methods produce values close to the $t$-truncated VENDI statistic, which makes it a feasible alternative in practice.

### Summary

The core question of the paper is whether the VENDI score converges statistically when only a finite sample is available, especially for kernels with an infinite-dimensional feature map. To answer it, the authors propose the $t$-truncated VENDI statistic and support it with theoretical analysis and numerical experiments. This provides a new perspective and tool for evaluating the diversity of large-scale generative models.
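
The truncation idea summarized above can be sketched as follows. The text only states that the statistic uses the $t$ largest eigenvalues, so how those eigenvalues are turned back into a probability vector is an assumption here: this sketch spreads the leftover mass $1 - \sum_{i \le t} \lambda_i$ uniformly over the $t$ retained eigenvalues. The function name `truncated_vendi` is hypothetical, and the kernel is again assumed to satisfy $k(x, x) = 1$.

```python
import numpy as np


def truncated_vendi(k, t):
    """Sketch of a t-truncated VENDI-style score from a kernel matrix k.

    Assumption (not confirmed by the summary): the mass outside the top-t
    eigenvalues of K/n is redistributed uniformly over those t eigenvalues.
    """
    n = k.shape[0]
    eig = np.linalg.eigvalsh(k / n)[::-1]        # eigenvalues in descending order
    top = np.clip(eig[:t], 0.0, None)            # keep the t largest, clip round-off negatives
    top = top + (1.0 - top.sum()) / len(top)     # assumed uniform redistribution of leftover mass
    top = top[top > 1e-12]                       # 0 * log 0 is treated as 0
    return float(np.exp(-np.sum(top * np.log(top))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(300, 4))
    sq = np.sum(x ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (x @ x.T), 0.0)
    k = np.exp(-d2 / 2.0)                        # Gaussian kernel with unit bandwidth
    for t in (5, 20, 100):
        print(t, truncated_vendi(k, t))
```

With this normalization the score never exceeds $t$. This sketch still performs a full eigendecomposition for clarity; a practical estimator would obtain the top eigenvalues through an approximation such as the Nyström or FKEA methods mentioned above.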

On the Statistical Complexity of Estimating VENDI Scores from Empirical Data

Towards a Scalable Reference-Free Evaluation of Generative Models

The Vendi Score: A Diversity Evaluation Metric for Machine Learning

Conditional Vendi Score: An Information-Theoretic Approach to Diversity Evaluation of Prompt-based Generative Models

Cousins Of The Vendi Score: A Family Of Similarity-Based Diversity Metrics For Science And Machine Learning

Effectively Unbiased FID and Inception Score and where to find them

A Study on the Evaluation of Generative Models

An Interpretable Evaluation of Entropy-based Novelty of Generative Models

Estimators of Entropy and Information via Inference in Probabilistic Models

The Representation Jensen-Shannon Divergence

On the Distributed Evaluation of Generative Models

Rethinking FID: Towards a Better Evaluation Metric for Image Generation

No-reference image quality assessment through the von Mises distribution

A Bias-Variance-Covariance Decomposition of Kernel Scores for Generative Models

Feature Likelihood Divergence: Evaluating the Generalization of Generative Models Using Samples

Convergence of the Inexact Langevin Algorithm and Score-based Generative Models in KL Divergence

Analyzing Generative Models by Manifold Entropic Metrics

Improving Geo-diversity of Generated Images with Contextualized Vendi Score Guidance

VNE: An Effective Method for Improving Deep Representation by Manipulating Eigenvalue Distribution

Quality-Weighted Vendi Scores And Their Application To Diverse Experimental Design

Beyond FVD: Enhanced Evaluation Metrics for Video Generation Quality