Bulk Spectra of Truncated Sample Covariance Matrices

Subhroshekhar Ghosh, Soumendu Sundar Mukherjee, Himasish Talukdar
2024-09-05
Abstract: Determinantal Point Processes (DPPs), which originate from quantum and statistical physics, are known for modelling diversity. Recent research [Ghosh and Rigollet (2020)] has demonstrated that certain matrix-valued $U$-statistics (that are truncated versions of the usual sample covariance matrix) can effectively estimate parameters in the context of Gaussian DPPs and enhance dimension reduction techniques, outperforming standard methods like PCA in clustering applications. This paper explores the spectral properties of these matrix-valued $U$-statistics in the null setting of an isotropic design. These matrices may be represented as $X L X^\top$, where $X$ is a data matrix and $L$ is the Laplacian matrix of a random geometric graph associated to $X$. The main mathematically interesting twist here is that the matrix $L$ is dependent on $X$. We give complete descriptions of the bulk spectra of these matrix-valued $U$-statistics in terms of the Stieltjes transforms of their empirical spectral measures. The results and the techniques are in fact able to address a broader class of kernelised random matrices, connecting their limiting spectra to generalised Marčenko-Pastur laws and free probability.
Statistics Theory, Probability
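
The representation $X L X^\top$ mentioned in the abstract can be illustrated with a small numerical sketch. The following is not the authors' code; it assumes a $p \times n$ data matrix whose columns are i.i.d. standard Gaussian observations (one possible isotropic design), a hard distance threshold `r` defining the geometric graph, and a $1/n$ normalisation. The names `truncated_covariance` and `r`, the choice $r = \sqrt{2p}$, and the normalisation are all assumptions made for illustration only.

```python
import numpy as np

def truncated_covariance(X, r):
    """Return (X L X^T, L) for a p x n data matrix X (columns = observations).

    L is the Laplacian of the geometric graph that joins observations
    i and j whenever ||X_i - X_j|| <= r. The threshold r is a
    hypothetical parameter name used only in this sketch.
    """
    # Pairwise squared distances between columns, via the Gram matrix.
    sq = np.sum(X**2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    # Adjacency matrix of the geometric graph (no self-loops).
    A = (d2 <= r**2).astype(float)
    np.fill_diagonal(A, 0.0)
    # Graph Laplacian L = D - A; note that L depends on X through A.
    L = np.diag(A.sum(axis=1)) - A
    return X @ L @ X.T, L

rng = np.random.default_rng(0)
p, n = 20, 100
X = rng.standard_normal((p, n))   # assumed isotropic Gaussian design
r = np.sqrt(2 * p)                # assumption: E||X_i - X_j||^2 = 2p
M, L = truncated_covariance(X, r)

# Eigenvalues of the normalised matrix; the 1/n scaling is an assumption.
eigs = np.linalg.eigvalsh(M / n)
```

Because $L = D - A$, the product $X L X^\top$ equals the sum of $(X_i - X_j)(X_i - X_j)^\top$ over connected pairs $i < j$, which is why it can be read as a sample covariance matrix truncated to nearby pairs. A histogram of `eigs` for growing `p` and `n` gives an empirical view of the bulk spectrum.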