Robust Covariance Matrix Estimation for High-Dimensional Compositional Data with Application to Sales Data Analysis

Danning Li, Arun Srinivasan, Qian Chen, Lingzhou Xue
DOI: https://doi.org/10.1080/07350015.2022.2106990
2022-09-21
Journal of Business and Economic Statistics
Abstract: Compositional data arises in a wide variety of research areas when some form of standardization and composition is necessary. Estimating covariance matrices is of fundamental importance for high-dimensional compositional data analysis. However, existing methods require the restrictive Gaussian or sub-Gaussian assumption, which may not hold in practice. We propose a robust composition-adjusted thresholding covariance procedure based on Huber-type M-estimation to estimate the sparse covariance structure of high-dimensional compositional data. We introduce a cross-validation procedure to choose the tuning parameters of the proposed method. Theoretically, by assuming a bounded fourth moment condition, we obtain the rates of convergence and signal recovery property for the proposed method and provide the theoretical guarantees for the cross-validation procedure under the high-dimensional setting. Numerically, we demonstrate the effectiveness of the proposed method in simulation studies and also a real application to sales data analysis.
statistics & probability, social sciences, mathematical methods, economics
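
The abstract names three ingredients: a composition adjustment, Huber-type M-estimation, and thresholding. The following Python sketch illustrates how such pieces could fit together; it is not the authors' exact procedure. It assumes the composition adjustment is a centered log-ratio transform, that each covariance entry is a Huber location estimate of a product of columns minus the product of the Huber means, and that hard thresholding is applied to the off-diagonal entries. The function names and the parameters `tau` (robustification level) and `lam` (threshold) are hypothetical, and both are fixed by hand here, whereas the paper selects its tuning parameters by cross-validation.

```python
import numpy as np

def clr(X):
    """Centered log-ratio transform of strictly positive compositions (one per row)."""
    logX = np.log(X)
    return logX - logX.mean(axis=1, keepdims=True)

def huber_mean(x, tau, n_iter=100, tol=1e-10):
    """Huber-type M-estimate of location, computed by iteratively reweighted averaging."""
    mu = np.median(x)
    for _ in range(n_iter):
        r = x - mu
        # Weight is 1 for residuals within [-tau, tau] and tau/|r| beyond it.
        w = np.minimum(1.0, tau / np.maximum(np.abs(r), 1e-12))
        mu_new = np.sum(w * x) / np.sum(w)
        if abs(mu_new - mu) < tol:
            return mu_new
        mu = mu_new
    return mu

def robust_cov(Z, tau):
    """Entrywise robust covariance (assumed form): Huber mean of products minus product of Huber means."""
    p = Z.shape[1]
    m = np.array([huber_mean(Z[:, j], tau) for j in range(p)])
    S = np.empty((p, p))
    for j in range(p):
        for k in range(j, p):
            S[j, k] = S[k, j] = huber_mean(Z[:, j] * Z[:, k], tau) - m[j] * m[k]
    return S

def hard_threshold(S, lam):
    """Zero out off-diagonal entries with magnitude below lam; keep the diagonal unchanged."""
    T = np.where(np.abs(S) >= lam, S, 0.0)
    np.fill_diagonal(T, np.diag(S))
    return T
```

A typical call would chain the steps as `hard_threshold(robust_cov(clr(X), tau), lam)` for an `n × p` matrix `X` of positive compositions. In practice `tau` and `lam` would need to be tuned to the data rather than set by hand.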