Abstract: This paper proposes a novel large-dimensional positive definite covariance estimator for high-frequency data under a general factor model framework. We demonstrate an appealing connection between the proposed estimator and a weighted group least absolute shrinkage and selection operator (LASSO) penalized least-squares estimator. The proposed estimator improves on traditional principal component analysis by allowing for weak factors, whose signal strengths are weak relative to the idiosyncratic components. Despite the presence of microstructure noise and asynchronous trading, the proposed estimator achieves guarded positive definiteness without sacrificing the convergence rate. To make our method fully operational, we provide an extended simultaneous alternating direction method of multipliers algorithm that efficiently solves the resulting constrained convex minimization problem. Empirically, we study the monthly high-frequency covariance structure of the stock constituents of the S&P 500 index from 2008 to 2016, using all traded stocks from the NYSE, AMEX, and NASDAQ stock markets to construct the high-frequency Fama-French four and extended eleven economic factors. We further examine the out-of-sample performance of the proposed method through vast portfolio allocations, which deliver significantly reduced out-of-sample portfolio risk and enhanced Sharpe ratios. The success of our method supports the usefulness of machine learning techniques in finance.
This paper was accepted by Agostino Capponi, finance.
Funding: This work was supported by the Research Grants Council, University Grants Committee [Grants 11500119, 11505522, 11505721, and 21504818] and the National Natural Science Foundation of China (NSFC) Basic Scientific Center Project [Grant 71988101], entitled “Econometric Modelling and Economic Policy Studies”, as well as NSFC [Grants 71803166 and 72173104].
Supplemental Material: The online appendix and data files are available at https://doi.org/10.1287/mnsc.2022.04138.
