Abstract: This paper proposes a novel large-dimensional positive definite covariance estimator for high-frequency data under a general factor model framework. We demonstrate an appealing connection between the proposed estimator and a weighted group least absolute shrinkage and selection operator (LASSO) penalized least-squares estimator. The proposed estimator improves on traditional principal component analysis by allowing for weak factors, whose signal strengths are weak relative to idiosyncratic components. Despite the presence of microstructure noise and asynchronous trading, the proposed estimator achieves guarded positive definiteness without sacrificing the convergence rate. To make our method fully operational, we provide an extended simultaneous alternating direction method of multipliers algorithm to solve the resultant constrained convex minimization problem efficiently. Empirically, we study the monthly high-frequency covariance structure of the stock constituents of the S&P 500 index from 2008 to 2016, using all traded stocks from the NYSE, AMEX, and NASDAQ stock markets to construct the high-frequency Fama-French four and extended eleven economic factors. We further examine the out-of-sample performance of the proposed method through vast portfolio allocations, which deliver significantly reduced out-of-sample portfolio risk and enhanced Sharpe ratios. The success of our method supports the usefulness of machine learning techniques in finance.
This paper was accepted by Agostino Capponi, finance.
Funding: This work was supported by the Research Grants Council, University Grants Committee [Grants 11500119, 11505522, 11505721, and 21504818]; the National Natural Science Foundation of China (NSFC) Basic Scientific Center Project [Grant 71988101], entitled “Econometric Modelling and Economic Policy Studies”; and NSFC [Grants 71803166 and 72173104].
Supplemental Material: The online appendix and data files are available at https://doi.org/10.1287/mnsc.2022.04138.

Assessing multivariate predictors of financial market movements: A latent factor framework for ordinal data

Estimating Latent Asset-Pricing Factors

Dynamic Factor Models for Multivariate Count Data: An Application to Stock-Market Trading Activity

Latent Factor Analysis in Short Panels

A Regularized High-Dimensional Positive Definite Covariance Estimator with High-Frequency Data

Factor Models for Portfolio Selection in Large Dimensions: The Good, the Better and the Ugly

Large-dimensional factor modeling based on high-frequency observations

Eigenvalue tests for the number of latent factors in short panels

Factors That Fit the Time Series and Cross-Section of Stock Returns

Multivariate ordinal regression for multiple repeated measurements

Discriminative conditional restricted Boltzmann machine for discrete choice and latent variable modelling

Huber Principal Component Analysis for Large-dimensional Factor Models

Robust Nearly-Efficient Estimation of Large Panels with Factor Structures

Generalized dynamic factor models and volatilities: recovering the market volatility shocks

LLMFactor: Extracting Profitable Factors through Prompts for Explainable Stock Movement Prediction

State-Varying Factor Models of Large Dimensions

Testing Covariates in High Dimension Linear Regression with Latent Factors

On estimating covariances between many assets with histories of highly variable length

One Factor to Bind the Cross-Section of Returns

Path and Direction Discovery in Individual Dynamic Factor Models: A Regularized Hybrid Unified Structural Equation Modeling with Latent Variable

Robust Multiple Testing under High-dimensional Dynamic Factor Model