Abstract: We propose a novel nonparametric kernel-based estimator of cross-sectional conditional mean and covariance matrices for large unbalanced panels. We show its consistency and provide finite-sample guarantees. In an empirical application, we estimate conditional mean and covariance matrices for a large unbalanced panel of monthly stock excess returns given macroeconomic and firm-specific covariates from 1962 to 2021. The estimator performs well with respect to statistical measures. It is informative for empirical asset pricing, generating conditional mean-variance efficient portfolios with substantial out-of-sample Sharpe ratios far beyond equal-weighted benchmarks.
### What problem does this paper attempt to solve?
This paper addresses the joint estimation of conditional mean and covariance matrices for large unbalanced panels in financial economics. Specifically, the authors propose a novel nonparametric kernel-based estimator of cross-sectional conditional mean and covariance matrices. They show that the estimator is consistent, provide finite-sample guarantees, and demonstrate its use in empirical asset pricing.
#### Main problem background:
1. **Unbalanced panel data**: Financial data are usually unbalanced, that is, the number of assets observed can differ from one time period to the next. This complicates statistical inference.
2. **High-dimensional data**: As the number of assets grows, the dimension of the data becomes very high, which makes traditional linear models and parametric methods difficult to apply effectively.
3. **Joint estimation of conditional mean and covariance matrix**: Most existing methods focus on either the covariance matrix or the conditional mean, and few estimate both jointly.
#### Main contributions of the paper:
1. **Nonparametric kernel method**: A nonparametric, kernel-based method is proposed that estimates the conditional mean and covariance matrix jointly.
2. **Consistency and finite-sample guarantees**: The estimator is shown to be consistent, and finite-sample performance guarantees are provided.
3. **Empirical application**: The method is applied to a large unbalanced panel of monthly stock excess returns from 1962 to 2021, conditioned on macroeconomic and firm-specific covariates. The resulting conditional mean-variance efficient portfolios achieve substantial out-of-sample Sharpe ratios, far exceeding those of equal-weighted benchmarks.
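As a hedged illustration of the portfolio step in the empirical application, the sketch below computes mean-variance efficient weights proportional to \(\Sigma^{-1}\mu\) from a given conditional mean vector and covariance matrix, and compares the resulting ex-ante Sharpe ratio with that of an equal-weighted portfolio. The function names `mve_weights` and `sharpe_ratio`, the unit-gross-exposure normalization, and the toy numbers are assumptions for illustration only; they are not taken from the paper.

```python
import numpy as np

def mve_weights(mu, sigma):
    """Mean-variance efficient weights proportional to sigma^{-1} mu.

    Scaled so that absolute weights sum to one (an illustrative choice).
    """
    raw = np.linalg.solve(sigma, mu)  # solves sigma @ raw = mu
    return raw / np.abs(raw).sum()

def sharpe_ratio(weights, mu, sigma):
    """Ex-ante Sharpe ratio of a portfolio of excess returns."""
    return (weights @ mu) / np.sqrt(weights @ sigma @ weights)

# Toy inputs (made up): conditional mean and covariance for three assets.
mu = np.array([0.010, 0.006, 0.008])
sigma = np.array([
    [0.0400, 0.0060, 0.0040],
    [0.0060, 0.0225, 0.0030],
    [0.0040, 0.0030, 0.0300],
])

w_mve = mve_weights(mu, sigma)
w_eq = np.full(3, 1.0 / 3.0)  # equal-weighted benchmark

sr_mve = sharpe_ratio(w_mve, mu, sigma)
sr_eq = sharpe_ratio(w_eq, mu, sigma)
# Largest attainable ex-ante Sharpe ratio: sqrt(mu' sigma^{-1} mu).
sr_max = np.sqrt(mu @ np.linalg.solve(sigma, mu))

print(f"MVE Sharpe: {sr_mve:.4f}, equal-weighted Sharpe: {sr_eq:.4f}")
```

The Sharpe ratio does not depend on how the weights are scaled, so the normalization only affects leverage. The out-of-sample figures reported in the paper depend on the estimated moments and are not reproduced by this toy example.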
#### Formula summary:
- Conditional mean and covariance matrix:
\[
E_t[x_{t + 1,i}]=\mu(z_{t,i}),\quad E_t[x_{t + 1,i}x_{t + 1,j}]=q(z_{t,i},z_{t,j})
\]
so that the conditional covariance is
\[
\text{Cov}_t[x_{t + 1,i},x_{t + 1,j}]=q(z_{t,i},z_{t,j})-\mu(z_{t,i})\mu(z_{t,j})
\]
- Definition of kernel function \(q\):
\[
q(z,z') = q_{sy}(z,z')+q_{id}(z,z')
\]
where the systematic component \(q_{sy}(z,z')\) and the idiosyncratic component \(q_{id}(z,z')\) are respectively expressed as:
\[
q_{sy}(z,z')=\langle h_{sy}(z),h_{sy}(z')\rangle_C,\quad q_{id}(z,z')=\|h_{id}(z)\|_C^2\,\mathbf{1}_{z = z'}
\]
- Regularized loss function:
\[
R(h,\xi_t)=L(h,\xi_t)+\lambda_{sy}\|h_{sy}\|^2_{H_{sy}}+\lambda_{id}\|h_{id}\|^2_{H_{id}}
\]
where \(L(h,\xi_t)\) is the loss function, and \(\lambda_{sy}\) and \(\lambda_{id}\) are regularization parameters.
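The summary does not spell out the loss \(L\), so the following is only a finite-dimensional sketch of how such a regularized objective could be evaluated. It assumes a squared-error loss between realized products \(x_i x_j\) and \(q(z_i,z_j)\), linear feature maps \(h_{sy}(z)=A_{sy}\phi(z)\) and \(h_{id}(z)=b_{id}^\top\phi(z)\), and squared Euclidean norms in place of the norms on \(H_{sy}\) and \(H_{id}\). All names (`second_moment_matrix`, `regularized_loss`, `A_sy`, `b_id`, `Phi`) are hypothetical.

```python
import numpy as np

def second_moment_matrix(A_sy, b_id, Phi):
    """q(z_i, z_j) for one cross-section with feature rows Phi (N x p).

    Systematic part: <h_sy(z_i), h_sy(z_j)> with h_sy(z) = A_sy @ phi(z).
    Idiosyncratic part: h_id(z_i)^2 on the diagonal, h_id(z) = b_id @ phi(z).
    Placing it on the diagonal assumes distinct assets have distinct z.
    """
    H_sy = Phi @ A_sy.T          # N x m, row i is h_sy(z_i)
    h_id = Phi @ b_id            # length N
    return H_sy @ H_sy.T + np.diag(h_id ** 2)

def regularized_loss(A_sy, b_id, Phi, x, lam_sy, lam_id):
    """Assumed squared-error loss plus the two ridge-type penalties."""
    Q = second_moment_matrix(A_sy, b_id, Phi)
    resid = np.outer(x, x) - Q
    loss = np.mean(resid ** 2)   # averaged over all N^2 asset pairs
    penalty = lam_sy * np.sum(A_sy ** 2) + lam_id * np.sum(b_id ** 2)
    return loss + penalty

# Toy data: two periods with different numbers of assets (unbalanced panel).
rng = np.random.default_rng(0)
p, m = 3, 2
A_sy = 0.1 * rng.normal(size=(m, p))
b_id = 0.1 * rng.normal(size=p)

for n_assets in (5, 8):
    Phi = rng.normal(size=(n_assets, p))
    x = 0.05 * rng.normal(size=n_assets)
    value = regularized_loss(A_sy, b_id, Phi, x, lam_sy=1e-2, lam_id=1e-2)
    print(n_assets, value)
```

Because the loss is averaged over asset pairs within each period, the same function can be applied to cross-sections of any size. This is one simple way to accommodate an unbalanced panel, not necessarily the paper's.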
Through these formulas and methods, the paper provides an effective framework for estimating conditional mean and covariance matrices in large unbalanced panels.
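To make the formula summary concrete, the sketch below assembles a conditional mean vector and covariance matrix for a cross-section from the relations \(q = q_{sy} + q_{id}\) and \(\text{Cov}_t = q - \mu\mu^\top\). The functions `mu`, `h_sy`, and `h_id` are toy stand-ins chosen for illustration, not estimates or functional forms from the paper, and the indicator \(\mathbf{1}_{z=z'}\) is implemented as a diagonal under the assumption that distinct assets have distinct characteristics.

```python
import numpy as np

def mu(Z):
    """Toy conditional mean function (hypothetical)."""
    return 0.01 * np.tanh(Z[:, 0])

def h_sy(Z):
    """Toy systematic feature map into R^2 (hypothetical)."""
    return np.stack([0.05 * np.ones(Z.shape[0]), 0.03 * Z[:, 1]], axis=1)

def h_id(Z):
    """Toy idiosyncratic scale (hypothetical)."""
    return 0.08 + 0.02 * np.abs(Z[:, 0])

def conditional_moments(Z):
    """Return (mean vector, covariance matrix) for characteristics Z (N x d)."""
    m = mu(Z)
    H = h_sy(Z)
    # Second-moment matrix: systematic Gram matrix plus idiosyncratic diagonal.
    Q = H @ H.T + np.diag(h_id(Z) ** 2)
    # Covariance: second moments minus the outer product of the means.
    return m, Q - np.outer(m, m)

# Two periods with different numbers of observed assets.
rng = np.random.default_rng(0)
Z_a = rng.normal(size=(4, 2))
Z_b = rng.normal(size=(7, 2))

mean_a, cov_a = conditional_moments(Z_a)
mean_b, cov_b = conditional_moments(Z_b)
print(cov_a.shape, cov_b.shape)
```

Since the moments are functions of the characteristics alone, the same fitted functions yield a 4 x 4 matrix in one period and a 7 x 7 matrix in another. In this toy setup the covariance matrices happen to be positive definite, but the sketch itself does nothing to enforce that.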