T. Hoshi, R. Takayama, Y. Iguchi, T. Fujiwara
Abstract: Several methods are constructed for large-scale electronic structure calculations. Test calculations are carried out with up to 10^7 atoms. As an application, the cleavage process of silicon is investigated by molecular dynamics simulation of 10-nm-scale systems. In addition to the elementary formation process of the (111)-(2 x 1) surface, we obtain nanoscale defects, namely step formation and bending of the cleavage path into favored (experimentally observed) planes. These results are consistent with experiments. Moreover, the simulation predicts an explicit step structure on the cleaved surface, which shows a bias-dependent STM image.
What problem does this paper attempt to address?
This paper addresses sufficient dimension reduction (SDR) in regression analysis. It proposes a new approach, based on contour directions, for estimating the central subspace. Specifically, it introduces two methods, Simple Contour Regression (SCR) and General Contour Regression (GCR), which achieve dimension reduction by estimating the directions along which the response variable changes little. Compared with existing SDR techniques, the contour regression methods recover the full central subspace when the predictors follow an elliptical distribution, are √n-consistent (the estimation error shrinks at the rate 1/√n), and are simple to compute. They also remain robust when the predictor distribution deviates from ellipticity.
### Main contributions of the paper:
1. **Propose new dimension reduction methods**: Introduce dimension reduction methods based on contour directions, including SCR and GCR. These methods achieve dimension reduction by estimating the directions in which the response variable changes less.
2. **Theoretical properties**: Prove that, under the assumption of an elliptical distribution, SCR and GCR recover the full central subspace and are √n-consistent.
3. **Robustness**: Demonstrate the robustness of the contour regression methods when the distribution of predictor variables deviates from the elliptical distribution.
4. **Performance comparison**: Compare the performance of SCR and GCR with other commonly used SDR methods (such as ordinary least squares, sliced inverse regression, principal Hessian directions, and sliced average variance estimation) through simulation experiments, and verify the advantages of the new methods.
5. **Practical application**: Demonstrate the practical application of the contour regression methods through a data set on soil evaporation.
### Key concepts:
- **Central Subspace**: The smallest subspace of the predictor space such that the response variable \(Y\) is conditionally independent of the predictor variables \(X\) given the projection of \(X\) onto that subspace. It is denoted \(S_{Y|X}\).
- **Contour Directions**: The directions along which the response variable changes little. These directions span the orthogonal complement of the central subspace.
- **√n-Consistency**: The error of the estimator shrinks at the rate 1/√n as the sample size \(n\) increases, the standard parametric rate in statistical estimation.
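One common way to write the √n-consistency property in symbols is shown below. The notation is an assumption made for illustration and does not come from the summary: \(\hat{\theta}_n\) is an estimator computed from \(n\) observations, \(\theta\) is its target, and \(O_P(1)\) means bounded in probability.
\[
\sqrt{n}\,\bigl(\hat{\theta}_n - \theta\bigr) = O_P(1)
\quad\Longleftrightarrow\quad
\hat{\theta}_n - \theta = O_P\!\left(n^{-1/2}\right)
\]
Read this way, quadrupling the sample size roughly halves the typical estimation error.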
### Mathematical formulas:
- **Definition of the central subspace**:
\[
Y \perp \perp X \mid \eta^T X
\]
where \(\eta\) is a \(p\times d\) matrix with \(d\leq p\), and \(\eta^T X\) is the vector of \(d\) linear combinations of \(X\) defined by the columns of \(\eta\).
- **Contour direction matrix \(K(c)\)**:
\[
K(c) = \mathbb{E}[(\tilde{X} - X)(\tilde{X} - X)^T \mid |\tilde{Y} - Y| \leq c]
\]
where \((\tilde{X}, \tilde{Y})\) is an independent copy of \((X, Y)\), and \(c\) is a threshold parameter.
- **Assumption 2.1**:
\[
\text{var}[w^T(\tilde{X} - X) \mid |\tilde{Y} - Y| \leq c] > \text{var}[v^T(\tilde{X} - X) \mid |\tilde{Y} - Y| \leq c]
\]
for unit vectors \(v\in S_{Y|X}\) (the central subspace) and \(w\in (S_{Y|X})^\perp\), that is, \(\|v\|=\|w\| = 1\).
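The formulas above suggest a simple estimation recipe, sketched here in Python with NumPy. This is an illustrative sketch, not the paper's reference implementation. The function name `simple_contour_regression`, the choice to standardize the predictors first, and the simulated data are all assumptions. The idea is to average \((X_i - X_j)(X_i - X_j)^T\) over the pairs with \(|Y_i - Y_j| \leq c\) to obtain an empirical \(K(c)\). By Assumption 2.1 the central subspace carries the smallest variance, so the sketch keeps the eigenvectors with the \(d\) smallest eigenvalues.

```python
import numpy as np


def simple_contour_regression(X, y, c, d):
    """Hypothetical sketch of an SCR-style estimator (assumes p >= 2 predictors).

    Returns a (p, d) array whose columns estimate a basis of the central subspace.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Assumption of this sketch: standardize predictors, Z = (X - mean) Sigma^{-1/2}.
    mu = X.mean(axis=0)
    sigma = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(sigma)
    sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mu) @ sigma_inv_sqrt

    # Keep all pairs (i < j) whose responses differ by at most c.
    i_idx, j_idx = np.triu_indices(n, k=1)
    close = np.abs(y[i_idx] - y[j_idx]) <= c
    if not close.any():
        raise ValueError("no pairs satisfy |y_i - y_j| <= c; increase c")
    diffs = Z[i_idx[close]] - Z[j_idx[close]]

    # Empirical counterpart of K(c): mean outer product of the retained differences.
    K_hat = diffs.T @ diffs / diffs.shape[0]

    # eigh sorts eigenvalues in ascending order, so the first d eigenvectors
    # are the directions with the least variation among near-equal responses.
    _, vecs = np.linalg.eigh(K_hat)
    gamma = vecs[:, :d]

    # Map back to the original predictor scale and normalize each column.
    beta = sigma_inv_sqrt @ gamma
    return beta / np.linalg.norm(beta, axis=0)


# Simulated example (assumed data): y depends on X only through its first coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
y = X[:, 0] + 0.1 * rng.normal(size=400)
beta = simple_contour_regression(X, y, c=0.2, d=1)
```

In this simulated setting the estimated direction should load mainly on the first predictor. The threshold `c` trades bias against variance: a smaller value keeps only pairs that lie close to a common contour, but leaves fewer pairs to average over.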
### Conclusion:
The paper proposes based on...