Towards a power analysis for PLS-based methods

Angela Andreella, Livio Fino, Bruno Scarpa, Matteo Stocchero
2024-03-15
Abstract: In recent years, power analysis has become widely used in applied sciences, with the increasing importance of the replicability issue. When distribution-free methods, such as Partial Least Squares (PLS)-based approaches, are considered, formulating power analysis turns out to be challenging. In this study, we introduce the methodological framework of a new procedure for performing power analysis when PLS-based methods are used. Data are simulated by the Monte Carlo method, assuming the null hypothesis of no effect is false and exploiting the latent structure estimated by PLS in the pilot data. In this way, the complex correlation data structure is explicitly considered in power analysis and sample size estimation. The paper offers insights into selecting statistical tests for the power analysis procedure, comparing accuracy-based tests and those based on continuous parameters estimated by PLS. Simulated and real datasets are investigated to show how the method works in practice.
Methodology
What problem does this paper attempt to address?
This paper addresses how to conduct power analysis when the statistical analysis relies on distribution-free methods such as Partial Least Squares (PLS). The authors propose a new methodological framework for power analysis with PLS-based methods. Data are generated by Monte Carlo simulation under the assumption that the null hypothesis of no effect is false, using the latent structure that PLS estimates from pilot data. In this way, the complex correlation structure of the data is explicitly taken into account in both power analysis and sample size estimation.

The main contributions of the paper are:

1. **Proposal of a new method**: A new procedure for PLS-based power analysis that handles the complex correlation structure of multivariate data.
2. **Application to simulated and real data**: The method is demonstrated in practice on simulated and real datasets.
3. **Selection of statistical tests**: The paper examines which statistical tests to use in the power analysis procedure, comparing accuracy-based tests with tests based on continuous parameters estimated by PLS.
4. **Software implementation**: An R package is developed to support the transparency and reproducibility of the results.

In summary, the paper aims to provide a systematic method that helps researchers carry out power analysis and sample size estimation when using PLS-based methods, thereby improving the reliability and reproducibility of scientific research.
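
The Monte Carlo idea summarized above can be made concrete with a small sketch. This is an illustration only, not the authors' procedure or their R package. It assumes a two-group design coded as -1/+1, a single PLS component, normally distributed scores and independent normal residuals, and a permutation test on the R² between the PLS score and the group label as the continuous statistic. All function names (`pls1`, `fit_pilot`, `simulate`, `perm_pvalue`, `estimate_power`) and the toy pilot data are hypothetical, and the code uses only the Python standard library to stay self-contained.

```python
import math
import random
import statistics


def pls1(X, y):
    """One-component PLS of y on X (rows of X are samples)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    # Weight vector: direction in X with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    t = [sum(a * b for a, b in zip(row, w)) for row in Xc]  # latent scores
    tt = sum(v * v for v in t)
    load = [sum(Xc[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
    # R^2 of y on the score, i.e. squared correlation between t and y.
    r2 = sum(a * b for a, b in zip(t, yc)) ** 2 / (tt * sum(v * v for v in yc))
    return {"means": means, "load": load, "scores": t, "r2": r2}


def fit_pilot(X, y):
    """Summarise the pilot data's latent structure (assumed generative model)."""
    fit = pls1(X, y)
    n, p = len(X), len(X[0])
    resid_sd = []
    for j in range(p):
        resid = [X[i][j] - fit["means"][j] - fit["scores"][i] * fit["load"][j]
                 for i in range(n)]
        resid_sd.append(statistics.stdev(resid))
    # Group-specific score distributions encode the effect (H0 is false).
    groups = {}
    for g in sorted(set(y)):
        ts = [fit["scores"][i] for i in range(n) if y[i] == g]
        groups[g] = (statistics.mean(ts), statistics.stdev(ts))
    return {"means": fit["means"], "load": fit["load"],
            "resid_sd": resid_sd, "scores": groups}


def simulate(model, n_per_group, rng):
    """Draw a new dataset: X = mean + score * loading + independent noise."""
    X, y = [], []
    for g, (mu, sd) in model["scores"].items():
        for _ in range(n_per_group):
            t = rng.gauss(mu, sd)
            X.append([m + t * l + rng.gauss(0, s) for m, l, s in
                      zip(model["means"], model["load"], model["resid_sd"])])
            y.append(g)
    return X, y


def perm_pvalue(X, y, n_perm, rng):
    """Permutation p-value for the R^2 statistic, refitting PLS each time."""
    observed = pls1(X, y)["r2"]
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if pls1(X, y_perm)["r2"] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def estimate_power(model, n_per_group, n_sim, n_perm, alpha, rng):
    """Power = share of simulated datasets in which H0 is rejected."""
    rejections = 0
    for _ in range(n_sim):
        X, y = simulate(model, n_per_group, rng)
        if perm_pvalue(X, y, n_perm, rng) <= alpha:
            rejections += 1
    return rejections / n_sim


# Toy pilot data (hypothetical): 10 samples per group, 4 variables,
# with a group shift on the first two variables only.
rng = random.Random(0)
pilot_y = [-1] * 10 + [1] * 10
pilot_X = [[(1.5 * g if j < 2 else 0.0) + rng.gauss(0, 1) for j in range(4)]
           for g in pilot_y]
model = fit_pilot(pilot_X, pilot_y)
power = estimate_power(model, n_per_group=10, n_sim=50, n_perm=99,
                       alpha=0.05, rng=rng)
print(f"estimated power at n = 10 per group: {power:.2f}")
```

Calling `estimate_power` over a grid of `n_per_group` values and taking the smallest value that reaches a target power (for example 0.8) would turn this sketch into a rough sample size estimate. Swapping the R² statistic for a classification accuracy would give an accuracy-based variant of the same test.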