Abstract: Partial least squares (PLS) regression has achieved good performance in modeling the relationship between a set of dependent (response) variables and a set of independent (predictor) variables, especially when the sample size is small relative to the dimension of these variables. In each iteration, PLS finds two latent variables, one from the dependent variables and one from the independent variables, by maximizing the product of three factors: the variances of the two latent variables and the square of the correlation between them. In this paper, we derive the mathematical formulation of the relationship between mean square error (MSE) and these three factors. We find that MSE is not monotonic in the product of the three factors. However, the corresponding optimization problem is difficult to solve if we extract the optimal latent variables directly based on this relationship. To address these problems, a novel multilinear regression model, variance constrained partial least squares (VCPLS), is proposed. In the proposed VCPLS, we find the latent variables by maximizing the product of the variance of the latent variable from the dependent variables and the square of the correlation between the two latent variables, while constraining the variance of the latent variable from the independent variables to be larger than a predetermined threshold. The corresponding optimization problem can be solved efficiently, and the latent variables extracted by VCPLS are near-optimal. Compared with classical PLS and its variants, VCPLS can achieve lower prediction error in the sense of MSE. The experiments are conducted on three near-infrared (NIR) spectroscopy data sets. To demonstrate the applicability of the proposed VCPLS, we also conduct experiments on another data set, which has different characteristics from NIR data. Experimental results verify the superiority of the proposed VCPLS.
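The three-factor product in the classical PLS criterion can be illustrated numerically. The sketch below is not the VCPLS algorithm, and it is not taken from the paper. It only checks, on synthetic data, the standard identity that the product of the two latent-variable variances and their squared correlation equals their squared covariance, which unit-norm weights from the leading singular vectors of the cross-product matrix maximize. The function names, the synthetic data, and the choice of population (ddof=0) variances are assumptions made for illustration.

```python
import numpy as np

def first_pls_weights(X, Y):
    """Center X and Y and return unit-norm weights (w, c) that
    maximize cov(Xw, Yc)^2, taken from the leading singular vectors
    of the cross-product matrix X^T Y."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    return Xc, Yc, U[:, 0], Vt[0, :]

def three_factors(t, u):
    """Return var(t), var(u) and corr(t, u)^2 (population variances)."""
    corr = np.corrcoef(t, u)[0, 1]
    return t.var(), u.var(), corr ** 2

# Synthetic data (hypothetical): 30 samples, 8 predictors, 3 responses.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 8))
Y = X[:, :2] @ rng.normal(size=(2, 3)) + 0.1 * rng.normal(size=(30, 3))

Xc, Yc, w, c = first_pls_weights(X, Y)
t, u = Xc @ w, Yc @ c            # latent variables (scores)

var_t, var_u, corr2 = three_factors(t, u)
product = var_t * var_u * corr2  # the three-factor PLS criterion
cov_tu = np.mean(t * u)          # t and u are already mean-zero

print("three-factor product:", product)
print("squared covariance:  ", cov_tu ** 2)
```

The two printed values should agree up to floating-point error. A VCPLS-style variant would instead maximize only `var_u * corr2` subject to a lower bound on `var_t`; that constrained problem is left out here because the abstract does not give its solution procedure.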
