MLRMPA: an R Package of Multiple Linear Regression Model Population Analysis Based on a Cluster Sampling Technique for Variable Selection of High Dimensional Data

Meihong Xie, Fangfang Deng, Xiaoyun Zhang, Yueli Tian, Peizhen Li, Honglin Zhai
DOI: https://doi.org/10.1016/j.chemolab.2014.01.010
IF: 4.175
2014-01-01
Chemometrics and Intelligent Laboratory Systems
Abstract: We develop an R package, MLRMPA, for fitting a pool of models between a response variable and chemical descriptors. It is an embedded method that combines feature selection with model building. The feature selection procedure is a cluster sampling method and differs from the model population analysis (MPA) implemented in a previously published study. The modeling process performs multiple stepwise regression analysis using the features sampled from the clustered group. This paper describes the algorithm and methods implemented in the R package, which include VarCor feature selection, cluster sampling, model building and model checking. The package is applied to establish an optimal linear model for predicting the response and to detect outliers from sub-optimal models.
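To make the idea of cluster sampling followed by a population of linear models concrete, here is a minimal sketch in Python. It is not the MLRMPA package and does not use its API: every function name is hypothetical, a greedy correlation grouping stands in for whatever clustering the package uses, plain least squares stands in for stepwise regression, and the VarCor filter and outlier detection are not modeled. The toy data are synthetic.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def greedy_clusters(columns, threshold=0.8):
    """Group column indices. Assumed rule (not MLRMPA's): a column joins the
    first cluster whose leading column it correlates with above the threshold."""
    clusters = []
    for j, col in enumerate(columns):
        for cluster in clusters:
            if abs(pearson(columns[cluster[0]], col)) >= threshold:
                cluster.append(j)
                break
        else:
            clusters.append([j])
    return clusters


def fit_ols(columns, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    design = [[1.0] * len(y)] + [list(c) for c in columns]
    k = len(design)
    # Augmented matrix [X'X | X'y]
    a = [[sum(p * q for p, q in zip(design[i], design[j])) for j in range(k)]
         + [sum(p * q for p, q in zip(design[i], y))] for i in range(k)]
    for i in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [u - f * v for u, v in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]


def rmse(columns, y, coef):
    """Root-mean-square error of the fitted model on the training data."""
    pred = [coef[0] + sum(c * col[i] for c, col in zip(coef[1:], columns))
            for i in range(len(y))]
    return (sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)) ** 0.5


def model_population(columns, y, n_models=20, seed=0):
    """Sample one descriptor per cluster, fit a model, repeat, rank by RMSE."""
    rng = random.Random(seed)
    clusters = greedy_clusters(columns)
    models = []
    for _ in range(n_models):
        chosen = [rng.choice(cluster) for cluster in clusters]
        subset = [columns[j] for j in chosen]
        coef = fit_ols(subset, y)
        models.append((rmse(subset, y, coef), chosen, coef))
    return clusters, sorted(models, key=lambda m: m[0])


# Synthetic toy data: two pairs of near-duplicate descriptors.
data_rng = random.Random(1)
n = 40
base_a = [data_rng.gauss(0, 1) for _ in range(n)]
base_b = [data_rng.gauss(0, 1) for _ in range(n)]
columns = [
    base_a,
    [v + data_rng.gauss(0, 0.1) for v in base_a],
    base_b,
    [v + data_rng.gauss(0, 0.1) for v in base_b],
]
y = [2.0 * a - 1.0 * b + data_rng.gauss(0, 0.05) for a, b in zip(base_a, base_b)]

clusters, models = model_population(columns, y)
best_rmse, best_vars, best_coef = models[0]
print(clusters, best_vars, round(best_rmse, 3))
```

Sampling one descriptor per cluster keeps strongly correlated descriptors out of the same model, so each candidate in the population stays small and its coefficients stay interpretable. The ranked list also makes it possible to inspect sub-optimal models rather than only the best one.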