Simultaneous variable selection and estimation in semiparametric regression of mixed panel count data

Lei Ge, Tao Hu, Yang Li
DOI: https://doi.org/10.1093/biomtc/ujad041
IF: 1.701
2024-01-29
Biometrics
Abstract: Mixed panel count data represent a common complex data structure in longitudinal survey studies. A major challenge in analyzing such data is variable selection and estimation while efficiently incorporating both the panel count and panel binary data components. Analyses in the medical literature have often ignored the panel binary component, treating it as missing data with unknown panel counts, but such a simplification does not make effective use of the information in the original data. In this research, we put forward a penalized likelihood variable selection and estimation procedure under the proportional mean model. A computationally efficient EM algorithm is developed that ensures sparse estimation for variable selection, and the resulting estimator is shown to have the desirable oracle property. Simulation studies assessed and confirmed the good finite-sample properties of the proposed method, and the method is applied to analyze a motivating dataset from the Health and Retirement Study.
statistics & probability, mathematical & computational biology, biology