Sparse inference in Poisson Log-Normal model by approximating the L0-norm
Togo Jean Yves Kioye, Paul-Marie Grollemund, Jocelyn Chauvet, Pierre Druilhet, Erwan Saint-Loubert-Bie, Christophe Chassard
2024-03-26
Abstract: Variable selection methods are required in practical statistical modeling to
identify and include only the most relevant predictors, thereby improving
model interpretability. Such variable selection methods are typically employed
in regression models, for instance in this article for the Poisson Log-Normal
model (PLN, Chiquet et al., 2021). This model aims to explain multivariate count
data using explanatory variables, and its utility has been demonstrated in scientific
fields such as ecology and agronomy. In the case of the PLN model, most recent
papers focus on sparse network inference by combining the likelihood
with an L1-penalty on the precision matrix. In this paper, we propose to rely
on a recent penalization method (SIC, O'Neill and Burke, 2023), which consists
of smoothly approximating the L0-penalty and avoids calibrating a
tuning parameter by cross-validation. Moreover, this work focuses
on the coefficient matrix of the PLN model and establishes an inference
procedure ensuring effective variable selection performance, so that the
resulting fitted model explains multivariate count data using only relevant
explanatory variables. Our proposal involves implementing a procedure that
integrates the SIC penalization algorithm (epsilon-telescoping) and the PLN
model fitting algorithm (a variational EM algorithm). To support our proposal,
we provide theoretical results and insights about the penalization method, and
we perform simulation studies to assess the method, which is also applied to
real datasets.
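
To make the idea of smoothly approximating the L0-penalty concrete, the following is a minimal sketch and not the authors' implementation. It assumes the surrogate phi_eps(b) = b^2 / (b^2 + eps^2), a form commonly used for smooth L0 approximation; the abstract itself does not state the exact function. The names `smooth_l0`, `beta`, and `eps_path` are hypothetical. The sketch only evaluates the surrogate along a decreasing sequence of epsilon values; an actual epsilon-telescoping procedure would presumably also refit the model at each step, which is omitted here.

```python
import numpy as np

def smooth_l0(beta, eps):
    """Smooth surrogate for the L0 norm: sum_j beta_j^2 / (beta_j^2 + eps^2).

    Assumed form for illustration; each term tends to 1 for a nonzero
    coefficient and stays at 0 for a zero coefficient as eps shrinks.
    """
    beta = np.asarray(beta, dtype=float)
    return float(np.sum(beta**2 / (beta**2 + eps**2)))

# Illustrative coefficient vector with two exact zeros (made-up values).
beta = np.array([0.0, 0.0, 1.5, -2.0])

# Hypothetical decreasing epsilon schedule, mimicking a telescoping path.
eps_path = [1.0, 0.1, 0.01, 0.001]

values = [smooth_l0(beta, eps) for eps in eps_path]
for eps, value in zip(eps_path, values):
    print(f"eps={eps:g}  smooth L0 = {value:.4f}")
```

As epsilon decreases, the surrogate approaches the number of nonzero entries of `beta`, while remaining differentiable for every epsilon > 0, which is what allows gradient-based fitting.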
Methodology