Abstract: Adaptive lasso penalized generalized linear models (GLMs) are a powerful tool for analyzing high-dimensional sparse data when the classical linearity or normality assumptions are not met. In non-distributed environments, the estimation problem for adaptive lasso penalized GLMs is often solved by the coordinate descent algorithm of Friedman, Hastie, and Tibshirani (2010, Journal of Statistical Software 33 (1):1–22, doi:10.18637/jss.v033.i01), which is implemented in the R package glmnet. However, when applied to distributed big data, this algorithm is usually inflexible or even infeasible because its implementation is not parallel, especially when communication between the central and local machines is expensive, or when the storage and computing capabilities of the central machine are insufficient. In this paper, we propose a new method, QAGLM-alasso, for the adaptive lasso penalized GLM problem in distributed big data by applying a quadratic approximation representation of GLMs, and we further develop a path-following algorithm for its estimation based on Least Angle Regression (LARS).
Theoretical analyses show that, under mild regularity conditions, QAGLM-alasso enjoys the oracle property, and the resulting estimator is asymptotically equivalent to the original adaptive lasso estimator. Simulation studies demonstrate that the new algorithm achieves estimation accuracy similar to that of glmnet, but is significantly faster than glmnet in distributed environments. We further illustrate the practical performance of the proposed method by analyzing a supersymmetric (SUSY) benchmark data set.
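The abstract does not spell out the quadratic approximation, so the following is only a minimal sketch of one plausible reading, not the authors' QAGLM-alasso algorithm. It assumes a logistic GLM, assumes each local machine sends only its local maximum likelihood estimate and Hessian to the central machine, and assumes the central machine minimizes 0.5 (β − b)ᵀH(β − b) + λ Σ_j w_j |β_j| with adaptive weights w_j = 1/|b_j|^γ. The final step uses plain coordinate descent in place of the LARS-based path-following algorithm described above. All function names, the simulated data, and the choice of λ are hypothetical.

```python
import numpy as np

def local_summary(X, y, n_iter=25):
    """Runs on one local machine: Newton-Raphson logistic MLE and its Hessian."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        H = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(H, X.T @ (y - p))
    # Recompute the Hessian at the final estimate before sending it out.
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    H = X.T @ (X * (p * (1.0 - p))[:, None])
    return beta, H

def combine(summaries):
    """Runs on the central machine: Hessian-weighted average of local estimates."""
    H = sum(Hk for _, Hk in summaries)
    g = sum(Hk @ bk for bk, Hk in summaries)
    return np.linalg.solve(H, g), H

def alasso_quadratic(b, H, lam, gamma=1.0, n_sweeps=200, tol=1e-10):
    """Coordinate descent for 0.5*(beta-b)'H(beta-b) + lam*sum_j w_j*|beta_j|."""
    w = 1.0 / np.abs(b) ** gamma  # adaptive weights; assumes no b_j is exactly 0
    beta = b.copy()
    for _ in range(n_sweeps):
        beta_old = beta.copy()
        for j in range(len(b)):
            d = beta - b
            r = H[j] @ d - H[j, j] * d[j]  # contribution of the other coordinates
            z = H[j, j] * b[j] - r
            # Soft-thresholding update for coordinate j.
            beta[j] = np.sign(z) * max(abs(z) - lam * w[j], 0.0) / H[j, j]
        if np.max(np.abs(beta - beta_old)) < tol:
            break
    return beta

# Hypothetical simulation: 4 local machines, 2000 observations each, 6 covariates.
rng = np.random.default_rng(0)
beta_true = np.array([1.5, -1.0, 0.0, 0.0, 0.0, 0.0])
summaries = []
for _ in range(4):
    X = rng.standard_normal((2000, 6))
    prob = 1.0 / (1.0 + np.exp(-X @ beta_true))
    y = (rng.random(2000) < prob).astype(float)
    summaries.append(local_summary(X, y))  # only (beta_k, H_k) leaves the machine

b_pilot, H_total = combine(summaries)
beta_hat = alasso_quadratic(b_pilot, H_total, lam=np.sqrt(8000))
print(np.round(beta_hat, 3))
```

In this sketch each local machine communicates one p-vector and one p × p matrix, so the communication cost does not grow with the local sample size. The value λ = √N is an arbitrary illustrative choice; in practice the penalty would be tuned along a solution path.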

A Communication-Efficient Parallel Method for Group-Lasso.

Distributed Bootstrap Simultaneous Inference for High-Dimensional Quantile Regression

An efficient Hessian based algorithm for solving large-scale sparse group Lasso problems

A Novel Differentially Private Online Learning Algorithm for Group Lasso in Big Data

A Fast and Scalable Pathwise-Solver for Group Lasso and Elastic Net Penalized Regression via Block-Coordinate Descent

High Performance LDA Through Collective Model Communication Optimization

Network Lasso: Clustering and Optimization in Large Graphs

Median Selection Subset Aggregation for Parallel Inference

Distributed adaptive lasso penalized generalized linear models for big data

Smoothing composite proximal gradient algorithm for sparse group Lasso problems with nonsmooth loss functions

Efficient sparse Hessian based algorithms for the clustered lasso problem.

A Proximal Point Algorithm For Log-Determinant Optimization With Group Lasso Regularization

CPGL: A Classification Method Combining PCA and the Group Lasso Method

An Accurate and Efficient Large-scale Regression Method through Best Friend Clustering

Dual feature reduction for the sparse-group lasso and its adaptive variant

Efficient Generalized Fused Lasso and Its Applications.

A unified consensus-based parallel ADMM algorithm for high-dimensional regression with combined regularizations

Consistent group selection in high-dimensional linear regression

A Parallel Approach to Link Sign Prediction in Large-Scale Online Social Networks.

Gaussian Graphical Models parallel estimation via coordinate descent neighborhood selection

A Sparse-Group Lasso