Abstract

Background: For finite samples with binary outcomes, penalized logistic regression such as ridge logistic regression has the potential to achieve smaller mean squared errors (MSE) of coefficients and predictions than maximum likelihood estimation. There is evidence, however, that ridge logistic regression can result in highly variable calibration slopes in small or sparse data situations.

Methods: In this paper, we elaborate on this issue by performing a comprehensive simulation study, investigating the performance of ridge logistic regression in terms of coefficients and predictions and comparing it to Firth’s correction, which has been shown to perform well in low-dimensional settings. In addition to tuned ridge regression, where the penalty strength is estimated from the data by minimizing some measure of the out-of-sample prediction error or an information criterion, we also considered ridge regression with a pre-specified degree of shrinkage. We included ‘oracle’ models in the simulation study, in which the complexity parameter was chosen based on the true event probabilities (prediction oracle) or regression coefficients (explanation oracle), to demonstrate the capability of ridge regression if the truth were known.

Results: The performance of ridge regression strongly depends on the choice of complexity parameter. As shown in our simulation and illustrated by a data example, values optimized in small or sparse datasets are negatively correlated with optimal values and suffer from substantial variability, which translates into large MSE of coefficients and large variability of calibration slopes. In contrast, in our simulations, pre-specifying the degree of shrinkage prior to fitting led to accurate coefficients and predictions even in non-ideal settings such as those encountered in the context of rare outcomes or sparse predictors.

Conclusions: Applying tuned ridge regression in small or sparse datasets is problematic, as it results in unstable coefficients and predictions. In contrast, determining the degree of shrinkage according to some meaningful prior assumptions about true effects has the potential to reduce bias and stabilize the estimates.
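
The contrast between tuned and pre-specified shrinkage can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes NumPy only, simulated data with made-up true coefficients, a 5-fold cross-validated deviance as the tuning criterion, and an arbitrary penalty grid. All function names are hypothetical. The penalty is (lam / 2) times the sum of squared coefficients, which corresponds to a zero-mean normal prior with variance 1 / lam on each coefficient, so the fixed value lam = 4 stands for an assumed prior standard deviation of 0.5.

```python
import numpy as np

def fit_ridge_logistic(X, y, lam, n_iter=100, tol=1e-8):
    """Ridge logistic regression by Newton-Raphson; the intercept is unpenalized."""
    n, p = X.shape
    Xd = np.column_stack([np.ones(n), X])        # design matrix with intercept
    P = lam * np.eye(p + 1)
    P[0, 0] = 0.0                                # do not shrink the intercept
    beta = np.zeros(p + 1)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-Xd @ beta))
        w = prob * (1.0 - prob)
        grad = Xd.T @ (y - prob) - P @ beta      # penalized score
        hess = Xd.T @ (Xd * w[:, None]) + P      # penalized information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def linear_predictor(beta, X):
    return beta[0] + X @ beta[1:]

def cv_deviance(X, y, lam, folds):
    """Out-of-sample deviance summed over the held-out folds."""
    dev = 0.0
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        beta = fit_ridge_logistic(X[train], y[train], lam)
        prob = 1.0 / (1.0 + np.exp(-linear_predictor(beta, X[test_idx])))
        prob = np.clip(prob, 1e-12, 1.0 - 1e-12)
        yt = y[test_idx]
        dev += -2.0 * np.sum(yt * np.log(prob) + (1.0 - yt) * np.log(1.0 - prob))
    return dev

def calibration_slope(beta, X_new, y_new):
    """Slope from an unpenalized logistic regression of y_new on the linear predictor."""
    lp = linear_predictor(beta, X_new)
    return fit_ridge_logistic(lp[:, None], y_new, lam=0.0)[1]

# Simulated data; the true coefficients are illustrative assumptions.
rng = np.random.default_rng(1)
n, p = 80, 5
true_beta = np.array([0.5, -0.5, 0.25, 0.0, 0.0])
X = rng.normal(size=(n, p))
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ true_beta)))).astype(float)
X_val = rng.normal(size=(5000, p))
y_val = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X_val @ true_beta)))).astype(float)

# Tuned ridge: choose lam from a grid by minimizing the cross-validated deviance.
grid = np.logspace(-2, 1, 7)
folds = np.array_split(rng.permutation(n), 5)
cv = [cv_deviance(X, y, lam, folds) for lam in grid]
lam_tuned = grid[int(np.argmin(cv))]

# Pre-specified ridge: lam fixed before seeing the data (assumed prior SD 0.5).
lam_fixed = 4.0

beta_tuned = fit_ridge_logistic(X, y, lam_tuned)
beta_fixed = fit_ridge_logistic(X, y, lam_fixed)
print("tuned lam:", lam_tuned, "slope:", calibration_slope(beta_tuned, X_val, y_val))
print("fixed lam:", lam_fixed, "slope:", calibration_slope(beta_fixed, X_val, y_val))
```

Repeating the simulation over many random seeds and comparing the spread of the two calibration slopes is one way to examine the variability described above. A single run such as this one does not demonstrate it.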

A semi-automatic method to guide the choice of ridge parameter in ridge regression

Dimension free ridge regression

To tune or not to tune, a case study of ridge logistic regression in small or sparse datasets

High-dimensional regression in practice: an empirical study of finite-sample prediction, variable selection and ranking

Optimal Subsampling for Large Sample Ridge Regression

Fast cross-validation for multi-penalty ridge regression

Better prediction by use of co-data: Adaptive group-regularized ridge regression

Broken Adaptive Ridge Method for Variable Selection in Generalized Partly Linear Models with Application to the Coronary Artery Disease Data

Post Selection Shrinkage Estimation for High Dimensional Data Analysis

Penalized Variable Selection with Broken Adaptive Ridge Regression for Semi-competing Risks Data

Cross-trait prediction accuracy of high-dimensional ridge-type estimators in genome-wide association studies

Gradient-based bilevel optimization for multi-penalty Ridge regression through matrix differential calculus

Ridge regression with adaptive additive rectangles and other piecewise functional templates

Shrinkage-based regularization tests for high-dimensional data with application to gene set analysis

Prevalidated ridge regression is a highly-efficient drop-in replacement for logistic regression for high-dimensional data

Adaptive Ridge-Penalized Functional Local Linear Regression

g.ridge: An R Package for Generalized Ridge Regression for Sparse and High-Dimensional Linear Models

Prediction modelling with many correlated and zero-inflated predictors: assessing a nonnegative garrote approach

Long term follow-up of "full metal jacket" of de novo coronary lesions with new generation Zotarolimus-eluting stents.

Adaptive Ridge Selector (ARiS)

High-Dimensional Regression and Variable Selection Using CAR Scores