Abstract:

Background: In biomedical research, response variables with bounded support on the open unit interval (0,1) are often encountered. Traditionally, researchers have estimated covariate effects on such responses using linear regression. Alternative modelling strategies include beta regression, variable-dispersion beta regression, and fractional logit regression. This study employs a Monte Carlo simulation design to compare the statistical properties of the linear regression model with those of the more novel beta regression, variable-dispersion beta regression, and fractional logit regression models.

Methods: In the Monte Carlo experiment we assume a simple two sample design. We assume observations are realizations of independent draws from their respective probability models. The randomly simulated draws from the various probability models are chosen to emulate average proportion/percentage/rate differences of pre-specified magnitudes. Following simulation of the experimental data we estimate average proportion/percentage/rate differences. We compare the estimators in terms of bias, variance, type-1 error and power. Estimates of Monte Carlo error associated with these quantities are provided.

Results: If response data are beta distributed with constant dispersion parameters across the two samples, then all models are unbiased and have reasonable type-1 error rates and power profiles. If the response data in the two samples have different dispersion parameters, then the simple beta regression model is biased. When the sample size is small (N0 = N1 = 25), linear regression has superior type-1 error rates compared to the other models. Small sample type-1 error rates can be improved in beta regression models using bias correction/reduction methods. In the power experiments, variable-dispersion beta regression and fractional logit regression models have slightly elevated power compared to linear regression models. Similar results were observed if the response data are generated from a discrete multinomial distribution with support on (0,1).

Conclusions: The linear regression model, the variable-dispersion beta regression model and the fractional logit regression model all perform well across the simulation experiments under consideration. When employing beta regression to estimate covariate effects on (0,1) response data, researchers should ensure their dispersion sub-model is properly specified; otherwise inferential errors could arise.
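
The two sample Monte Carlo design described in the Methods can be illustrated with a minimal sketch. It is not the study's code: the function names, the mean/precision parameterization of the beta distribution, the parameter values, and the use of a normal critical value in place of a t quantile are all assumptions made for illustration. It covers only the linear regression estimator, which in a two sample design reduces to the difference in sample means (with equal group sizes the pooled and unpooled standard errors coincide), and it reports bias, variance and rejection rate together with their Monte Carlo standard errors.

```python
import random
import statistics
from math import sqrt


def draw_beta(rng, mean, precision, n):
    """Draw n beta variates using a mean/precision parameterization (assumed)."""
    a = mean * precision
    b = (1.0 - mean) * precision
    return [rng.betavariate(a, b) for _ in range(n)]


def simulate(mean0, mean1, precision0, precision1, n=25, reps=2000, seed=1):
    """Monte Carlo summary for the difference-in-means (two sample OLS) estimator."""
    rng = random.Random(seed)
    # Normal approximation to the critical value; a t quantile would be
    # slightly larger at small n, so this choice is a simplifying assumption.
    z_crit = statistics.NormalDist().inv_cdf(0.975)
    estimates = []
    rejections = 0
    for _ in range(reps):
        y0 = draw_beta(rng, mean0, precision0, n)
        y1 = draw_beta(rng, mean1, precision1, n)
        diff = statistics.fmean(y1) - statistics.fmean(y0)
        se = sqrt(statistics.variance(y0) / n + statistics.variance(y1) / n)
        estimates.append(diff)
        if abs(diff / se) > z_crit:
            rejections += 1
    emp_var = statistics.variance(estimates)
    reject_rate = rejections / reps
    return {
        "bias": statistics.fmean(estimates) - (mean1 - mean0),
        "bias_mc_se": sqrt(emp_var / reps),
        "variance": emp_var,
        "rejection_rate": reject_rate,
        "rejection_mc_se": sqrt(reject_rate * (1.0 - reject_rate) / reps),
    }


if __name__ == "__main__":
    # Equal means: the rejection rate estimates the type-1 error.
    print(simulate(0.5, 0.5, 10.0, 10.0))
    # Unequal means and unequal precision: the rejection rate estimates power.
    print(simulate(0.5, 0.6, 10.0, 30.0))
```

Setting `mean0 == mean1` turns the rejection rate into a type-1 error estimate, while unequal means give a power estimate; varying `precision0` and `precision1` separately mimics the variable-dispersion scenarios. Comparing against beta or fractional logit regression would require a fitting routine that this sketch deliberately leaves out.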

A Simulation Study for Propensity Score in Dealing with Collinearity Data

Propensity score analysis for time-dependent exposure

Comparison of Propensity Score Methods Under Different Confounding Structures: A Simulation Study

Propensity Score Method: a Non-Parametric Technique to Reduce Model Dependence

Propensity Score Weighting with Missing Data on Covariates and Clustered Data Structure

Comparing methods addressing multi-collinearity when developing prediction models

Deep Learning-based Propensity Scores for Confounding Control in Comparative Effectiveness Research: A Large-scale, Real-world Data Study

[A Monte-Carlo Study for Propensity Score Methods].

Observational studies using propensity score analysis underestimated the effect sizes in critical care medicine.

Comparing Propensity Score-Based Methods in Estimating the Treatment Effects: A Simulation Study

Modified Liu Parameters for Scaling Options of the Multiple Regression Model with Multicollinearity Problem

Handling multicollinearity in quantile regression through the use of principal component regression

Regression analysis of proportional data using simplex distribution

Propensity score analysis with latent covariates: Measurement error bias correction using the covariate's posterior mean, aka the inclusive factor score

A simple sensitivity analysis method for unmeasured confounders via linear programming with estimating equation constraints

A Monte Carlo simulation study comparing linear regression, beta regression, variable-dispersion beta regression and fractional logit regression at recovering average difference measures in a two sample design

[An Effective Method to Reduce Bias Between Two Compared Groups: Propensity Score].

Collaborative-controlled LASSO for Constructing Propensity Score-based Estimators in High-Dimensional Data

Robust Estimating Method for Propensity Score Models and its Application to Some Causal Estimands: A review and proposal

Comparison of Different Methods for Analyzing Data Obtained from Stratified Cluster Random Sampling

Overcoming the problems caused by collinearity in mixed-effects logistic model: determining the contribution of various types of violence on depression in pregnant women