Abstract

Introduction: Logistic regression models are frequently used to estimate measures of association between an exposure, health determinant or intervention, and a binary outcome. However, when the outcome is frequent (> 10%), the odds ratios these models produce might be biased estimates of relative risks and prevalence ratios. Despite the availability of several alternatives, many researchers still rely on logistic regression, and a consensus is yet to be reached. We aimed to compare the estimates and goodness-of-fit of logistic, log-binomial and robust Poisson regression models in cross-sectional studies involving frequent binary outcomes.

Methods: Two cross-sectional studies were conducted. Study 1 was a nationally representative study on the impact of air pollution on mental health. Study 2 was a local study on immigrants' access to urgent healthcare services. Odds ratios (OR) were obtained through logistic regression, and prevalence ratios (PR) through log-binomial and robust Poisson regression models. Confidence intervals (CI), their ranges, and standard errors (SE) were also computed, along with the models' relative goodness-of-fit through the Akaike Information Criterion (AIC), when applicable.

Results: In Study 1, the OR (95% CI) was 1.015 (0.970 - 1.063), while the PR (95% CI) obtained through the robust Poisson model was 1.012 (0.979 - 1.045). The log-binomial regression model did not converge in this study. In Study 2, the OR (95% CI) was 1.584 (1.026 - 2.446), and the PR (95% CI) was 1.217 (0.978 - 1.515) for the log-binomial model and 1.130 (1.013 - 1.261) for the robust Poisson model. In both studies, the 95% CI of the OR were wider, and its SE larger, than those of the PR. However, in Study 2, the AIC value was lowest for the logistic regression model.

Conclusion: The OR overestimated the PR, with wider 95% CI and higher SE. The overestimation was greater as the outcome of the study became more prevalent, in line with previous studies. In Study 2, logistic regression was the model with the best fit, illustrating the need to consider multiple criteria when selecting the most appropriate statistical model for each study. Employing logistic regression models by default might lead to misinterpretations. Robust Poisson models are viable alternatives in cross-sectional studies with frequent binary outcomes, avoiding the non-convergence of log-binomial models.
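The gap between the OR and the PR for frequent outcomes can be illustrated with a crude 2x2 table. The sketch below is not the analysis used in the two studies (which fitted regression models); it only computes unadjusted estimates with Wald intervals on the log scale. The function name `or_and_pr` and both tables are hypothetical, chosen so that the true PR is 1.5 in each case while the outcome prevalence differs.

```python
import math

def or_and_pr(a, b, c, d, z=1.959964):
    """Crude OR and PR with Wald 95% CIs from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    All four cells must be non-zero.
    """
    n1, n0 = a + b, c + d
    odds_ratio = (a * d) / (b * c)
    prev_ratio = (a / n1) / (c / n0)
    # Standard large-sample SEs on the log scale
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    se_log_pr = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)

    def wald_ci(estimate, se):
        return (estimate * math.exp(-z * se), estimate * math.exp(z * se))

    return {
        "OR": odds_ratio, "OR_ci": wald_ci(odds_ratio, se_log_or), "se_log_OR": se_log_or,
        "PR": prev_ratio, "PR_ci": wald_ci(prev_ratio, se_log_pr), "se_log_PR": se_log_pr,
    }

# Hypothetical tables: prevalence 6% vs 4% (rare) and 60% vs 40% (frequent)
rare = or_and_pr(6, 94, 4, 96)
common = or_and_pr(60, 40, 40, 60)

for label, res in (("rare", rare), ("common", common)):
    print(label, "OR = %.3f" % res["OR"], "PR = %.3f" % res["PR"])
```

With these illustrative counts the PR is 1.5 in both tables, while the OR is about 1.53 for the rare outcome and 2.25 for the frequent one; the SE of log(OR) also exceeds that of log(PR) in both cases.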

Assessing the Lognormal Distribution Assumption For the Crude Odds Ratio: Implications For Point and Interval Estimation

Modified Poisson Regression Model for Data of Prospective Studies with Common Outcomes

Two-tailed asymptotic inferences for the odds ratio in cross-sectional studies: evaluation of fifteen old and new methods of inference

Generalized Confidence Intervals for Ratios of Standard Deviations Based on Log-Normal Distribution when Times Follow Weibull Distributions

Combining estimates of the odds ratio: the state of the art

What's the Relative Risk? A Method of Correcting the Odds Ratio in Cohort Studies of Common Outcomes

Estimation of Relative Risk Using a Log-Binomial Model with Constraints

Odds Ratios are far from "portable": A call to use realistic models for effect variation in meta-analysis

Corrected Correlation Estimates for Meta-Analysis

The weighted log-rank tests based on stratified clustered survival data: saddle-point p-values and confidence intervals

Point and interval estimation of exposure effects and interaction between the exposures based on logistic model for observational studies

Sensitivity and specificity of normality tests and consequences on reference interval accuracy at small sample size: a computer-simulation study

A Discriminant Function Approach to Adjust for Processing and Measurement Error When a Biomarker is Assayed in Pooled Samples

P-values and confidence intervals for weighted log-rank tests under truncated binomial design based on clustered medical data

Some Results about Standardization for a Non Confounder in Estimators of (log) Relative Risk

Logistic Regression: Limitations in the Estimation of Measures of Association with Binary Health Outcomes

Odds are the sign is right

Estimating an adjusted risk difference in a cluster randomized trial with individual-level analyses

An efficient asymptotic approach for testing monotone proportions assuming an underlying logit based order dose-response model

A comparison of Bayesian and score methods for interval estimates of positive/negative likelihood ratios in support of diagnostic device performance evaluation

A Nonparametric Measure of Local Association for two-way Contingency Tables