Dahan da Cunha Nascimento 1,*, Nicholas Rolnick 2,*, Isabella da Silva Almeida 3,*, Gerson Cipriano Junior 4,*, João Luiz Durigan 3,*

1 Physical Education Department, Universidade Católica de Brasília, Brasília, DF, Brazil; 2 The Human Performance Mechanic Department, Lehman College, Bronx, NY, USA; 3 Laboratory of Muscle and Tendon Plasticity, Faculdade de Ceilândia, Universidade de Brasília, Brasília, DF, Brazil; 4 Graduate Program in Rehabilitation Science, Faculdade de Ceilândia, Universidade de Brasília, Brasília, DF, Brazil

*These authors contributed equally to this work

Correspondence: João Luiz Durigan, Laboratory of Muscle and Tendon Plasticity, Faculdade de Ceilândia, Universidade de Brasília, Centro Metropolitano, conjunto A, lote 01, Brasília, DF, 72220-275, Brazil, Tel/Fax +55 (61) 3376-0252

Abstract: Null hypothesis significance testing (NHST) is the dominant statistical approach in the geriatric and rehabilitation fields. However, NHST is routinely misunderstood or misused. As a result, findings from clinical trials may be taken as evidence of no effect when, in fact, a clinically relevant finding may have a "non-significant" p-value. Conversely, findings are often considered clinically relevant simply because significant differences are observed between groups. Because the p-value is not an exclusive indicator of an association or of the existence of an effect, researchers should be encouraged to report other statistical approaches, such as Bayesian analysis, and complementary statistical tools alongside the p-value (eg, effect size, confidence intervals, minimal clinically important difference, and magnitude-based inference). Doing so improves the interpretation of clinical trial findings by presenting a more efficient and comprehensive analysis. However, the focus on Bayesian analysis and secondary statistical analyses does not mean that NHST is less important. Rather, to observe a real intervention effect, researchers should use a combination of secondary statistical analyses in conjunction with NHST or Bayesian statistical analysis to reveal what p-values cannot show in geriatric and rehabilitation studies (eg, the clinical importance of a 1 kg increase in handgrip strength in the intervention group of long-lived older adults compared to a control group). This paper provides potential insights for improving the interpretation of scientific data in the rehabilitation and geriatric fields by using Bayesian and secondary statistical analyses to better scrutinize the results of clinical trials where a p-value alone may not be appropriate to determine the efficacy of an intervention.

Keywords: statistics, statistical significance, effect size, p-value

Statistical analyses are fundamental to clinical trials in the geriatric and rehabilitation fields, and it is important for researchers to identify whether the data are clinically important and can objectively determine differences between groups. These analytical skills are essential for uncovering trends and assessing the efficacy of an intervention. Researchers routinely select and evaluate data using conventional approaches, focusing on mean responses between groups. However, although it is specifically addressed in only a comparatively small number of studies, inter-individual variability in response to an intervention is also expected.1,2 Furthermore, statistical significance testing (represented by the p-value) is routinely misunderstood or misused: the p-value is not a measure of effect size, nor does a non-significant p-value provide evidence of no effect,3,4 which leads to challenges in the interpretation of clinical trials.

For didactic purposes, Fisher's p-value was created to calculate the probability of an event and to evaluate this probability within the research context.5,6 Thus, the reader frequently encounters a statistical test followed by a probability statement, such as p ≤ 0.05. The researcher retains (fails to reject) the null hypothesis if the observed result would occur by chance more often than 5% of the time (eg, p = 0.051). However, if it would occur 5% of the time or less (eg, p = 0.05), the null hypothesis is rejected in favor of the alternative. Furthermore, in choosing a reasonable significance level to make a sound statistical decision, the researcher also accepts some risk of being wrong, that is, of incorrectly rejecting the null hypothesis.7 Hence, a type I error occurs when the null hypothesis is rejected although it is true.7 In that case, the researcher concludes that a statistic reflects a real difference when it is in fact due to sampling error.7 The probability of this error is called the alpha level, or α, the level of significance prev -Abstract Truncated-
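The decision rule and the complementary statistics described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis: the handgrip data are simulated, the `MCID_KG` threshold is a hypothetical placeholder rather than a published value, and the p-value uses a large-sample normal (z) approximation in place of a t-test so that the example needs only the standard library. The function names `compare_groups` and `type_i_error_rate` are invented for this sketch.

```python
import math
import random
from statistics import NormalDist, mean, stdev

ALPHA = 0.05    # significance level (alpha) discussed in the text
MCID_KG = 5.0   # HYPOTHETICAL minimal clinically important difference, illustration only


def compare_groups(treated, control, alpha=ALPHA):
    """Report the p-value together with effect size and confidence interval.

    Assumption: a large-sample z approximation with unpooled variances
    stands in for a t-test.
    """
    n1, n2 = len(treated), len(control)
    s1, s2 = stdev(treated), stdev(control)
    diff = mean(treated) - mean(control)

    # Standard error of the mean difference and two-sided p-value.
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    z = diff / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))

    # (1 - alpha) confidence interval for the mean difference.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)

    # Cohen's d based on the pooled standard deviation.
    pooled_sd = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    d = diff / pooled_sd

    return {"diff": diff, "p": p, "ci": ci, "d": d, "reject_null": p <= alpha}


def type_i_error_rate(n_trials=2000, n_per_group=50, alpha=ALPHA, seed=1):
    """Estimate how often a true null hypothesis is rejected (type I error)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        # Both groups come from the same distribution, so the null is true.
        a = [rng.gauss(25.0, 5.0) for _ in range(n_per_group)]
        b = [rng.gauss(25.0, 5.0) for _ in range(n_per_group)]
        if compare_groups(a, b, alpha)["reject_null"]:
            rejections += 1
    return rejections / n_trials


# Simulated trial: a true 1 kg handgrip difference in a very large sample.
rng = random.Random(42)
treated = [rng.gauss(26.0, 5.0) for _ in range(2000)]
control = [rng.gauss(25.0, 5.0) for _ in range(2000)]
result = compare_groups(treated, control)

print(f"mean difference: {result['diff']:.2f} kg, p = {result['p']:.3g}")
print(f"95% CI: {result['ci'][0]:.2f} to {result['ci'][1]:.2f} kg, Cohen's d = {result['d']:.2f}")
print(f"statistically significant: {result['reject_null']}")
print(f"CI reaches the hypothetical MCID: {result['ci'][1] >= MCID_KG}")

false_positive_rate = type_i_error_rate()
print(f"estimated type I error rate: {false_positive_rate:.3f}")
```

In the simulated trial the difference is statistically significant because the sample is large, yet the effect size is small and the whole confidence interval lies below the placeholder MCID, which is the kind of gap between statistical and clinical relevance the text describes. The second function repeats a trial in which no real difference exists and counts how often the null hypothesis is still rejected; that proportion should sit close to α.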
