Abstract: Ratio statistics and distributions play a crucial role in various fields, including linear regression, metrology, nuclear physics, operations research, econometrics, biostatistics, genetics, and engineering. In this work, we examine the statistical properties and probability calculations of the Hake normalized gain as a measure of effect size and educational effectiveness in physics education. Leveraging existing knowledge about the Hake ratio as a ratio of normal variables and utilizing open data science tools, we developed two novel computational approaches for computing ratio distributions. Our pilot numerical study demonstrates the speed, accuracy, and reliability of calculating ratio distributions through (1) double exponential (DE) quadrature with or without barycentric interpolation, a very quick and efficient quadrature method, and (2) a 2D vectorized numerical inversion of characteristic functions, which offers broader applicability by not requiring knowledge of PDFs or the independence of ratio constituents. These numerical explorations not only deepen the understanding of the Hake ratio's distribution but also showcase the efficiency, precision, and versatility of our proposed methods, making them highly suitable for fast data analysis based on exact probability ratio distributions. This capability has potential applications in multidimensional statistics and uncertainty analysis in metrology, where precise and reliable data handling is essential.

### What problem does this paper attempt to solve?

This paper examines the probability distribution of the Hake normalized gain, treated as an effect size (ES) statistic, and methods for computing it. Specifically, the authors focus on the following aspects:

1. **Statistical characteristics**: Study the statistical properties of the Hake normalized gain, especially its nature as a ratio statistic. The Hake normalized gain is an effect-size indicator widely used in physics education to evaluate the effectiveness of teaching interventions.
2. **Probability calculation methods**: Develop and evaluate two new methods for computing ratio distributions:
   - **Double exponential (DE) quadrature**: A fast and efficient numerical integration method, used with or without barycentric interpolation.
   - **Two-dimensional vectorized numerical inversion of characteristic functions**: This method requires neither prior knowledge of the probability density function (PDF) nor independence of the ratio components, so it has broader applicability.
3. **Application background**: The numerical explorations deepen the understanding of the Hake ratio distribution, and the pilot numerical study reports that both methods are fast, accurate, and reliable.
4. **Practical applications**: The methods enable fast data analysis based on exact ratio distributions, with potential applications in multidimensional statistics and uncertainty analysis in metrology, where precise and reliable data handling is essential.

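To make the first method above more concrete, the following is a minimal sketch of double exponential quadrature applied to a ratio density. It is not the authors' implementation: the function names, the step size `h`, and the truncation `t_max` are illustrative assumptions, it covers only independent normal components, and it omits barycentric interpolation. It relies on the standard identity \( f_W(w)=\int_{-\infty}^{\infty}|x|\,f_1(wx)\,f_2(x)\,dx \) for the ratio \(W = X_1/X_2\) of independent variables, folded onto the half-line and integrated with the exp-sinh substitution \(x=\exp(\tfrac{\pi}{2}\sinh t)\).

```python
import math


def de_quad_half_line(f, h=0.05, t_max=4.0):
    """Approximate the integral of f over (0, inf) with the exp-sinh DE rule.

    The substitution x = exp(pi/2 * sinh(t)) maps the real line onto (0, inf);
    the transformed integrand is summed with the trapezoidal rule in t.
    h and t_max are illustrative defaults, not tuned values.
    """
    n = int(round(t_max / h))
    total = 0.0
    for k in range(-n, n + 1):
        t = k * h
        x = math.exp(0.5 * math.pi * math.sinh(t))   # quadrature node in (0, inf)
        dx_dt = 0.5 * math.pi * math.cosh(t) * x     # Jacobian of the substitution
        total += f(x) * dx_dt
    return h * total


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def ratio_pdf_independent(w, mu1, sigma1, mu2, sigma2, h=0.05, t_max=4.0):
    """Density of W = X1 / X2 for independent normal X1 and X2 (assumption).

    The integral over the whole real line is folded onto (0, inf) so that the
    kink of |x| at zero sits at the endpoint of the integration range.
    """
    def integrand(x):
        positive_side = normal_pdf(w * x, mu1, sigma1) * normal_pdf(x, mu2, sigma2)
        negative_side = normal_pdf(-w * x, mu1, sigma1) * normal_pdf(-x, mu2, sigma2)
        return x * (positive_side + negative_side)

    return de_quad_half_line(integrand, h, t_max)


if __name__ == "__main__":
    # Two independent standard normals give a standard Cauchy ratio,
    # whose density 1 / (pi * (1 + w^2)) serves as a reference value.
    for w in (-2.0, 0.0, 0.5, 3.0):
        print(w, ratio_pdf_independent(w, 0.0, 1.0, 0.0, 1.0),
              1.0 / (math.pi * (1.0 + w * w)))
```

The sketch is only checked here against the standard Cauchy case. For integrands that are sharply peaked far from the origin, the fixed node layout may be too coarse, so a smaller `h` or a rescaled substitution would likely be needed.
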
### Formula representation

The formulas involved in the paper are as follows:

- The Hake normalized gain and its sample estimate are defined as: \[ g=\frac{\mu_{\text{post}}-\mu_{\text{pre}}}{100 - \mu_{\text{pre}}}, \quad \hat{g}=\frac{S_{\text{post}}-S_{\text{pre}}}{100 - S_{\text{pre}}} \]
- Cohen's d statistic and its sample estimate are defined as: \[ d = \frac{\mu_{\text{post}}-\mu_{\text{pre}}}{\sigma}, \quad \hat{d}=\frac{S_{\text{post}}-S_{\text{pre}}}{s} \]
- The probability density function (PDF) of the ratio \(W = X_1/X_2\) of two jointly normal variables with means \(\mu_1, \mu_2\), standard deviations \(\sigma_1, \sigma_2\), and correlation \(\rho\), which applies to the Hake ratio, was derived by Hinkley (1969): \[ f_W(w)=\frac{b(w)\,d(w)}{\sqrt{2\pi}\,\sigma_1\sigma_2\,a^3(w)}\left[\Phi\left(\frac{b(w)}{\sqrt{1 - \rho^2}\,a(w)}\right)-\Phi\left(-\frac{b(w)}{\sqrt{1 - \rho^2}\,a(w)}\right)\right]+\frac{\sqrt{1-\rho^2}}{\pi\sigma_1\sigma_2a^2(w)}\exp\left(-\frac{c}{2(1-\rho^2)}\right) \] where \(\Phi\) is the standard normal CDF and: \[ a(w)=\left(\frac{w^2}{\sigma_1^2}-\frac{2\rho w}{\sigma_1\sigma_2}+\frac{1}{\sigma_2^2}\right)^{1/2}, \quad b(w)=\frac{\mu_1w}{\sigma_1^2}-\frac{\rho(\mu_1+\mu_2w)}{\sigma_1\sigma_2}+\frac{\mu_2}{\sigma_2^2} \] \[ c=\frac{\mu_1^2}{\sigma_1^2}-\frac{2\rho\mu_1\mu_2}{\sigma_1\sigma_2}+\frac{\mu_2^2}{\sigma_2^2}, \quad d(w)=\exp\left(\frac{b^2(w)-c\,a^2(w)}{2(1-\rho^2)\,a^2(w)}\right) \]
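As a rough illustration of how the formulas above fit together, the sketch below evaluates the sample gain and Hinkley's density using only the Python standard library. The function names are hypothetical, and `hake_ratio_parameters` rests on an added modeling assumption that is not stated in the summary: the pre- and post-test scores are treated as bivariate normal with correlation `r`, so that the numerator \(X_1 = S_{\text{post}}-S_{\text{pre}}\) and denominator \(X_2 = 100 - S_{\text{pre}}\) are jointly normal. The numbers in the usage lines are made up for demonstration.

```python
import math


def hake_gain(pre, post, max_score=100.0):
    """Sample normalized gain: (S_post - S_pre) / (max_score - S_pre)."""
    return (post - pre) / (max_score - pre)


def hinkley_pdf(w, mu1, mu2, sigma1, sigma2, rho):
    """Hinkley (1969) density of W = X1 / X2 for bivariate normal (X1, X2).

    Requires |rho| < 1 and positive standard deviations.
    """
    q = 1.0 - rho * rho
    a2 = w * w / sigma1**2 - 2.0 * rho * w / (sigma1 * sigma2) + 1.0 / sigma2**2
    a = math.sqrt(a2)
    b = (mu1 * w / sigma1**2
         - rho * (mu1 + mu2 * w) / (sigma1 * sigma2)
         + mu2 / sigma2**2)
    c = (mu1**2 / sigma1**2
         - 2.0 * rho * mu1 * mu2 / (sigma1 * sigma2)
         + mu2**2 / sigma2**2)
    d = math.exp((b * b - c * a2) / (2.0 * q * a2))
    z = b / (math.sqrt(q) * a)
    # Phi(z) - Phi(-z) equals erf(z / sqrt(2)) for the standard normal CDF.
    central = math.erf(z / math.sqrt(2.0))
    term1 = b * d / (math.sqrt(2.0 * math.pi) * sigma1 * sigma2 * a**3) * central
    term2 = math.sqrt(q) / (math.pi * sigma1 * sigma2 * a2) * math.exp(-c / (2.0 * q))
    return term1 + term2


def hake_ratio_parameters(mu_pre, mu_post, sd_pre, sd_post, r, max_score=100.0):
    """Map score parameters to (mu1, mu2, sigma1, sigma2, rho).

    Assumption: (S_pre, S_post) is bivariate normal with correlation r, with
    X1 = S_post - S_pre and X2 = max_score - S_pre.
    """
    mu1 = mu_post - mu_pre
    mu2 = max_score - mu_pre
    sigma1 = math.sqrt(sd_post**2 + sd_pre**2 - 2.0 * r * sd_pre * sd_post)
    sigma2 = sd_pre
    cov = sd_pre**2 - r * sd_pre * sd_post   # Cov(S_post - S_pre, -S_pre)
    rho = cov / (sigma1 * sigma2)
    return mu1, mu2, sigma1, sigma2, rho


if __name__ == "__main__":
    # Hypothetical class-level values, chosen only for illustration.
    params = hake_ratio_parameters(40.0, 60.0, 5.0, 5.0, 0.5)
    g_hat = hake_gain(40.0, 60.0)
    print("gain:", g_hat, "density at the gain:", hinkley_pdf(g_hat, *params))
```

With zero means, unit variances, and zero correlation, the first term vanishes and the density reduces to the standard Cauchy form \(1/(\pi(1+w^2))\), which gives a quick sanity check of the implementation.
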

Probability distributions and calculations for Hake's ratio statistics in measuring effect size
