How statistical model development can obscure inequities in STEM student outcomes

Ben Van Dusen, Jayson Nissen
DOI: https://doi.org/10.48550/arXiv.2111.07869
2021-11-13
Abstract: Researchers often frame quantitative research as objective, but every step in data collection and analysis can bias findings in often unexamined ways. In this investigation, we examined how the process of selecting variables to include in regression models (model specification) can bias findings about inequities in science and math student outcomes. We identified the four most used methods for model specification in discipline-based education research about equity: a priori, statistical significance, variance explained, and information criterion. Using a quantitative critical perspective that blends statistical theory with critical theory, we reanalyzed the data from a prior publication using each of the four methods and compared the findings from each. We concluded that using information criterion produced models that best aligned with our quantitative critical perspective's emphasis on intersectionality and models with more accurate coefficients and uncertainties. Based on these findings, we recommend researchers use information criterion for specifying models about inequities in STEM student outcomes.
Physics Education
What problem does this paper attempt to address?
This paper addresses how the process of developing statistical models can mask inequities in research on STEM (science, technology, engineering, and mathematics) student outcomes. Specifically, the authors examine how the method used to select variables for a regression model (model specification) affects findings about inequities in student outcomes. They reanalyze data from a prior publication using the four most common model specification methods in discipline-based education research on equity (a priori, statistical significance, variance explained, and information criterion) and compare the findings each method produces. The paper concludes by recommending information criterion, the method that best reflected the intersectional identities of student groups and produced more accurate model coefficients and uncertainties.
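To make the idea of information-criterion model specification concrete, here is a minimal sketch in Python. It is not the authors' code, data, or exact procedure. The scores, the grouping variables `g1` and `g2`, and the helper names are all made up for illustration. Each candidate model simply gives every group (or every intersection of groups) its own mean, and the Akaike information criterion (AIC) is computed in its Gaussian least-squares form, up to an additive constant.

```python
import math
from collections import defaultdict

# Hypothetical toy data: two made-up grouping variables and a made-up score.
cell_scores = {
    ("a", "x"): [70, 72, 68, 74],
    ("a", "y"): [60, 63, 58, 61],
    ("b", "x"): [66, 69, 64, 67],
    ("b", "y"): [45, 48, 43, 46],
}
rows = [
    {"g1": g1, "g2": g2, "score": s}
    for (g1, g2), scores in cell_scores.items()
    for s in scores
]


def rss_by_cells(rows, keys):
    """Fit one mean per cell defined by `keys`.

    Returns (residual sum of squares, number of fitted means).
    An empty `keys` list fits a single grand mean.
    """
    cells = defaultdict(list)
    for row in rows:
        cells[tuple(row[k] for k in keys)].append(row["score"])
    rss = 0.0
    for values in cells.values():
        mean = sum(values) / len(values)
        rss += sum((v - mean) ** 2 for v in values)
    return rss, len(cells)


def aic(rss, n, n_means):
    """Gaussian least-squares AIC, up to an additive constant.

    k counts the fitted means plus the residual variance.
    """
    k = n_means + 1
    return n * math.log(rss / n) + 2 * k


# Candidate specifications, from no group terms to fully intersectional cells.
candidates = {
    "intercept only": [],
    "g1": ["g1"],
    "g2": ["g2"],
    "g1 x g2": ["g1", "g2"],
}

aics = {}
for name, keys in candidates.items():
    rss, n_means = rss_by_cells(rows, keys)
    aics[name] = aic(rss, len(rows), n_means)

# Lower AIC indicates a better trade-off between fit and complexity.
best = min(aics, key=aics.get)
for name, value in sorted(aics.items(), key=lambda item: item[1]):
    print(f"{name:15s} AIC = {value:8.2f}")
print("Lowest AIC:", best)
```

The point of the sketch is that an information criterion compares whole models by fit and complexity rather than testing terms one at a time. An intersectional term can therefore be retained when it improves the model overall, even if no single coefficient would pass a significance threshold. A real analysis would use a regression library and would likely consider additive and multilevel models as well.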