The "given data" paradigm undermines both cultures

Tyler McCormick
DOI: https://doi.org/10.48550/arXiv.2105.12478
IF: 5.414
2021-05-26
Machine Learning
Abstract: Breiman organizes "Statistical modeling: The two cultures" around a simple visual. Data, to the far right, are compelled into a "black box" with an arrow and then catapulted left by a second arrow, having been transformed into an output. Breiman then posits two interpretations of this visual as encapsulating a distinction between two cultures in statistics. The divide, he argues, is about what happens in the "black box." In this comment, I argue for a broader perspective on statistics and, in doing so, elevate questions from "before" and "after" the box as fruitful areas for statistical innovation and practice.