Temporal Analysis and Gender Bias in Computing

Thomas J. Misa
DOI: https://doi.org/10.48550/arXiv.2210.08983
2022-09-29
Computers and Society
Abstract: Recent studies of gender bias in computing use large datasets involving automatic predictions of gender to analyze computing publications, conferences, and other key populations. Gender bias is partly defined by software-driven algorithmic analysis, but widely used gender prediction tools can result in unacknowledged gender bias when used for historical research. Many names change ascribed gender over decades: the "Leslie problem." Systematic analysis of the Social Security Administration dataset -- each year, all given names, identified by ascribed gender and frequency of use -- in 1900, 1925, 1950, 1975, and 2000 permits a rigorous assessment of the "Leslie problem." This article identifies 300 given names with measurable "gender shifts" across 1925-1975, spotlighting the 50 given names with the largest such shifts. This article demonstrates quantitatively that there is a net "female shift" that likely results in the overcounting of women (and undercounting of men) in earlier decades, just as computer science was professionalizing. Some aspects of the widely accepted "making programming masculine" perspective may need revision.
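One way to picture the per-name comparison the abstract describes is the minimal sketch below. It assumes records shaped like the Social Security Administration's yearly name files (name, sex, count), and it assumes a simple definition of "gender shift": the change in the female share of a name between two years. The abstract does not state the article's exact metric, so the function names, the definition, and the toy counts here are all hypothetical illustrations, not the author's method or real SSA figures.

```python
from collections import defaultdict


def female_share(records):
    """Return {name: fraction of births recorded as female} for one year.

    `records` is an iterable of (name, sex, count) tuples, with sex "F" or "M".
    """
    totals = defaultdict(lambda: {"F": 0, "M": 0})
    for name, sex, count in records:
        totals[name][sex] += count
    return {
        name: c["F"] / (c["F"] + c["M"])
        for name, c in totals.items()
        if c["F"] + c["M"] > 0
    }


def gender_shift(early_records, late_records):
    """Assumed metric: female share in the later year minus the earlier year.

    Positive values mean the name moved toward female usage.
    Only names present in both years are compared.
    """
    early = female_share(early_records)
    late = female_share(late_records)
    return {name: late[name] - early[name] for name in early.keys() & late.keys()}


# Toy counts, invented purely for illustration (NOT real SSA data).
records_1925 = [("Leslie", "M", 900), ("Leslie", "F", 100), ("Thomas", "M", 1000)]
records_1975 = [("Leslie", "M", 250), ("Leslie", "F", 750), ("Thomas", "M", 1000)]

shifts = gender_shift(records_1925, records_1975)

# Names ordered by the size of their shift, largest first.
ranked = sorted(shifts.items(), key=lambda item: abs(item[1]), reverse=True)
```

Under this assumed definition, a tool trained on recent name data would label a historical "Leslie" as female far more often than the earlier year's counts support, which is the kind of overcounting the abstract warns about.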