Three Gaps in Computational Text Analysis Methods for Social Sciences: A Research Agenda

Christian Baden, Christian Pipal, Martijn Schoonvelde, Mariken A. C. G. van der Velden
DOI: https://doi.org/10.1080/19312458.2021.2015574
IF: 8.044
2021-12-27
Communication Methods and Measures
Abstract: We identify three gaps that limit the utility and obstruct the progress of computational text analysis methods (CTAM) for social science research. First, we contend that CTAM development has prioritized technological over validity concerns, giving limited attention to the operationalization of social scientific measurements. Second, we identify a mismatch between CTAMs’ focus on extracting specific contents and document-level patterns, and social science researchers’ need for measuring multiple, often complex contents in the text. Third, we argue that the dominance of English language tools depresses comparative research and inclusivity toward scholarly communities examining languages other than English. We substantiate our claims by drawing upon a broad review of methodological work in the computational social sciences, as well as an inventory of leading research publications using quantitative textual analysis. Subsequently, we discuss implications of these three gaps for social scientists’ uneven uptake of CTAM, as well as the field of computational social science text research as a whole. Finally, we propose a research agenda intended to bridge the identified gaps and improve the validity, utility, and inclusiveness of CTAM.
communication