Advancing Automated Content Analysis for a New Era of Media Effects Research: The Key Role of Transfer Learning
Anne Kroon (a), Kasper Welbers (b), Damian Trilling (a), Wouter van Atteveldt (b)
(a) Amsterdam School of Communication Research (ASCoR), University of Amsterdam, Amsterdam, Netherlands
(b) Department of Communication Science, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
Anne Kroon is an associate professor at the Amsterdam School of Communication Research at the University of Amsterdam. Her research centers on employing computational techniques to explore the causes and consequences of bias in algorithms in the domain of digital job markets.
Kasper Welbers is an assistant professor at the Department of Communication Science at the Vrije Universiteit Amsterdam. His research focuses primarily on how the gatekeeping process of news messages has changed due to the rise of new media technologies, and how we can study this using computational methods.
Damian Trilling is associate professor of political communication and journalism at the University of Amsterdam. He is interested in news use and dissemination and in the adoption and development of computational methods.
Wouter van Atteveldt is professor of Computational Communication Science and Political Communication at the Vrije Universiteit Amsterdam. He focuses on automatic analysis of (political) communication, including both traditional and social media, and the methods and data required for studying this.
DOI: https://doi.org/10.1080/19312458.2023.2261372
IF: 8.044
2023-10-06
Communication Methods and Measures
Abstract: The availability of individual-level digital trace data offers exciting new ways to study media uses and effects based on the actual content that people encountered. In this article, we argue that to really reap the benefits of this data, we need to update our methodology for automated text analysis. We review challenges for the automatic identification of theoretically relevant concepts in texts along three dimensions: format/style, language, and modality. These dimensions unveil a significantly higher level of diversity and complexity in individual-level digital trace data, as opposed to the content traditionally examined through automated text analysis in our field. Consequently, they provide a valuable perspective for exploring the limitations of traditional approaches. We argue that recent developments within the field of Natural Language Processing, in particular, transfer learning using transformer-based models, have the potential to aid the development, application, and performance of various computational tools. These tools can contribute to the meaningful categorization of the content of social (and other) media.
communication