Happenstance: Utilizing Semantic Search to Track Russian State Media Narratives about the Russo-Ukrainian War On Reddit

Hans W. A. Hanley, Deepak Kumar, Zakir Durumeric
DOI: https://doi.org/10.48550/arXiv.2205.14484
2023-05-31
Abstract: In the buildup to and in the weeks following the Russian Federation's invasion of Ukraine, Russian state media outlets output torrents of misleading and outright false information. In this work, we study this coordinated information campaign in order to understand the most prominent state media narratives touted by the Russian government to English-speaking audiences. To do this, we first perform sentence-level topic analysis using the large-language model MPNet on articles published by ten different pro-Russian propaganda websites, including the new Russian "fact-checking" website waronfakes.com. Within this ecosystem, we show that smaller websites like katehon.com were highly effective at publishing topics that were later echoed by other Russian sites. After analyzing this set of Russian information narratives, we then analyze their correspondence with narratives and topics of discussion on r/Russia and 10 other political subreddits. Using MPNet and a semantic search algorithm, we map these subreddits' comments to the set of topics extracted from our set of Russian websites, finding that 39.6% of r/Russia comments corresponded to narratives from pro-Russian propaganda websites compared to 8.86% on r/politics.
Social and Information Networks, Computers and Society, Machine Learning
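The comment-to-topic mapping described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes sentence embeddings (for example from an MPNet-based encoder) have already been computed, and the function names, the toy three-dimensional vectors, and the 0.6 similarity threshold are all hypothetical choices made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def match_topics(comment_vecs, topic_vecs, threshold=0.6):
    """Assign each comment to its most similar topic, or None.

    threshold is a hypothetical cutoff, not a value from the paper.
    """
    matches = []
    for vec in comment_vecs:
        best_topic, best_sim = None, threshold
        for name, topic_vec in topic_vecs.items():
            sim = cosine(vec, topic_vec)
            if sim >= best_sim:
                best_topic, best_sim = name, sim
        matches.append(best_topic)
    return matches

def matched_fraction(matches):
    """Share of comments that were mapped to some topic."""
    if not matches:
        return 0.0
    return sum(m is not None for m in matches) / len(matches)

# Toy stand-ins for real embeddings (assumed, for illustration only).
topic_vecs = {"topic_a": [1.0, 0.0, 0.0], "topic_b": [0.0, 1.0, 0.0]}
comment_vecs = [
    [0.9, 0.1, 0.0],
    [0.0, 0.2, 1.0],
    [0.1, 0.95, 0.05],
    [0.5, 0.5, 0.7],
]

matches = match_topics(comment_vecs, topic_vecs)
fraction = matched_fraction(matches)
```

With real data, the fraction computed this way would play the role of the per-subreddit percentages reported in the abstract, though the actual figures depend on the authors' encoder, topic set, and threshold.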