Controlling bad-actor-AI activity at scale across online battlefields

Neil F. Johnson, Richard Sear, Lucia Illari
2023-08-02
Abstract: We show how the looming threat of bad actors using AI/GPT to generate harms across social media can be addressed at scale by exploiting the intrinsic dynamics of the social media multiverse. We combine a uniquely detailed description of the current bad-actor-mainstream battlefield with a mathematical description of its behavior, to show what bad-actor-AI activity will likely dominate, where, and when. A dynamical Red Queen analysis predicts an escalation to daily bad-actor-AI activity by early 2024, just ahead of U.S. and other global elections. We provide a Policy Matrix that quantifies outcomes and trade-offs mathematically for the policy options of containment vs. removal. We give explicit plug-and-play formulae for risk measures.
Physics and Society, Adaptation and Self-Organizing Systems
What problem does this paper attempt to address?
The paper asks how bad actors' use of AI to generate harmful content can be predicted and controlled across large-scale online battlefields. Specifically, it focuses on the following core questions:

1. **What types of bad-actor-AI activity are most likely to occur?** The paper compares basic forms of GPT (such as GPT-2) with advanced forms (such as GPT-3 and GPT-4) and argues that basic GPT may become the main source of threat because it is readily available and easy to deploy.

2. **Where will these activities occur?** The paper maps the current social media battlefield as a dynamic network graph, focusing on the extreme anti-X communities (bad-actor communities) across 13 platforms and their links to mainstream communities. It finds that smaller platforms, despite their size, play a crucial role because of their high link activity.

3. **When will these activities occur?** The paper uses the Red Queen hypothesis and a random walk model to predict the timing of bad-actor-AI activity. Based on existing data, it expects bad-actor-AI attacks to occur almost daily by early 2024, which coincides with upcoming elections around the world.

4. **How can the impact of these activities be mitigated and their outcomes predicted?** The paper proposes a Policy Matrix that uses the mathematical description to quantify the outcomes and trade-offs of different policy options (such as containment and removal). It also provides explicit formulas for measuring risk, for example:

   \[ n_B(s) = C s^{-\alpha} e^{-\beta s} \]

   where \(n_B(s)\) is the number of bad-actor-AI clusters with intensity \(s\), \(C\) is a normalization constant, and \(\alpha\) and \(\beta\) are parameters.

5. **How can these activities be controlled at scale?** The paper proposes a mathematical model based on community cluster dynamics to describe and control the bad-actor-AI system. The summary refers to two key equations, of which one is reproduced here:

   \[ T = \frac{2S_B}{S_A - S_B}\ln\left(\frac{S_B(S_A - S_B)}{S_A}\right) \]

   These equations help predict the time and resource requirements under different strategies and show that even a slight advantage can significantly reduce the intensity distribution of bad-actor-AI clusters.

By addressing these questions, the paper aims to give policymakers a scientific basis for dealing with increasing AI-driven online harms.
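
The two formulas quoted above can be evaluated directly. The following is a minimal Python sketch, not the authors' code: the function names `cluster_count` and `time_scale` are hypothetical, every numeric parameter value is an illustrative placeholder rather than a fitted value from the paper, and the expression for \(T\) is transcribed exactly as printed in this summary.

```python
import math


def cluster_count(s, C=1.0, alpha=2.5, beta=0.01):
    """Evaluate n_B(s) = C * s**(-alpha) * exp(-beta * s) for s > 0.

    C, alpha and beta default to placeholder values (assumptions),
    not values reported in the paper.
    """
    if s <= 0:
        raise ValueError("s must be positive")
    return C * s ** (-alpha) * math.exp(-beta * s)


def time_scale(S_A, S_B):
    """Evaluate T = 2*S_B/(S_A - S_B) * ln(S_B*(S_A - S_B)/S_A).

    Transcribed as printed in the summary. The expression is only
    defined here for S_A > S_B > 0, so other inputs are rejected.
    """
    if not (S_A > S_B > 0):
        raise ValueError("formula as printed needs S_A > S_B > 0")
    return (2.0 * S_B / (S_A - S_B)) * math.log(S_B * (S_A - S_B) / S_A)


if __name__ == "__main__":
    # Power law with exponential cutoff: counts fall off as s grows.
    for s in (1, 10, 100):
        print(f"n_B({s}) = {cluster_count(s):.3e}")
    # Placeholder values for S_A and S_B, chosen only for illustration.
    print(f"T = {time_scale(10.0, 5.0):.4f}")
```

With the power-law exponent \(\alpha\) controlling the decay at small \(s\) and \(\beta\) setting the exponential cutoff at large \(s\), sweeping `s` over a range gives a quick view of how the cluster distribution thins out as intensity grows.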