Abstract: We initiate a principled study of algorithmic collective action on digital platforms that deploy machine learning algorithms. We propose a simple theoretical model of a collective interacting with a firm's learning algorithm. The collective pools the data of participating individuals and executes an algorithmic strategy by instructing participants how to modify their own data to achieve a collective goal. We investigate the consequences of this model in three fundamental learning-theoretic settings: the case of a nonparametric optimal learning algorithm, a parametric risk minimizer, and gradient-based optimization. In each setting, we come up with coordinated algorithmic strategies and characterize natural success criteria as a function of the collective's size. Complementing our theory, we conduct systematic experiments on a skill classification task involving tens of thousands of resumes from a gig platform for freelancers. Through more than two thousand model training runs of a BERT-like language model, we see a striking correspondence emerge between our empirical observations and the predictions made by our theory. Taken together, our theory and experiments broadly support the conclusion that algorithmic collectives of exceedingly small fractional size can exert significant control over a platform's learning algorithm.

What problem does this paper attempt to address?

The core question this paper addresses is: **How can collective action through algorithms influence what machine-learning algorithms on digital platforms learn?** Specifically, the authors study how a collective of individuals can steer a platform's learning algorithm through coordinated actions (such as modifying their data) to achieve the collective's goals.

### Main Contributions

1. **A theoretical model**:
   - The paper proposes a simple theoretical model of the interaction between a collective and a firm's learning algorithm.
   - The collective pools the data of its participants and executes an algorithmic strategy that instructs participants how to modify their own data to achieve the collective goal.
   - The firm trains on the modified data, so its machine-learning model changes accordingly.
2. **Strategy analysis in three learning-theoretic settings**:
   - **Nonparametric optimal learning**: How the collective can, against an optimal learner, make the classifier associate a specific signal with a target label by modifying data points.
   - **Parametric risk minimization**: How the collective can influence a parametric risk minimization problem so that the selected model ends up close to the collective's target model.
   - **Gradient-based optimization**: How the collective can influence the learning process in a non-convex optimization setting by controlling the gradient.
3. **Empirical evaluation**:
   - Systematic experiments on resume data from a gig platform for freelancers agree closely with the theoretical predictions.
   - The experiments show that even a very small collective (for example, less than 1% of the population) can significantly influence the resulting machine-learning model.
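The signal-planting idea described under nonparametric optimal learning can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `apply_signal_strategy`, the signal token `"xyzzy"`, and the label names are all hypothetical, and records are assumed to be `(text, label)` pairs.

```python
import random


def apply_signal_strategy(dataset, alpha, signal, target_label, seed=0):
    """Return a modified copy of `dataset` (a list of (text, label) pairs).

    A fraction `alpha` of the records, standing in for the collective,
    gets the signal appended to its text and its label replaced by
    `target_label`. All other records are left unchanged.
    """
    rng = random.Random(seed)
    n_collective = round(alpha * len(dataset))
    # Hypothetical choice: collective members are picked uniformly at random.
    members = set(rng.sample(range(len(dataset)), n_collective))

    modified = []
    for i, (text, label) in enumerate(dataset):
        if i in members:
            modified.append((text + " " + signal, target_label))
        else:
            modified.append((text, label))
    return modified


if __name__ == "__main__":
    # Toy data: 1000 placeholder resumes, all with the same original label.
    data = [("resume %d" % i, "other") for i in range(1000)]
    poisoned = apply_signal_strategy(
        data, alpha=0.01, signal="xyzzy", target_label="target"
    )
    n_changed = sum(1 for _, label in poisoned if label == "target")
    print("records modified by the collective:", n_changed)
```

A learner trained on the modified data sees the signal only alongside the target label, which is what lets a small collective tie the two together. How strong that association becomes depends on the learner and on how rare the signal is in the unmodified data.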

### Key Formulas

- **Mixture distribution**:
  \[ P=\alpha P^{*}+(1 - \alpha)P_{0} \]
  where \(P^{*}\) is the data distribution generated under the collective's strategy, \(P_{0}\) is the original data distribution, and \(\alpha\) is the fraction of the population that belongs to the collective.
- **Lower bound on the success probability (feature-label strategy)**:
  \[ S(\alpha)\geq1-\frac{1 - \alpha}{\alpha}\cdot(1 - \epsilon)\Delta+\frac{\epsilon}{1 - 2\epsilon}\cdot\xi \]
  where \(\xi\) represents the uniqueness of the signal, \(\Delta\) represents the suboptimality gap, and \(\epsilon\) represents the suboptimality of the classifier.
- **Critical mass**:
  \[ \alpha^{*}\leq\frac{(1 - \epsilon)\Delta+\epsilon}{(1 - S^{*})((1 - \epsilon)\Delta+\epsilon)+(1 - 2\epsilon)\cdot\xi} \]
  This formula bounds the critical mass \(\alpha^{*}\), the smallest collective fraction required to achieve the target success rate \(S^{*}\).

### Conclusion

The paper shows that even a very small collective can have a significant impact on a platform's machine-learning algorithms. The impact comes from coordinated algorithmic strategies, which gives workers or consumers on the platform a new means of changing how its algorithms behave.
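To make the mixture distribution \(P=\alpha P^{*}+(1-\alpha)P_{0}\) under Key Formulas concrete, the sketch below draws each training example from \(P^{*}\) with probability \(\alpha\) and from \(P_{0}\) otherwise. The function name `sample_mixture` and the two toy samplers are assumptions made for illustration; they do not come from the paper.

```python
import random


def sample_mixture(sample_p_star, sample_p0, alpha, n, seed=0):
    """Draw `n` examples from P = alpha * P_star + (1 - alpha) * P_0.

    `sample_p_star` and `sample_p0` are callables that take a
    `random.Random` instance and return one example each.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # With probability alpha the example comes from a collective member.
        if rng.random() < alpha:
            samples.append(sample_p_star(rng))
        else:
            samples.append(sample_p0(rng))
    return samples


if __name__ == "__main__":
    # Toy stand-ins: the base population emits ("plain", 0),
    # the collective emits ("plain xyzzy", 1). Both are hypothetical.
    base = lambda rng: ("plain", 0)
    collective = lambda rng: ("plain xyzzy", 1)

    data = sample_mixture(collective, base, alpha=0.01, n=10000)
    share = sum(1 for _, label in data if label == 1) / len(data)
    print("empirical share of collective examples:", share)
```

With a large sample, the empirical share of collective examples should land close to \(\alpha\). This is the quantity that the success bounds treat as the collective's size.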

Algorithmic Collective Action in Machine Learning
