Abstract: The Reddit r/place experiments were a series of online social experiments hosted by Reddit in 2017, 2022, and 2023, where users were allowed to update the colors of pixels in a large shared canvas. The largest of these experiments (in 2022) attracted over 100 million users who collaborated and competed to produce elaborate artworks that together provide a unique view of the shared interests connecting the diverse communities on Reddit. The user activity traces resulting from these experiments enable us to analyze how online users engage, collaborate, and compete at an unprecedented scale. However, this requires labeling millions of updates made during the experiments according to their intended artwork.
This paper characterizes large-scale activity traces from r/place with a focus on dynamics around successful and failed artworks. To achieve this goal, we propose a dynamic graph clustering algorithm to label artworks by leveraging visual and user-level features. In the first phase of the algorithm, updates within a snapshot of the experiment are grouped based on proximity, color, and user embeddings. In the second phase, clusters across snapshots are merged via an efficient approximation for the set cover problem. We apply the proposed algorithm to the 2017 edition of r/place and show that it outperforms an existing baseline in terms of accuracy and running time. Moreover, we use our algorithm to identify key factors that distinguish successful from failed artworks in terms of user engagement, collaboration, and competition.
What problem does this paper attempt to address?
The paper asks how to analyze and understand the way users participate, collaborate, and compete in the Reddit r/place experiments, and in particular what separates successful artworks from failed ones. The main challenges are:
1. **Labeling of large-scale user activity**: The largest r/place experiment (2022) attracted more than 100 million users, and users made millions of updates on the shared canvas. To analyze this activity, each update must be assigned to the artwork it was intended for. In the existing datasets, however, only the artworks that survived to the end are manually labeled, so most updates carry no label, which makes detailed analysis difficult.
2. **Design of a dynamic clustering algorithm**: To overcome this labeling problem, the researchers propose a dynamic graph clustering algorithm that labels artworks automatically from visual and user-level features. The algorithm identifies which updates belong to the same artwork, which in turn makes it possible to compare successful and failed artworks.
3. **Research on collective behavior**: By analyzing activity traces from the r/place experiments (held in 2017, 2022, and 2023), the researchers aim to reveal how online users behave in large-scale collective activities, including how they participate, collaborate, and compete. This helps explain r/place itself and can inform the design of more effective online collaboration systems.
### Specific problem descriptions
- **Automatic labeling of user activities**: How to automatically assign millions of updates to the corresponding artworks without relying on manual labeling?
- **Differences between successful and failed artworks**: What factors determine whether an artwork succeeds or fails? Candidate factors include user participation, degree of collaboration, and intensity of competition.
- **Large-scale data analysis**: How can a dataset of this size be processed and analyzed efficiently to extract meaningful patterns of collective behavior?
### Solutions
The researchers propose a two-stage dynamic graph clustering algorithm to address these problems:
1. **First stage**: Cluster the updates within each snapshot. Graph clustering groups the updates in a snapshot of the experiment based on pixel proximity, color, and user embeddings.
2. **Second stage**: Merge clusters across time. Clusters from different snapshots are merged into complete artworks via an efficient approximation for the set cover problem.
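The two stages can be illustrated with a minimal Python sketch. This is not the authors' implementation: the update format `(x, y, color, user)`, the function names `cluster_snapshot` and `greedy_cover`, the `radius` parameter, and the use of "same user" as a stand-in for user-embedding similarity are all assumptions made for illustration. Stage one is approximated by connected components over a proximity graph, and stage two by the classic greedy approximation for set cover, which may differ from the approximation used in the paper.

```python
from itertools import combinations


def cluster_snapshot(updates, radius=1):
    """Stage 1 (sketch): group the updates of one snapshot.

    updates: list of (x, y, color, user) tuples (hypothetical format).
    Two updates are linked when they lie within `radius` pixels of each
    other (Chebyshev distance) and share a color or a user. "Same user"
    is a crude stand-in for user-embedding similarity.
    Returns a list of clusters, each a set of (x, y) pixels.
    """
    parent = list(range(len(updates)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(updates)), 2):
        xi, yi, ci, ui = updates[i]
        xj, yj, cj, uj = updates[j]
        near = max(abs(xi - xj), abs(yi - yj)) <= radius
        if near and (ci == cj or ui == uj):
            parent[find(i)] = find(j)

    clusters = {}
    for i, (x, y, _, _) in enumerate(updates):
        clusters.setdefault(find(i), set()).add((x, y))
    return list(clusters.values())


def greedy_cover(target, candidates):
    """Stage 2 (sketch): greedy set cover.

    target: set of pixels of a cluster in a later snapshot.
    candidates: list of pixel sets from an earlier snapshot.
    Returns indices of candidates chosen to cover `target`, picking at
    each step the candidate that covers the most uncovered pixels.
    """
    uncovered = set(target)
    chosen = []
    while uncovered and candidates:
        best = max(range(len(candidates)),
                   key=lambda k: len(candidates[k] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            break  # the remaining pixels appear in no candidate
        chosen.append(best)
        uncovered -= gain
    return chosen


snapshot = [
    (0, 0, "red", "a"),
    (1, 0, "red", "b"),   # adjacent to (0, 0), same color
    (1, 1, "blue", "b"),  # adjacent to (1, 0), same user
    (5, 5, "red", "c"),   # far from everything else
]
clusters = cluster_snapshot(snapshot)
merged = greedy_cover(clusters[0], [{(0, 0)}, {(0, 0), (1, 0)}, {(1, 1), (9, 9)}])
```

The pairwise loop in `cluster_snapshot` is quadratic in the number of updates, so a spatial index would be needed at r/place scale. The sketch only shows how proximity, color, and user signals can be combined into one graph and how clusters from one snapshot can be matched against clusters from another.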
With this method, the researchers report higher labeling accuracy and a shorter running time than an existing baseline. They also applied the algorithm to the 2017 edition of r/place and identified key factors that distinguish successful from failed artworks in terms of user engagement, collaboration, and competition.
In conclusion, this paper addresses the automatic labeling of large-scale user activity with a new dynamic graph clustering algorithm and uses the resulting labels to analyze what drives success and failure in online collective behavior.