Scott Golder, Bernardo A. Huberman
Abstract: Collaborative tagging describes the process by which many users add metadata in the form of keywords to shared content. Recently, collaborative tagging has grown in popularity on the web, on sites that allow users to tag bookmarks, photographs and other content. In this paper we analyze the structure of collaborative tagging systems as well as their dynamical aspects. Specifically, we discovered regularities in user activity, tag frequencies, kinds of tags used, bursts of popularity in bookmarking and a remarkable stability in the relative proportions of tags within a given url. We also present a dynamical model of collaborative tagging that predicts these stable patterns and relates them to imitation and shared knowledge.
What problem does this paper attempt to address?
This paper aims to understand the structure of collaborative tagging systems and their dynamic characteristics. Specifically, by analyzing user activity, tag frequencies, the kinds of tags used, and bursts of popularity in bookmarking, the authors reveal regularities in these systems and relate them to imitation and shared knowledge.
### Main problems
1. **Understanding the structure of collaborative tagging systems**:
- Collaborative tagging systems allow users to add metadata, in the form of keywords, to shared content. Such systems are increasingly popular on the web, especially on sites that let users tag bookmarks, photographs, and other content.
- The authors analyzed the structure of Delicious, a collaborative tagging system, and found regularities in user activity, tag frequencies, and the kinds of tags used.
2. **Exploring the dynamic characteristics of collaborative tagging systems**:
- The authors study not only the static structure but also how the system changes over time, including bursts of popularity in bookmarking and the stability of tag proportions.
- They found that, although the number of users keeps growing and individual users choose different tags, the relative proportions of tags on a given URL become remarkably stable over time.
3. **Proposing a dynamic model**:
- The authors propose a dynamic model that predicts these stable patterns and relates them to imitation and shared knowledge. This helps explain why the tags on a given URL settle into consistent proportions even as more users tag it.
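
To make the imitation idea concrete, here is a minimal sketch of a Polya-style urn process, in which each new tag on a URL is a copy of a uniformly chosen earlier tag. This is offered as an illustration under stated assumptions, not as the paper's exact model: the function names `simulate_urn` and `proportions`, the two starting tags, and the step count are all hypothetical.

```python
import random
from collections import Counter


def simulate_urn(initial_tags, steps, seed=0):
    """Grow a tag history by imitation: each step copies a random earlier tag."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    urn = list(initial_tags)
    for _ in range(steps):
        # Picking uniformly from all tags so far means popular tags
        # are proportionally more likely to be chosen again.
        urn.append(rng.choice(urn))
    return urn


def proportions(tags):
    """Return each tag's share of all tag applications in `tags`."""
    counts = Counter(tags)
    total = len(tags)
    return {tag: count / total for tag, count in counts.items()}


# Hypothetical URL that starts with one "web" tag and one "design" tag.
history = simulate_urn(["web", "design"], steps=5000)
early = proportions(history[:2500])  # shares partway through the history
late = proportions(history)          # shares at the end

for tag in sorted(late):
    print(tag, round(early.get(tag, 0.0), 3), round(late[tag], 3))
```

Comparing the `early` and `late` shares for different seeds gives a feel for how proportions in such a process tend to settle once enough tags have accumulated, even though the value they settle at differs from run to run.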
### Formula representation
Although the paper does not rely directly on complex mathematical formulas, a few simple statistical expressions can be introduced to express its key points:
- **Tag frequency distribution**: Assume that the tag frequency distribution can be described by a power-law distribution:
\[
P(x) \sim x^{-\alpha}
\]
where \(P(x)\) is the probability that a tag appears with frequency \(x\), and \(\alpha\) is the power-law exponent.
- **Tag proportion stability**: For a given URL, the stability of the tag proportion can be represented by the relative frequency \(f_i\):
\[
f_i=\frac{n_i}{N}
\]
where \(n_i\) is the number of times tag \(i\) has been applied to the URL, and \(N\) is the total number of tag applications on that URL.
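
The two expressions above can be sketched in a few lines of Python. The example below is illustrative only: the tag list and the synthetic \(P(x)\) values are made up, the function names are hypothetical, and the log-log least-squares fit is a rough diagnostic rather than the method used in the paper (maximum-likelihood estimators are generally preferred for real data).

```python
import math
from collections import Counter


def relative_frequencies(tags):
    """Compute f_i = n_i / N for every tag i applied to one URL."""
    counts = Counter(tags)
    total = sum(counts.values())  # N: all tag applications on the URL
    return {tag: n / total for tag, n in counts.items()}


def fit_power_law_exponent(xs, ps):
    """Estimate alpha in P(x) ~ x^(-alpha) from the slope of log P vs. log x."""
    log_x = [math.log(x) for x in xs]
    log_p = [math.log(p) for p in ps]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_p = sum(log_p) / n
    cov = sum((a - mean_x) * (b - mean_p) for a, b in zip(log_x, log_p))
    var = sum((a - mean_x) ** 2 for a in log_x)
    return -cov / var  # the slope is -alpha, so negate it


# Made-up tags for a single URL.
url_tags = ["python", "tutorial", "python", "code", "python", "tutorial"]
freqs = relative_frequencies(url_tags)
print(freqs)

# Synthetic data that follows x^(-2) exactly, so the fit should recover 2.
xs = [1, 2, 4, 8, 16]
ps = [x ** -2.0 for x in xs]
alpha = fit_power_law_exponent(xs, ps)
print(round(alpha, 6))
```

Tracking `relative_frequencies` over successive snapshots of a URL's tag history is one simple way to check whether the proportions have stabilized.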
### Conclusion
Through these analyses, the authors aim to better understand how collaborative tagging systems work and to provide a theoretical basis for future research. They also discuss potential uses of this data, such as applying user-generated tags to improve information retrieval and classification systems.