An Empirical Comparison of Topics in Twitter and Traditional Media

Wayne Xin Zhao, Jing Jiang
2011-01-01
Abstract: Twitter, as a new form of social media, can potentially contain much useful information, but content analysis on Twitter has not been well studied. In particular, it is not clear whether, as an information source, Twitter can simply be regarded as a faster news feed that covers mostly the same information as traditional news media. In this paper we empirically compare the content of Twitter with that of a traditional news medium, the New York Times, using unsupervised topic modeling. We use a Twitter-LDA model to discover topics from a representative sample of all of Twitter. We then use text mining techniques to compare these Twitter topics with topics from the New York Times, taking topic categories and types into consideration. We find that although Twitter and the New York Times cover similar categories and types of topics, the distributions of topic categories and types are quite different. Furthermore, there are Twitter-specific topics and NYT-specific topics, and they tend to belong to certain topic categories and types. We also study how the proportions of opinionated tweets and retweets relate to topic categories and types, and find some interesting dependencies. To the best of our knowledge, ours is the first comprehensive empirical comparison between Twitter and traditional news media.