Word Storms: Multiples of Word Clouds for Visual Comparison of Documents

Quim Castella, Charles Sutton
DOI: https://doi.org/10.48550/arXiv.1301.0503
2013-01-04
Abstract: Word clouds are a popular tool for visualizing documents, but they are not a good tool for comparing documents, because identical words are not presented consistently across different clouds. We introduce the concept of word storms, a visualization tool for analysing corpora of documents. A word storm is a group of word clouds, in which each cloud represents a single document, juxtaposed to allow the viewer to compare and contrast the documents. We present a novel algorithm that creates a coordinated word storm, in which words that appear in multiple documents are placed in the same location, using the same color and orientation, in all of the corresponding clouds. In this way, similar documents are represented by similar-looking word clouds, making them easier to compare and contrast visually. We evaluate the algorithm in two ways: first, an automatic evaluation based on document classification; and second, a user study. The results confirm that, unlike standard word clouds, a coordinated word storm allows for better visual comparison of documents.
Information Retrieval, Digital Libraries, Human-Computer Interaction
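
The coordination idea in the abstract (a shared word gets the same location, color, and orientation in every cloud) can be illustrated with a small Python sketch. This is not the authors' algorithm, which the abstract does not describe in detail. It is a simplified stand-in that assumes each word's style can be derived from a hash of the word itself, and it ignores overlap between words. The function names `shared_style` and `build_storm`, the canvas size, and the font-size formula are all hypothetical.

```python
import hashlib


def shared_style(word, width=800, height=600):
    """Derive position, color, and orientation from the word alone.

    Assumption: hashing stands in for a real layout algorithm, so the
    same word always maps to the same style regardless of document.
    """
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    x = int.from_bytes(digest[0:2], "big") % width
    y = int.from_bytes(digest[2:4], "big") % height
    color = "#{:02x}{:02x}{:02x}".format(digest[4], digest[5], digest[6])
    angle = 90 if digest[7] % 2 else 0  # vertical or horizontal
    return {"x": x, "y": y, "color": color, "angle": angle}


def build_storm(documents):
    """Build one cloud per document from word-frequency dictionaries.

    documents maps a document name to a {word: frequency} dict.
    Only the font size depends on the document; everything else is shared.
    """
    storm = {}
    for name, freqs in documents.items():
        cloud = {}
        for word, count in freqs.items():
            entry = shared_style(word)
            entry["size"] = 10 + 4 * count  # hypothetical size scale
            cloud[word] = entry
        storm[name] = cloud
    return storm


if __name__ == "__main__":
    docs = {
        "doc_a": {"cloud": 3, "word": 1},
        "doc_b": {"cloud": 1, "storm": 2},
    }
    for name, cloud in build_storm(docs).items():
        print(name, cloud)
```

In this sketch, "cloud" lands at the same coordinates with the same color and angle in both documents, and only its size differs. A real layout would also need to resolve collisions between words, which hashing alone does not do.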