Abstract: Clustering is a powerful unsupervised tool for sentiment analysis of text. However, clustering results can be affected by every step of the clustering process, including the data pre-processing strategy, the term weighting method used in the Vector Space Model, and the clustering algorithm itself. This paper presents an experimental study of several common clustering techniques applied to sentiment analysis. Unlike previous studies, we investigate the combined effects of these factors through a comprehensive series of experiments. The results indicate that, first, in terms of clustering accuracy, K-means-type clustering algorithms show clear advantages on balanced review datasets but perform rather poorly on unbalanced datasets. Second, the more recently designed weighting models outperform the traditional weighting models for sentiment clustering on both balanced and unbalanced datasets. Furthermore, an adjective and adverb extraction strategy yields obvious improvements in clustering performance, whereas stemming and stopword removal have a negative effect on sentiment clustering. These results should be valuable for both the study and the practical use of clustering methods in online review sentiment analysis.
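
To make the pipeline named in the abstract concrete (pre-processing, term weighting in a Vector Space Model, clustering, and evaluation by clustering accuracy), the following is a minimal sketch in plain Python. It is not the authors' implementation. Several things here are assumptions made only for illustration: TF-IDF stands in for a traditional weighting model, the four toy reviews are invented, the initial centroids are fixed to the first and third documents, and the function names (`tfidf_vectors`, `kmeans`, `clustering_accuracy`) are hypothetical.

```python
import math
from collections import Counter
from itertools import permutations


def tfidf_vectors(docs):
    """Return L2-normalised TF-IDF vectors for a list of tokenised documents."""
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption; many variants exist).
    idf = {term: math.log(n_docs / doc_freq[term]) + 1.0 for term in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = [tf[term] * idf[term] for term in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def kmeans(vectors, init_indices, max_iter=20):
    """Plain K-means with squared Euclidean distance and fixed initial centroids."""
    centroids = [list(vectors[i]) for i in init_indices]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(vec, centroids[c])))
            for vec in vectors
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def clustering_accuracy(true_labels, cluster_labels):
    """Best accuracy over one-to-one mappings of cluster ids to classes.

    Assumes there are no more clusters than classes.
    """
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_labels))
    best_hits = 0
    for perm in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, perm))
        hits = sum(mapping[c] == t for c, t in zip(cluster_labels, true_labels))
        best_hits = max(best_hits, hits)
    return best_hits / len(true_labels)


# Invented toy reviews; whitespace tokenisation is the only pre-processing.
reviews = [
    "great wonderful excellent movie",
    "wonderful excellent acting great",
    "terrible awful boring movie",
    "awful boring terrible plot",
]
truth = ["pos", "pos", "neg", "neg"]

vectors = tfidf_vectors([r.split() for r in reviews])
labels = kmeans(vectors, init_indices=[0, 2])
accuracy = clustering_accuracy(truth, labels)
print(labels, accuracy)
```

Each factor the abstract studies corresponds to one replaceable piece of this sketch: the tokenisation step (for example, adding stemming, stopword removal, or adjective and adverb extraction), the weighting function, and the clustering routine. Comparing combinations amounts to swapping these pieces and recomputing the accuracy.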

The Comparison of SOM and K-means for Text Clustering

A Method of Data Mining Based on SOM Clustering and Its Application

Research of fast SOM clustering for text information

Text Clustering on Oral Conversation Corpus

A Comparative Analysis of an Extended SOM Network and K-means Analysis

An optimized k-means algorithm of reducing cluster intra-dissimilarity for document clustering

Comparison study of using semantic and syntactic network characteristics to do text clustering

V-SOM: A text clustering method based on dynamic SOM model

A Comparative Study on Text Clustering Methods

A Comparative Study of Clustering Methods for Molecular Data

ConSOM: A conceptional self-organizing map model for text clustering

Research of Clustering Algorithms Based on Text Mining

Research on K-means Text Clustering Algorithm Based on Semantic

Text Clustering Based on Feature Space

A Text Clustering System Based on K-Means Type Subspace Clustering and Ontology

A Comparative Study Of Ontology Based Term Similarity Measures On Pubmed Document Clustering

Weighted K-Means Algorithm Based Text Clustering

Dynamic and Adaptive Self Organizing Maps Applied to High Dimensional Large Scale Text Clustering

How to Perform Incremental Clustering - A SOM Based View

A Particle Swarm Optimization K-Means Algorithm for Mongolian Elements Clustering

Exploring Performance of Clustering Methods on Document Sentiment Analysis