Text sentiment classification of Amazon reviews using word embeddings and convolutional neural networks

Mohammed Qorich, Rajae El Ouazzani
DOI: https://doi.org/10.1007/s11227-023-05094-6
IF: 3.3
2023-02-18
The Journal of Supercomputing
Abstract: The internet and social media have made it easy to review products. Consumers share their thoughts, opinions, and experiences about products and services on forums, websites, and mobile apps, and online reviews now guide many people's decisions before they buy. Text sentiment analysis extracts insights and sentiments from social texts and consumer reviews. Organizations therefore conduct this analysis to better understand their customers' attitudes and feedback toward their products, and many researchers are likewise interested in analyzing customer reviews by labeling them with a set of sentiments using text classification algorithms. This paper proposes a convolutional neural network (CNN) model that classifies the sentiment of text reviews as negative or positive. We also compare our proposed CNN model across several word embedding representations to identify the most efficient configuration. The experiments are carried out on the Amazon reviews dataset, and the various model designs achieve satisfactory performance. The results highlight the importance of keeping stop words in sentiment analysis tasks: removing them can lead to inaccurate sentiment predictions. In practice, the CNN model that keeps stop words improves accuracy by 2% over the CNN model that ignores them. Furthermore, we find that random initialization of the embeddings performs better than vectors from supervised and embedding models on large-scale datasets; training the word embedding representation allows the model to learn better features in less computation time. Moreover, our CNN model outperforms the baseline machine learning and deep learning methods and raises the accuracy of the CNN to 90% on the Amazon reviews dataset.
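The stop-word finding reported in the abstract can be illustrated with a minimal sketch. It is not the authors' preprocessing pipeline: the stop-word list, the `tokenize` helper, and the two sample reviews are hypothetical and chosen only for illustration. The sketch assumes a stop-word list that contains negations such as "not", which is the case for several widely used lists (for example, NLTK's English list).

```python
# Hypothetical stop-word list, for illustration only. Many standard lists
# (for example NLTK's English list) also contain negations such as "not".
STOP_WORDS = {"this", "is", "the", "a", "not", "no", "at", "all"}


def tokenize(text, remove_stop_words=False):
    """Lowercase, split on whitespace, strip basic punctuation,
    and optionally drop tokens found in STOP_WORDS."""
    tokens = [t.strip(".,!?") for t in text.lower().split()]
    tokens = [t for t in tokens if t]
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens


# Two made-up reviews with opposite sentiment.
negative_review = "This product is not good at all"
positive_review = "This product is good"

with_stops_neg = tokenize(negative_review)
without_stops_neg = tokenize(negative_review, remove_stop_words=True)
without_stops_pos = tokenize(positive_review, remove_stop_words=True)

print(with_stops_neg)     # the negation "not" is still present
print(without_stops_neg)  # the negation has been removed
print(without_stops_pos)  # same tokens as the negative review above
```

Under this assumed list, both reviews reduce to the same token sequence once stop words are removed, so any classifier that sees only those tokens cannot tell the negative review from the positive one. Keeping stop words preserves the negation for the model to learn from.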
computer science, theory & methods; engineering, electrical & electronic; hardware & architecture