Arabic Opinion Mining Using Distributed Representations of Documents

Alaa M. El-Halees
DOI: https://doi.org/10.1109/picict.2017.15
2017-05-01
Abstract: Nowadays, many people express their opinions through user-generated content such as social media, forums, and reviews. Opinion mining is a field of study that extracts sentiments from user-generated content. Because of the complexity of the Arabic language, extracting those opinions is challenging. A better representation of reviews can help improve opinion extraction. The traditional way of representing opinion documents is Bag-of-Words (BOW), which represents text as a fixed-length vector. The problem with this representation is that it loses word order, ignores grammatical structure, and is lexicon-dependent. To overcome these limitations, distributed representations can be employed. They are based on learning vector representations of words, also called “word embeddings”, which can improve the performance of natural language processing tasks when combined with learning algorithms. This representation uses neural networks, and the learned vectors explicitly encode many linguistic patterns. In this study, we used distributed representations for Arabic opinion mining and compared them with the BOW representation. We applied both to four benchmark datasets. Then we used four machine learning methods, including Support Vector Machine, Logistic Regression, and Random Forest. Using the F-measure metric, we found that, across all datasets and all methods in our experiment, the distributed representations performed better than the bag-of-words representation.
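The word-order limitation of Bag-of-Words mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's code or data: the two English reviews and the helper `bag_of_words` are hypothetical, chosen only to show that two texts with opposite sentiment can map to the same fixed-length count vector.

```python
from collections import Counter


def bag_of_words(tokens, vocabulary):
    """Return a fixed-length count vector with one entry per vocabulary word."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


# Hypothetical reviews (not from the paper's datasets): same words, opposite meaning.
review_a = "the food was good not bad".split()
review_b = "the food was bad not good".split()

# Shared vocabulary, sorted so that vector positions are deterministic.
vocabulary = sorted(set(review_a) | set(review_b))

vector_a = bag_of_words(review_a, vocabulary)
vector_b = bag_of_words(review_b, vocabulary)

print(vocabulary)
print(vector_a == vector_b)  # prints True: word order is lost
```

Because both reviews produce identical vectors, no classifier trained on these features can tell them apart. This is the kind of information that a representation learned from word context is meant to retain.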