Basic tasks of sentiment analysis

Iti Chaturvedi, Soujanya Poria, Erik Cambria
DOI: https://doi.org/10.1007/978-1-4614-7163-9_110159-1
2017-10-18
Abstract: Subjectivity detection is the task of identifying objective and subjective sentences. Objective sentences are those which do not exhibit any sentiment. So, it is desired for a sentiment analysis engine to find and separate the objective sentences for further analysis, e.g., polarity detection. In subjective sentences, opinions can often be expressed on one or multiple topics. Aspect extraction is a subtask of sentiment analysis that consists in identifying opinion targets in opinionated text, i.e., in detecting the specific aspects of a product or service the opinion holder is either praising or complaining about.
Computation and Language
What problem does this paper attempt to address?
This paper aims to improve the accuracy and coverage of sentiment analysis, especially for online reviews and social media data. The authors focus on two key tasks:

1. **Subjectivity Detection**:
   - Identify the subjective and objective sentences in a text. Objective sentences carry no sentiment, so they are filtered out to make the subsequent polarity classification more accurate.
   - Subjective sentences usually contain opinions on one or more topics, so the useful information has to be extracted from them.
2. **Aspect Extraction**:
   - Identify the specific aspects of the opinion target, such as a particular feature of a product. For example, in the mobile phone review "The screen is very good and the resolution is also very high", the opinion targets are "screen" and "resolution".
   - This makes it possible to assign a sentiment polarity to each feature instead of reporting a single averaged polarity for the whole review.

### Core Problems of the Paper

- **How to Effectively Filter Non-opinion Information**: A large amount of factual, non-opinionated text has to be filtered out so that only sentiment-bearing text is analyzed.
- **How to Handle Opinions on Different Aspects**: Users often comment on several aspects of a product or service rather than giving one overall evaluation.

### Solutions

To address these problems, the authors propose the following methods:

- **Deep Convolutional Neural Networks (CNNs)**: CNNs capture local features within a sentence and can be combined with temporal models, such as dynamic Gaussian Bayesian networks, to model the dependencies between sentences.
- **Combination of Linguistic Features and Deep-learning Features**: Traditional linguistic features (such as the bag-of-words model and dependency relations) are combined with deep-learning models to improve sentiment detection.
- **Latent Dirichlet Allocation (LDA)**: Used to extract and group aspects. It models the semantic distribution of documents by introducing a latent "topic" variable.

### Formula Representation

Some formulas involved in the paper include:

- **Conditional Probability**:
  \[ p(x_i(t)) = P(x_i(t) | (x_1(t), x_2(t), \ldots, x_{i-1}(t)), (s(1), s(2), \ldots, s(t-1))) \]
  This is the probability of the word \(x_i(t)\) given the preceding words of the same sentence and the preceding sentences \(s(1), \ldots, s(t-1)\).
- **Joint Probability of Gaussian Bayesian Network**:
  \[ p(X | S, \theta) = \prod_{i=1}^{N} p(x_i | a_i, \theta_{i, a_i}) \]
  where \(p(x_i | a_i, \theta_{i, a_i})\) is the conditional probability of node \(x_i\) given its parents \(a_i\) in the structure \(S\) and the parameters \(\theta_{i, a_i}\).
- **Convolution Operation**:
  \[ c_j = w_k^T \cdot x_{i:i+k-1} \]
  This is the dot product of the convolution kernel \(w_k\) with a k-gram of the sentence \(s(t)\), computed for every k-gram in the sentence.

With these methods, the paper aims to improve the accuracy and robustness of sentiment analysis, especially on complex reviews that discuss multiple aspects.
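The convolution formula above can be illustrated with a short sketch. It is not the authors' implementation: the function name `kgram_convolution`, the toy two-dimensional word vectors, and the all-ones kernel are assumptions made for illustration only. Each k-gram is formed by concatenating the vectors of k consecutive words, and the kernel is applied to it as a dot product.

```python
from typing import List, Sequence


def kgram_convolution(
    sentence: Sequence[Sequence[float]],
    kernel: Sequence[float],
    k: int,
) -> List[float]:
    """Apply one kernel to every k-gram of a sentence (hypothetical helper).

    sentence: one d-dimensional vector per word.
    kernel:   flat weight vector of length k * d.
    Returns one feature per k-gram, i.e. len(sentence) - k + 1 values.
    """
    n_words = len(sentence)
    if n_words < k:
        raise ValueError("sentence is shorter than the kernel width k")
    d = len(sentence[0])
    if len(kernel) != k * d:
        raise ValueError("kernel must have k * d weights")

    features = []
    for i in range(n_words - k + 1):
        # Concatenate the vectors of words i .. i+k-1 into one k-gram vector.
        window = [value for word in sentence[i:i + k] for value in word]
        # Dot product of the kernel with the k-gram vector.
        features.append(sum(w * x for w, x in zip(kernel, window)))
    return features


# Toy input: 4 words with 2-dimensional vectors (made-up values).
toy_sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
toy_kernel = [1.0, 1.0, 1.0, 1.0]  # k = 2, d = 2

print(kgram_convolution(toy_sentence, toy_kernel, k=2))  # [2.0, 3.0, 4.0]
```

A sentence of n words yields n - k + 1 features per kernel. A full CNN would typically learn many such kernels and add a bias, a nonlinearity, and pooling, none of which are shown here.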