Mengting Hu, Shiwan Zhao, Honglei Guo, Chao Xue, Hang Gao, Tiegang Gao, Renhong Cheng, Zhong Su
Abstract: Aspect category detection (ACD) in sentiment analysis aims to identify the aspect categories mentioned in a sentence. In this paper, we formulate ACD in the few-shot learning scenario. However, existing few-shot learning approaches mainly focus on single-label predictions. These methods cannot work well for the ACD task since a sentence may contain multiple aspect categories. Therefore, we propose a multi-label few-shot learning method based on the prototypical network. To alleviate the noise, we design two effective attention mechanisms. The support-set attention aims to extract better prototypes by removing irrelevant aspects. The query-set attention computes multiple prototype-specific representations for each query instance, which are then used to compute accurate distances with the corresponding prototypes. To achieve multi-label inference, we further learn a dynamic threshold per instance by a policy network. Extensive experimental results on three datasets demonstrate that the proposed method significantly outperforms strong baselines.
What problem does this paper attempt to address?
This paper addresses aspect category detection (ACD) in sentiment analysis when only a small amount of labeled data is available, that is, identifying which aspect categories a sentence mentions. Specifically, it focuses on the multi-label few-shot setting, in which a sentence may contain multiple aspect categories that must be identified from a limited number of training instances.
### Background and Challenges
- **Limitations of Existing Methods**: Existing ACD methods rely mainly on large-scale labeled datasets, but labeling large amounts of data in practice is time-consuming and labor-intensive. Even in a large dataset, many long-tail aspect categories still suffer from data sparsity.
- **Advantages of Few-Shot Learning (FSL)**: FSL offers a solution: by using prior knowledge, it can recognize new categories from only a small amount of labeled data. However, most FSL methods target single-label classification and are not effective for the multi-label ACD task.
- **Noise Problem**: A sentence in the support set may contain multiple aspect categories, so aspects irrelevant to the target category act as noise and make it difficult to learn good prototypes. Sentences in the query set may contain the same kind of noise, which further increases the difficulty of the task.
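
The noise problem can be made concrete with a toy example. The following is a minimal sketch, not taken from the paper: it assumes hypothetical 2-D "aspect directions" (`food`, `service`) and a plain prototypical network that averages support embeddings. Under those assumptions, one multi-aspect support sentence is enough to pull the "food" prototype toward the unrelated "service" direction.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 2-D "aspect directions", chosen only for illustration.
food = np.array([1.0, 0.0])
service = np.array([0.0, 1.0])

# Support set for the class "food": the second sentence also mentions
# service, so its embedding is a mix of both directions.
support = np.stack([food, 0.5 * food + 0.5 * service])

# A plain prototypical network averages the support embeddings.
prototype = support.mean(axis=0)  # [0.75, 0.25]

print("similarity to food:   ", round(cosine(prototype, food), 3))
print("similarity to service:", round(cosine(prototype, service), 3))
```

The prototype is no longer aligned with the pure "food" direction, which is the kind of contamination the attention mechanisms described in the contributions are meant to reduce.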
### Main Contributions of the Paper
1. **Proposing a Multi-label Few-shot Learning Method**: The paper formulates ACD as a multi-label few-shot learning problem for the first time and designs a new method based on the prototypical network to solve it.
2. **Designing Two Attention Mechanisms**:
   - **Support-set Attention (SA)**: Removes irrelevant aspects and extracts the aspect shared by the support sentences of each category, which yields better prototypes.
   - **Query-set Attention (QA)**: Uses the prototypes to compute multiple prototype-specific representations of each query instance, removing irrelevant aspects so that distances to the corresponding prototypes are more accurate.
3. **Dynamic Threshold Selection**: A policy network learns a dynamic threshold for each instance, and the aspect categories that pass this threshold are predicted as positive.
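
The three components above can be sketched in a few lines of NumPy. This is an illustrative simplification under stated assumptions, not the paper's implementation: the function names are hypothetical, the attention scores are plain dot products, distances are squared Euclidean, and the threshold is passed in as a fixed number, whereas the paper learns it per instance with a policy network.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def support_prototype(support_words):
    """Simplified stand-in for support-set attention (assumption).

    support_words: (K, T, D) word vectors of the K support sentences
    of one class. Words close to the direction shared by all K
    sentences get more weight before averaging.
    """
    common = support_words.mean(axis=(0, 1))             # (D,)
    weights = softmax(support_words @ common, axis=1)    # (K, T)
    sentences = (weights[..., None] * support_words).sum(axis=1)  # (K, D)
    return sentences.mean(axis=0)                        # (D,)

def query_representations(query_words, prototypes):
    """Simplified stand-in for query-set attention (assumption).

    query_words: (T, D); prototypes: (N, D).
    Returns one query vector per prototype, shape (N, D).
    """
    weights = softmax(prototypes @ query_words.T, axis=1)  # (N, T)
    return weights @ query_words                           # (N, D)

def predict(query_words, prototypes, threshold):
    """Multi-label prediction: keep every class whose probability
    exceeds the threshold (fixed here, learned in the paper)."""
    reps = query_representations(query_words, prototypes)
    dist = ((reps - prototypes) ** 2).sum(axis=1)          # (N,)
    probs = softmax(-dist)                                 # closer = likelier
    return probs, probs > threshold

# Usage with random data: N classes, K shots, T words, D dimensions.
rng = np.random.default_rng(0)
N, K, T, D = 3, 2, 5, 4
prototypes = np.stack(
    [support_prototype(rng.normal(size=(K, T, D))) for _ in range(N)]
)
probs, labels = predict(rng.normal(size=(T, D)), prototypes, threshold=0.3)
print(probs, labels)
```

Because each prototype attends to the query separately, the same query sentence yields a different representation per class, and more than one class can exceed the threshold, which is what makes the inference multi-label.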
### Experimental Results
- **Performance Improvement**: The experimental results show that the proposed method significantly outperforms strong baseline methods on three datasets.
- **Robustness**: On datasets of multi-aspect sentences, the proposed method continues to perform well, which indicates robustness to different data distributions.
- **Ablation Study**: Ablation experiments verify the contribution of each module, in particular the role of the support-set attention and query-set attention modules in reducing noise and improving performance.
In conclusion, by combining a multi-label few-shot learning framework with two attention mechanisms and a learned per-instance threshold, the paper addresses the problem of detecting multiple aspect categories in a sentence from limited labeled data.