MambaForGCN: Enhancing Long-Range Dependency with State Space Model and Kolmogorov-Arnold Networks for Aspect-Based Sentiment Analysis

Adamu Lawan, Juhua Pu, Haruna Yunusa, Aliyu Umar, Muhammad Lawan
2024-10-30
Abstract: Aspect-based Sentiment Analysis (ABSA) evaluates sentiments toward specific aspects of entities within the text. However, attention mechanisms and neural network models struggle with syntactic constraints. The quadratic complexity of attention mechanisms also limits their adoption for capturing long-range dependencies between aspect and opinion words in ABSA. This complexity can lead to the misinterpretation of irrelevant contextual words, restricting their effectiveness to short-range dependencies. To address the above problem, we present a novel approach to enhance long-range dependencies between aspect and opinion words in ABSA (MambaForGCN). This approach incorporates syntax-based Graph Convolutional Network (SynGCN) and MambaFormer (Mamba-Transformer) modules to encode input with dependency relations and semantic information. The Multihead Attention (MHA) and Selective State Space model (Mamba) blocks in the MambaFormer module serve as channels to enhance the model with short and long-range dependencies between aspect and opinion words. We also introduce the Kolmogorov-Arnold Networks (KANs) gated fusion, an adaptive feature representation system that integrates SynGCN and MambaFormer and captures non-linear, complex dependencies. Experimental results on three benchmark datasets demonstrate MambaForGCN's effectiveness, outperforming state-of-the-art (SOTA) baseline models.
Computation and Language
What problem does this paper attempt to address?
This paper addresses the difficulty that existing attention mechanisms and neural network models have with long-distance dependencies in aspect-based sentiment analysis (ABSA). Specifically, these models struggle to link an aspect word to an opinion word that appears far from it in the sentence. The paper attributes this to the quadratic complexity of the attention mechanism, which it argues leads to misinterpretation of irrelevant context words and confines the models' effectiveness mainly to short-distance dependencies. The paper also identifies a second weakness: these models do not combine syntactic and semantic information effectively, which limits how well the two kinds of information can be integrated.
To address these problems, the paper proposes MambaForGCN, a method intended to strengthen long-distance dependencies between aspect words and opinion words in ABSA. A syntax-based graph convolutional network (SynGCN) module encodes the dependency relations in the input, and a MambaFormer module enriches the model with semantic information. Within MambaFormer, the multi-head attention (MHA) and Mamba blocks serve as channels for capturing short- and long-distance dependencies between aspect words and opinion words. The paper also introduces a Kolmogorov-Arnold Networks (KANs) gated fusion mechanism, an adaptive feature representation system that combines the SynGCN and MambaFormer representations to capture nonlinear, complex dependencies.
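The gated fusion step can be illustrated with a minimal sketch. The summary above does not give the paper's equations, so this is not the authors' implementation: it swaps the KAN-based gate for a plain per-dimension sigmoid gate, and the names `gated_fusion`, `h_syn`, `h_sem`, `w_syn`, `w_sem`, and `bias` are hypothetical. It only shows the general idea of letting a learned gate decide, dimension by dimension, how much of the syntactic (SynGCN-style) and semantic (MambaFormer-style) representation to keep.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function mapping any real score to a gate value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def gated_fusion(h_syn, h_sem, w_syn, w_sem, bias):
    """Blend two same-length feature vectors with an elementwise gate.

    Assumption: a simple sigmoid gate stands in for the KAN-based gate
    described in the paper.

    gate_i = sigmoid(w_syn[i] * h_syn[i] + w_sem[i] * h_sem[i] + bias[i])
    out_i  = gate_i * h_syn[i] + (1 - gate_i) * h_sem[i]
    """
    sizes = {len(h_syn), len(h_sem), len(w_syn), len(w_sem), len(bias)}
    if len(sizes) != 1:
        raise ValueError("all inputs must have the same length")

    fused = []
    for s, m, ws, wm, b in zip(h_syn, h_sem, w_syn, w_sem, bias):
        gate = sigmoid(ws * s + wm * m + b)
        # A gate near 1 keeps the syntactic feature, near 0 the semantic one.
        fused.append(gate * s + (1.0 - gate) * m)
    return fused
```

With all gate parameters set to zero, every gate equals 0.5 and the result is the elementwise average of the two inputs. A trained gate would shift that balance per dimension.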
In summary, the paper's main contributions are threefold. First, it introduces the selective state space model into ABSA for the first time, enhancing the model's ability to capture long-distance dependencies. Second, it uses KANs to capture complex dependencies in the text, enabling MambaForGCN to handle nonlinear, high-dimensional interactions. Third, experimental results show that MambaForGCN outperforms some of the state-of-the-art baseline models on three benchmark datasets.
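To make the long-distance claim more concrete, the following toy sketch shows the kind of recurrence a selective state space model is built on. It is a scalar simplification under stated assumptions, not the Mamba block used in the paper: the function names, the fixed decay parameter `a`, and the weight `w_delta` are all hypothetical. The point is that a single state value is carried across the whole sequence in one linear-time pass, and the input-dependent step size controls how much each token overwrites that state.

```python
import math


def softplus(z: float) -> float:
    """Smooth positive function log(1 + e^z), used to keep the step size > 0."""
    return math.log1p(math.exp(z))


def selective_scan(xs, a=-1.0, w_delta=1.0):
    """Run a toy scalar selective state-space recurrence over a sequence.

    Assumption: scalar state, fixed negative `a`, and unit input/output
    projections; a real Mamba block uses learned, vector-valued versions.

    delta_t = softplus(w_delta * x_t)   # input-dependent step size
    h_t     = exp(delta_t * a) * h_{t-1} + delta_t * x_t
    y_t     = h_t
    """
    if a >= 0:
        raise ValueError("a must be negative so the state decays")

    h = 0.0
    ys = []
    for x in xs:  # one pass: cost grows linearly with sequence length
        delta = softplus(w_delta * x)
        h = math.exp(delta * a) * h + delta * x
        ys.append(h)
    return ys
```

Feeding a single non-zero token followed by zeros shows the first token's contribution still present, though decayed, several steps later.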