CollabAS2: Enhancing Arabic Answer Sentence Selection Using Transformer-Based Collaborative Models

Asma Aouichat, Ahmed Guessoum
DOI: https://doi.org/10.1007/s13369-024-09345-3
IF: 2.807
2024-09-02
Arabian Journal for Science and Engineering
Abstract: Accurately identifying the text segments that answer a question is crucial for question-answering systems, which makes the precision of the Answer Sentence Selection (AS2) module pivotal. This study introduces an AS2 module design that uses the AraBERT transformer to encode two inputs, one for the question and one for the candidate answer, with the goal of enhancing comprehension of both. Each encoded input is then processed in parallel by a collaborative layer employing two distinct deep learning models: a bidirectional long short-term memory (BiLSTM) network and a convolutional neural network (CNN). This collaborative approach forms the AraBERT.Collab-BiLSTM/CNN model. Extensions to the study include AraBERT.Collab-BiLSTM/AVG, which incorporates a BiLSTM and AVG collaboration layer, as well as the use of the AraELECTRA pre-trained model, yielding the AraELECTRA.Collab-BiLSTM/CNN and AraELECTRA.Collab-BiLSTM/AVG configurations. The study also investigates Arabic word embedding models as alternatives to pre-trained models, resulting in the WordEmb.Collab-BiLSTM/CNN and WordEmb.Collab-BiLSTM/AVG models. Experimental results on our BARAQA (Big-ARAbic-Question-Answering) dataset and the SemEval Arabic Question-Answering corpus show that the AraELECTRA.Collab-BiLSTM/CNN model achieves accuracies of 84.64% and 45.93%, respectively. Moreover, the WordEmb.Collab-BiLSTM/AVG model raises accuracy to 91.61% and 81.23% on the respective datasets, showcasing the effectiveness of our collaborative techniques. Our proposed architecture represents a substantial improvement over previous models, emphasizing the importance of advanced techniques and collaborative strategies in handling complex language structures and diverse text dependencies. Additionally, the study underscores the performance of Arabic transformer-based encoding and suggests further exploration of transformers and collaborative strategies to bolster AS2 performance.
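The collaborative layer described in the abstract can be illustrated with a small forward-pass sketch. This is not the authors' implementation: it is a dependency-free toy, assuming that (a) a hypothetical `toy_encode` function stands in for AraBERT, AraELECTRA, or word embeddings, (b) the two branches run over the same encoded sequence and their outputs are concatenated, (c) "AVG" means average pooling over tokens, (d) the question and the candidate answer share one layer, and (e) the pair is scored by cosine similarity. The abstract confirms none of these details. All weights are random and untrained, and the token strings are illustrative transliterations.

```python
import math
import random

rng = random.Random(0)  # fixed seed: weights are random and untrained, but reproducible


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def toy_encode(tokens, emb_dim=4):
    """Hypothetical stand-in for the encoder: one fixed vector per token."""
    vectors = []
    for token in tokens:
        token_rng = random.Random(token)  # string seed, so equal tokens get equal vectors
        vectors.append([token_rng.uniform(-1.0, 1.0) for _ in range(emb_dim)])
    return vectors


class LSTM:
    """Single-direction LSTM (biases omitted) that returns its final hidden state."""

    def __init__(self, in_dim, hid_dim):
        self.hid_dim = hid_dim
        # One weight matrix per gate, applied to the concatenation [x_t; h_{t-1}].
        self.w_in, self.w_forget, self.w_out, self.w_cand = (
            rand_matrix(hid_dim, in_dim + hid_dim) for _ in range(4)
        )

    def run(self, seq):
        h = [0.0] * self.hid_dim
        c = [0.0] * self.hid_dim
        for x in seq:
            xh = x + h  # list concatenation
            i = [sigmoid(v) for v in matvec(self.w_in, xh)]
            f = [sigmoid(v) for v in matvec(self.w_forget, xh)]
            o = [sigmoid(v) for v in matvec(self.w_out, xh)]
            g = [math.tanh(v) for v in matvec(self.w_cand, xh)]
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h


def cnn_branch(seq, filters, width=2):
    """1-D convolution over token windows, ReLU, then max-over-time pooling."""
    windows = [sum(seq[t:t + width], []) for t in range(len(seq) - width + 1)]
    return [
        max(max(0.0, sum(w * x for w, x in zip(flt, win))) for win in windows)
        for flt in filters
    ]


def avg_branch(seq):
    """Average pooling over the token axis (assumed meaning of 'AVG')."""
    return [sum(column) / len(seq) for column in zip(*seq)]


class CollabLayer:
    """Runs a BiLSTM branch and a second branch in parallel, then concatenates."""

    def __init__(self, emb_dim, hid_dim, n_filters, second="cnn"):
        self.forward_lstm = LSTM(emb_dim, hid_dim)
        self.backward_lstm = LSTM(emb_dim, hid_dim)
        self.filters = rand_matrix(n_filters, 2 * emb_dim)  # filter width of 2 tokens
        self.second = second

    def __call__(self, seq):
        bilstm = self.forward_lstm.run(seq) + self.backward_lstm.run(seq[::-1])
        if self.second == "cnn":
            other = cnn_branch(seq, self.filters)
        else:
            other = avg_branch(seq)
        return bilstm + other


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


layer = CollabLayer(emb_dim=4, hid_dim=3, n_filters=2, second="cnn")
q_vec = layer(toy_encode(["ma", "asimat", "aljazair"]))
a_vec = layer(toy_encode(["asimat", "aljazair", "hiya", "aljazair"]))
score = cosine(q_vec, a_vec)
print(len(q_vec), round(score, 3))
```

Passing `second="avg"` swaps the CNN branch for average pooling, mirroring the BiLSTM/AVG naming in the abstract. The CNN branch in this sketch needs at least two tokens per input, because each filter spans two consecutive token vectors.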
multidisciplinary sciences