Abstract: Text summarization (TS) is considered one of the most difficult tasks in natural language processing (NLP), and it remains a major challenge for modern computer systems despite their recent improvements. Many studies in the literature address this task, but most of them focus on extractive summarization; few address abstractive summarization, especially for Arabic, owing to the complexity of the language. In this paper, we propose an abstractive Arabic text summarization system based on a sequence-to-sequence model, which consists of two components: an encoder and a decoder. Our aim is to build the sequence-to-sequence model with several deep artificial neural networks and to investigate which of them achieves the best performance. Layers of Gated Recurrent Units (GRU), Long Short-Term Memory (LSTM), and Bidirectional Long Short-Term Memory (BiLSTM) are used in different configurations to build the encoder and the decoder. In addition, the global attention mechanism is used because it provides better results than the local attention mechanism. Furthermore, AraBERT preprocessing is applied in the data preprocessing stage, which helps the model understand Arabic words and achieve state-of-the-art results. Moreover, we compare the skip-gram and continuous bag-of-words (CBOW) Word2Vec word embedding models. We built these models with the Keras library and ran them in a Jupyter notebook on Google Colab. Finally, the proposed system is evaluated with the ROUGE-1, ROUGE-2, ROUGE-L, and BLEU metrics. The experimental results show that three layers of BiLSTM hidden states at the encoder achieve the best performance, and that the proposed system outperforms other recent studies. The results also show that abstractive summarization models that use the skip-gram Word2Vec model outperform those that use the CBOW Word2Vec model.
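To make the evaluation metrics named in the abstract concrete, the following is a minimal sketch of how ROUGE-N and ROUGE-L scores can be computed from a reference summary and a generated summary. It is not the authors' evaluation code and not the official ROUGE toolkit: the function names (`ngrams`, `rouge_n`, `lcs_length`, `rouge_l`) are hypothetical, tokenization is a plain whitespace split with no stemming or normalization, and the English sentences are illustrative only. Arabic text would normally be preprocessed before scoring.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def rouge_n(reference, candidate, n=1):
    """Simplified ROUGE-N: clipped n-gram overlap as (recall, precision, F1)."""
    ref = ngrams(reference.split(), n)
    cand = ngrams(candidate.split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipping).
    overlap = sum((ref & cand).values())
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    return recall, precision, f1_score(precision, recall)


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate):
    """Simplified ROUGE-L: LCS-based (recall, precision, F1)."""
    ref, cand = reference.split(), candidate.split()
    lcs = lcs_length(ref, cand)
    recall = lcs / max(len(ref), 1)
    precision = lcs / max(len(cand), 1)
    return recall, precision, f1_score(precision, recall)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    candidate = "the cat lay on the mat"
    print("ROUGE-1:", rouge_n(reference, candidate, 1))
    print("ROUGE-2:", rouge_n(reference, candidate, 2))
    print("ROUGE-L:", rouge_l(reference, candidate))
```

In this toy pair, five of the six reference unigrams and three of the five reference bigrams are matched, so ROUGE-1 recall is 5/6 and ROUGE-2 recall is 3/5. ROUGE-L rewards the longest in-order match ("the cat on the mat"), which also gives a recall of 5/6.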

Language Independent Text Summarization of Western European Languages Using Shape Coding of Text Elements

Document vector embedding based extractive text summarization system for Hindi and English text

Multisumm: Towards A Unified Model For Multi-Lingual Abstractive Summarization

On the State of German (Abstractive) Text Summarization

Topic-based Visual Text Summarization and Analysis

An Extractive-and-Abstractive Framework for Source Code Summarization

Automatic Extractive Text Summarization using Multiple Linguistic Features

A Multilingual Study of Compressive Cross-Language Text Summarization

Cross-language Document Summarization Via Extraction and Ranking of Multiple Summaries

Using Bilingual Information for Cross-Language Document Summarization

GATSum: Graph-Based Topic-Aware Abstract Text Summarization

Interpretation-based Code Summarization

Extractive text summarization using deep learning approach

An Integrated Graph Model For Document Summarization

GAE-ISumm: Unsupervised Graph-Based Summarization of Indian Languages

Tell me what I need to know: Exploring LLM-based (Personalized) Abstractive Multi-Source Meeting Summarization

Abstractive Arabic Text Summarization Based on Deep Learning

A study of extractive summarization of long documents incorporating local topic and hierarchical information

Summaformers @ LaySumm 20, LongSumm 20

Assessment of Transformer-Based Encoder-Decoder Model for Human-Like Summarization

Implementing Deep Learning-Based Approaches for Article Summarization in Indian Languages