Abstractive Text Summarization for Resumes With Cutting Edge NLP Transformers and LSTM

Öykü Berfin Mercan, Sena Nur Cavsak, Aysu Deliahmetoglu, Senem Tanberk
2023-06-23
Abstract: Text summarization is a fundamental task in natural language processing that aims to condense large amounts of textual information into concise and coherent summaries. With the exponential growth of content and the need to extract key information efficiently, text summarization has gained significant attention in recent years. In this study, the performance of an LSTM and of pre-trained T5, Pegasus, BART and BART-Large models was evaluated on open-source datasets (Xsum, CNN/Daily Mail, Amazon Fine Food Review and News Summary) and on a prepared resume dataset. The resume dataset comprises 75 resumes and covers information such as language, education, experience, personal information, and skills. The primary objective of this research was to classify resume text. Various techniques, including LSTM, pre-trained models, and fine-tuned models, were assessed on the resume dataset. The BART-Large model fine-tuned on the resume dataset gave the best performance.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the problem of generating abstractive summaries of resume texts. As the volume of text data grows, so does the demand for efficiently extracting key information from large document collections, and text summarization has received widespread attention as a result. However, existing research has not yet included a study focused specifically on abstractive summarization of resumes. The paper aims to fill this gap by evaluating several models (an LSTM and pre-trained T5, Pegasus, BART, and its larger version BART-Large) on open-source datasets (Xsum, CNN/Daily Mail, Amazon Fine Food Review, and News Summary) as well as on a self-built dataset of 75 resumes, in order to identify the most suitable model for resume summarization. The BART-Large model fine-tuned on the resume dataset performed best, indicating that it is effective for this specific task.