Named entity recognition in resumes

Ege Kesim, Aysu Deliahmetoglu

2023-06-23

Abstract: Named entity recognition (NER) is used to extract information such as names and dates from various documents and texts. Extracting education and work experience information from resumes is important for filtering them. Because all information in a resume has to be entered into the company's system manually, automating this process will save companies time. In this study, a deep learning-based semi-automatic named entity recognition system has been implemented with a focus on resumes in the field of IT. First, resumes of employees from five different IT-related fields have been annotated. Six transformer-based pre-trained models, selected from among popular models in the natural language processing field, have been adapted to the named entity recognition problem using the annotated data. The resulting system can recognize eight entity types: city, date, degree, diploma major, job title, language, country, and skill. The models used in the experiments are compared using micro, macro, and weighted F1 scores. On the test set, the best micro and weighted F1 scores are obtained by RoBERTa, and the best macro F1 score is obtained by the ELECTRA model.

Computation and Language

What problem does this paper attempt to address?

The paper addresses the problem of automatically extracting important information from resumes, with a particular focus on resumes in the Information Technology (IT) field. Specifically, the research aims to develop a semi-automatic named entity recognition system based on deep learning that identifies eight entity types in resumes: city, date, degree, diploma major, job title, language, country, and skill.

To tackle this problem, the authors first annotated resumes of employees from five different IT-related fields. They then adapted six pre-trained Transformer-based models to the entity recognition task. The selected models are BERT-base-cased, BERT-base-uncased, DistilBERT, RoBERTa, XLM-RoBERTa, and ELECTRA, all of which are widely used across natural language processing tasks. The paper also details the dataset construction process, including how annotation was performed semi-automatically to improve efficiency.

The experimental section compares the models in terms of micro, macro, and weighted F1 scores. RoBERTa achieved the best micro and weighted F1 scores on the test set, while ELECTRA achieved the best macro F1 score. These results suggest that the proposed method can identify the required entity information from resumes well enough to reduce the manual data entry burden on human resources departments.
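The three averaging schemes mentioned above can rank models differently, which is why one model can lead on micro and weighted F1 while another leads on macro F1. The following is a minimal sketch of how the three averages are computed, assuming token-level labels and single-label classification; the function name `f1_scores` and the toy labels are hypothetical and are not taken from the paper, which may evaluate at the entity level instead.

```python
from collections import Counter


def f1_scores(gold, pred):
    """Return (micro, macro, weighted) F1 for two equal-length label lists."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")

    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was missed

    def f1(t, f_pos, f_neg):
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * t + f_pos + f_neg
        return 2 * t / denom if denom else 0.0

    per_class = {c: f1(tp[c], fp[c], fn[c]) for c in labels}
    support = Counter(gold)     # number of gold examples per class

    # Micro: pool all counts, so frequent classes dominate.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: unweighted mean over classes, so rare classes count equally.
    macro = sum(per_class.values()) / len(labels)
    # Weighted: mean over classes weighted by gold support.
    weighted = sum(per_class[c] * support[c] for c in labels) / len(gold)
    return micro, macro, weighted


# Hypothetical toy example: a frequent class is perfect, a rare class is missed.
gold = ["skill", "skill", "skill", "skill", "city", "date"]
pred = ["skill", "skill", "skill", "skill", "date", "date"]
micro, macro, weighted = f1_scores(gold, pred)
print(round(micro, 3), round(macro, 3), round(weighted, 3))
```

In this toy case the micro score is 5/6, the weighted score is 7/9, and the macro score drops to 5/9, because the single missed "city" label counts as a full class in the macro average. A model that handles rare entity types better can therefore win on macro F1 without winning on the other two.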

Resume Information Extraction via Post-OCR Text Processing

Named Entity Recognition based Resume Parser and Summarizer

Semi-supervised deep learning based named entity recognition model to parse education section of resumes

Named Entity Recognition for English Language Using Deep Learning Based Bi Directional LSTM-RNN

A Brief History of Named Entity Recognition

A Named Entity Recognition Approach for Albanian Using Deep Learning

Semi-supervised Bootstrapping approach for Named Entity Recognition

Recent Advances in Named Entity Recognition: A Comprehensive Survey and Comparative Study

Robotic Process Automation for Resume Processing System

Nested and Balanced Entity Recognition using Multi-Task Learning

Annotated Job Ads with Named Entity Recognition

A Novel Chinese Resume Named Entity Recognition Model Based on Lexical Enhancement

Comprehensive Overview of Named Entity Recognition: Models, Domain-Specific Applications and Challenges

Enhanced conditional random field‐long short‐term memory for name entity recognition in English texts

Query-Based Named Entity Recognition

NanoNER: Named Entity Recognition for nanobiology using experts' knowledge and distant supervision

Online biomedical named entities recognition by data and knowledge-driven model

NERetrieve: Dataset for Next Generation Named Entity Recognition and Retrieval

Named entity recognition based on a machine learning model

Using LSTM and GRU With a New Dataset for Named Entity Recognition in the Arabic Language