Abstract: This paper focuses on deep learning semantic search algorithms applied in the HR domain. Its aim is to develop a novel approach to training a Siamese network that links the skills mentioned in a job ad with its title. Title normalization can be based on either classification or similarity comparison. While classification algorithms assign a sample to one of a predefined set of categories, similarity search algorithms are more flexible: they are designed to find samples similar to a given query sample, without requiring predefined classes and labels. In this article, semantic similarity search is used to find candidates for title normalization. A pre-trained language model is adapted by teaching it to match titles and skills based on co-occurrence information. For this research, fifty billion title-description pairs were collected for training the model, and thirty-three thousand title-description-normalized title triplets, in which the normalized job title was picked manually by the job ad creator, were collected for testing. FastText, BERT, SentenceBert and JobBert are used as baselines. Accuracy is measured as Recall among the model's top one, five and ten suggestions. The novel training objective achieves a significant improvement over other generic and domain-specific text encoders. Two settings are compared: treating titles as standalone strings, and including skills as additional features during inference. Improvements of 10% and 21.5% are achieved using VacancySBERT and VacancySBERT (with skills), respectively. The benchmark has been released as open source to foster further research in the area.
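The evaluation described in the abstract (Recall among the top one, five and ten suggestions returned by similarity search) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `recall_at_k`, the toy two-dimensional vectors, and the use of cosine similarity are assumptions made here for illustration only. In practice the vectors would come from a text encoder such as the ones named in the abstract.

```python
import numpy as np


def recall_at_k(query_vecs, candidate_vecs, gold_indices, ks=(1, 5, 10)):
    """Share of queries whose gold candidate appears among the top-k neighbours.

    Hypothetical helper: ranks candidates by cosine similarity to each query.
    """
    # L2-normalise rows so that a dot product equals cosine similarity.
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    c = candidate_vecs / np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    sims = q @ c.T
    # Candidate indices per query, sorted by descending similarity.
    ranked = np.argsort(-sims, axis=1)
    gold = np.asarray(gold_indices)[:, None]
    return {k: float(np.mean((ranked[:, :k] == gold).any(axis=1))) for k in ks}


# Toy data (assumed): three normalized-title vectors and three query-title vectors.
candidates = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
queries = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.5]])
gold = [0, 1, 0]  # index of the correct normalized title for each query

scores = recall_at_k(queries, candidates, gold, ks=(1, 2))
print(scores)
```

In this toy setup the third query ranks its gold title second, so it counts as a miss at k = 1 and a hit at k = 2. The "with skills" setting from the abstract would change only how the query vectors are produced, not the metric.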

Learning to Match Job Candidates Using Multilingual Bi-Encoder BERT

Skill matching at scale: freelancer-project alignment for efficient multilingual candidate retrieval

Joint Extraction and Classification of Danish Competences for Job Matching

A Deep Learning BERT-Based Approach to Person-Job Fit in Talent Recruitment

Learning to Match Jobs with Resumes from Sparse Interaction Data using Multi-View Co-Teaching Network

JobBERT: Understanding Job Titles through Skills

Career Path Prediction using Resume Representation Learning and Skill-based Matching

VacancySBERT: the approach for representation of titles and skills for semantic similarity search in the recruitment domain

Cross-lingual Transfer of Sentiment Classifiers

Hierarchical Classification of Transversal Skills in Job Ads Based on Sentence Embeddings

SkillMatch: Evaluating Self-supervised Learning of Skill Relatedness

Towards Fully Bilingual Deep Language Modeling

Learning Representations for Soft Skill Matching

Leveraging BERT Language Models for Multi-Lingual ESG Issue Identification

Job and Employee Embeddings: A Joint Deep Learning Approach

Comparative Analysis of Encoder-Based NER and Large Language Models for Skill Extraction from Russian Job Vacancies

ResumeAtlas: Revisiting Resume Classification with Large-Scale Datasets and Large Language Models

Leveraging the Inherent Hierarchy of Vacancy Titles for Automated Job Ontology Expansion

Towards Lingua Franca Named Entity Recognition with BERT

Learning Effective Representations for Person-Job Fit by Feature Fusion

Sentence-Level BERT and Multi-Task Learning of Age and Gender in Social Media