ResuFormer: Semantic Structure Understanding for Resumes via Multi-Modal Pre-training.
Kaichun Yao, Jingshuai Zhang, Chuan Qin, Xin Song, Peng Wang, Hengshu Zhu, Hui Xiong
DOI: https://doi.org/10.1109/icde55515.2023.00242
2023-01-01
Abstract: Understanding the semantic structure of resumes plays an important role in various intelligent recruitment applications. However, due to the unique characteristics of resume documents (e.g., diverse writing styles and multi-page formats) and the lack of labeled data, effectively extracting the structural information of resumes with machine learning models has been a long-standing challenge. While considerable efforts have been made in this direction, existing methods focus only on the textual information in the document and largely ignore its rich multi-modal information (e.g., visual and layout information). To this end, we propose ResuFormer for understanding the semantic structure of resumes. Specifically, ResuFormer addresses two typical tasks in this direction: resume block classification and intra-block information extraction. For the first task, we propose a multi-modal pre-training model with a hierarchical Transformer encoder. We design three self-supervised training objectives (a masked layout-language model, self-supervised contrastive learning, and dynamic next-sentence prediction) to pre-train the model parameters, and then fine-tune the model using only a small amount of training data. For the second task, we introduce a self-distillation-based self-training framework that makes the distantly supervised model more robust to noisy data. Finally, extensive experiments on real-world resume datasets validate the performance of ResuFormer against state-of-the-art (SOTA) baselines.