DATG: Data Augmentation with Transformer-Based Generation for Low-Resource Named Entity Recognition

Qian Yili, Haonan Xu
DOI: https://doi.org/10.1109/CAC59555.2023.10450919
2023-11-17
Abstract: Data augmentation is an important technique for improving machine learning performance. In this study, we propose a novel generative data augmentation method for named entity recognition (NER) that addresses the challenge of limited labeled data in specialized domains such as biomedicine and chemistry. The method feeds input entities to a Transformer model, which produces tagged sentences specifically related to those entities. By drawing on external entity dictionaries, we generate new tagged data and show that the method outperforms other data augmentation methods. Our approach offers a potential solution to the NER challenges faced in low-resource languages and specialized domains.
Computer Science, Chemistry