RADM-DRE: Retrieval Augmentation for Document-Level Relation Extraction with Diffusion Model

Qing Zhang, Qingsong Yuan, Jianyong Duan, Yuhang Jiang, Hao Wang, Zhengxin Gao, Li He, Jie Liu
DOI: https://doi.org/10.1109/ialp61005.2023.10337090
2023-01-01
Abstract: Existing data augmentation methods use additional raw samples or incorporate external knowledge to enhance the model, under the assumption that an explicit data pool for retrieval is accessible during both training and testing. We argue that data generated from the distribution of the raw data, going beyond the raw data itself, can provide more informative augmentation and can relax the strong assumption that the original raw data must be accessible at test time. To this end, we propose a novel framework that, for the first time, introduces a diffusion model for this purpose. The diffusion model serves as a data generator: it generates diverse data by sampling from the learned data distribution, inheriting the diversity characteristic of diffusion models. However, raw text is discrete and therefore hard to generate directly with a diffusion model. We thus model the original data in a transformed continuous embedding space and perform retrieval from that data distribution. We then concatenate the retrieval results with the original features for augmentation. Experimental results on the public datasets DocRED, CDR, and GDA demonstrate promising performance.
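The retrieve-then-concatenate step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `cosine` and `retrieve_and_concat`, the top-k averaging, the cosine similarity measure, and the embedding dimension are all assumptions made for illustration, and the pool of "generated" embeddings is a placeholder (noisy copies of the features) standing in for samples that a trained diffusion model would produce in the continuous embedding space.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_and_concat(feature, pool, k=3):
    """Retrieve the k pool embeddings most similar to `feature`,
    average them, and append the average to the original feature."""
    ranked = sorted(pool, key=lambda g: cosine(feature, g), reverse=True)[:k]
    retrieved = [sum(col) / len(ranked) for col in zip(*ranked)]
    return feature + retrieved  # list concatenation: length doubles


random.seed(0)
dim = 8  # hypothetical embedding size

# Hypothetical original features, e.g. entity-pair representations.
features = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(4)]

# Placeholder for diffusion-generated embeddings: noisy copies of the
# features. A real system would sample these from the trained generator.
pool = [[x + random.gauss(0, 0.1) for x in f] for f in features for _ in range(5)]

augmented = [retrieve_and_concat(f, pool, k=3) for f in features]
print(len(augmented), len(augmented[0]))  # 4 16
```

Because retrieval here runs against generated embeddings rather than stored raw samples, the sketch reflects the abstract's point that the original data pool need not be available at test time; how the paper actually fuses the retrieved vectors may differ.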