Combating with extremely noisy samples in weakly supervised slot filling for automatic diagnosis

Xiaoming Shi,Wanxiang Che
DOI: https://doi.org/10.1007/s11704-022-2134-1
IF: 2.6688
2023-01-06
Frontiers of Computer Science
Abstract:Slot filling, to extract entities for specific types of information (slot), is a vitally important modular of dialogue systems for automatic diagnosis. Doctor responses can be regarded as the weak supervision of patient queries. In this way, a large amount of weakly labeled data can be obtained from unlabeled diagnosis dialogue, alleviating the problem of costly and time-consuming data annotation. However, weakly labeled data suffers from extremely noisy samples. To alleviate the problem, we propose a simple and effective Co-Weak-Teaching method. The method trains two slot filling models simultaneously. These two models learn from two different weakly labeled data, ensuring learning from two aspects. Then, one model utilizes selected weakly labeled data generated by the other, iteratively. The model, obtained by the Co-Weak-Teaching on weakly labeled data, can be directly tested on testing data or sequentially fine-tuned on a small amount of human-annotated data. Experimental results on these two settings illustrate the effectiveness of the method with an increase of 8.03% and 14.74% in micro and macro f1 scores, respectively.
computer science, information systems, theory & methods, software engineering
What problem does this paper attempt to address?