EMSI-BERT: Asymmetrical Entity-Mask Strategy and Symbol-Insert Structure for Drug-Drug Interaction Extraction Based on BERT.

Zhong Huang, Ning An, Juan Liu, Fuji Ren
DOI: https://doi.org/10.3390/sym15020398
2023-01-01
Symmetry
Abstract: Drug-drug interaction (DDI) extraction increasingly relies on deep models, but their effectiveness has been held back by limited domain-labeled data, a weak representation of co-occurring entities, and poor adaptation to downstream tasks. This paper proposes EMSI-BERT, a method for drug-drug interaction extraction based on an asymmetrical Entity-Mask strategy and a Symbol-Insert structure. First, EMSI-BERT applies the asymmetrical Entity-Mask strategy, using a drug entity dictionary in the BERT pre-training task, to address the weak representation of co-occurring entity information. Second, EMSI-BERT incorporates four symbols to distinguish different entity combinations of the same input sequence and uses the Symbol-Insert structure to address the weak adaptation to downstream tasks in the fine-tuning stage of DDI classification. Experimental results show that EMSI-BERT achieves a 0.82 F1-score on DDI-Extraction 2013 and improves performance on both the multi-class task of DDI extraction and the two-class task of DDI detection. Compared with the Basic-BERT baseline, the proposed pre-trained BERT with the asymmetrical Entity-Mask strategy yields better results in downstream tasks and effectively limits the influence of “Other” samples. Model visualization results illustrate that EMSI-BERT can extract semantic information at different levels and granularities in a continuous space.
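The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the four symbols, so the marker strings below are hypothetical, as are the function names `insert_symbols`, `pair_instances`, and `mask_entities`. The masking function shows only plain dictionary-based entity masking; what makes the paper's strategy asymmetrical is not described in the abstract and is not modelled here.

```python
from itertools import combinations

# Hypothetical marker symbols: the abstract says four symbols are used
# but does not name them, so these strings are assumptions.
E1_OPEN, E1_CLOSE, E2_OPEN, E2_CLOSE = "[E1]", "[/E1]", "[E2]", "[/E2]"


def insert_symbols(tokens, first, second):
    """Wrap two non-overlapping entity spans in marker symbols.

    Each span is (start, end) with end exclusive; `first` must precede `second`.
    """
    (s1, e1), (s2, e2) = first, second
    if not (s1 < e1 <= s2 < e2):
        raise ValueError("spans must be non-empty, ordered, and non-overlapping")
    return (
        tokens[:s1] + [E1_OPEN] + tokens[s1:e1] + [E1_CLOSE]
        + tokens[e1:s2] + [E2_OPEN] + tokens[s2:e2] + [E2_CLOSE]
        + tokens[e2:]
    )


def pair_instances(tokens, spans):
    """Build one marked sequence per drug pair in the same sentence."""
    ordered = sorted(spans)
    return [insert_symbols(tokens, a, b) for a, b in combinations(ordered, 2)]


def mask_entities(tokens, drug_dict, mask="[MASK]"):
    """Replace every token found in the drug dictionary with a mask token.

    Simplification: single-token, case-insensitive lookup only.
    """
    return [mask if t.lower() in drug_dict else t for t in tokens]


tokens = ["Aspirin", "may", "enhance", "warfarin", "and", "heparin"]
spans = [(0, 1), (3, 4), (5, 6)]
instances = pair_instances(tokens, spans)
masked = mask_entities(tokens, {"aspirin", "warfarin", "heparin"})
```

A sentence with three drug mentions yields three marked copies, one per candidate pair, which is how the same input sequence can carry different entity combinations into a classifier.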