Multi-label text classification of cardiovascular drug attributes based on BERT and BiGRU.

Hongzhen Cui, Longhao Zhang, Xiaoyue Zhu, Xiuping Guo, Yunfeng Peng
DOI: https://doi.org/10.3233/jifs-236115
2024-01-01
Abstract: Extracting and digitizing drug attributes from medical literature is the first step in building a knowledge computing system for precision disease treatment. To build a cardiovascular drug knowledge base, this paper proposes a multi-label text classification method for cardiovascular drug attributes drawn from the Chinese drug guideline. Drug attributes are represented with a pre-trained BERT model, and a dual-feature extraction structure based on the BiGRU neural network is proposed to capture high-level semantic information. The method assigns labels for cardiovascular drug attributes such as indications and mode of administration. Under 5-fold cross-validation it obtains an F1 score of 0.8431. Compared with KNN and Naive Bayes baselines, and with CNN and BiGRU control experiments built on Word2Vec representations of the medication guidelines, the proposed multi-label text classification method is effective and significantly improves the F1 value. Ablation and crossover experiments show that the proposed method achieves an average accuracy of 0.8339.
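To make the multi-label setting concrete, the following is a minimal sketch of how per-label scores can be turned into label sets and scored with F1. It is an illustration only, not the authors' implementation: the abstract does not state the decision threshold or the F1 averaging scheme, so the 0.5 threshold, the micro-averaged F1, the label name "contraindication", and all numbers below are assumptions.

```python
import math

# Hypothetical label set: the first two labels are named in the abstract,
# the third is invented for illustration.
LABELS = ["indication", "administration", "contraindication"]


def sigmoid(x):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, threshold=0.5):
    """Decide each label independently, so one text can receive several labels.

    The 0.5 threshold is an assumption, not a value taken from the paper.
    """
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool true/false positives over all labels and samples."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Made-up classifier outputs for two guideline sentences (not real data).
logits = [[2.0, -1.0, 0.5], [-0.3, 1.2, -2.0]]
y_true = [[1, 0, 0], [0, 1, 1]]

y_pred = [predict_labels(row) for row in logits]
score = micro_f1(y_true, y_pred)
print(y_pred, round(score, 4))
```

In a cross-validated setup such as the 5-fold protocol mentioned above, a score like this would be computed on each held-out fold and then averaged.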