Abstract: Online medical forums have become a predominant platform for answering health-related information needs of consumers. However, with a significant rise in the number of queries and the limited availability of experts, it is necessary to automatically classify medical queries based on a consumer's intention, so that these questions may be directed to the right set of medical experts. Here, we develop a novel medical knowledge-aware BERT-based model (MedBERT) that explicitly gives more weightage to medical concept-bearing words, and utilize domain-specific side information obtained from a popular medical knowledge base. We also contribute a multi-label dataset for the Medical Forum Question Classification (MFQC) task. MedBERT achieves state-of-the-art performance on two benchmark datasets and performs very well in low resource settings.

What problem does this paper attempt to address?

This paper addresses the problem of automatically classifying user questions in medical forums so that they can be directed to the appropriate medical experts. Specifically, the authors develop a new BERT-based model (MedBERT) that identifies the intention behind a forum question and assigns the question to the corresponding category.

### Background and Problem Description of the Paper

With the rise of online medical forums, more and more consumers obtain health-related information through these platforms. However, the limited number of medical professionals cannot keep up with the volume of inquiries, so an automated system is needed to help classify and route them. The paper focuses on the **Medical Forum Question Classification (MFQC)** task, that is, classifying questions in medical forums according to the intention behind users' posts.

### Main Challenges

1. **Large and Complex Data**: Medical forums contain a large number of questions, and these span multiple categories of health information needs.
2. **Differences between Professional Terms and Everyday Language**: The vocabulary used by consumers differs from professional medical terminology, which makes accurate classification difficult for traditional methods.
3. **Limitations of Existing Methods**: Existing medical question classification methods usually rely on hand-designed features or pre-trained word vectors, which can lose contextual information and generalize poorly to test data.

### Solutions

To address these problems, the authors propose MedBERT, a new model based on a dual-encoder architecture. Its main features are as follows:

1. **Using Pre-trained Models to Extract Context Representations**: A pre-trained language model such as BERT extracts the global context representation of the input text, thereby retaining more contextual information.
2. **Introducing Medical Domain Knowledge as Auxiliary Information**: Medical concept words are identified with the help of a medical knowledge base and assigned higher weights, so that the model better captures terms specific to the medical field.
3. **Combining Global and Local Context Representations**: Classification accuracy is improved by fusing the global context representation (which considers the context of the entire sentence) with the local context representation (which emphasizes medical concept words).

### Experimental Results

The authors conducted experiments on two benchmark datasets, ICHI and CADEC. The results show that MedBERT achieves state-of-the-art performance in both single-label and multi-label classification tasks. Its advantage is especially clear in the low-resource setting, where it significantly outperforms the baseline models.

### Conclusions

By introducing medical domain knowledge and combining it with the contextual representation ability of pre-trained language models, MedBERT performs strongly on the medical forum question classification task. Future work could extend the model to other applications, for example structured prediction tasks such as entity and relation prediction.

### Formula Examples

In the paper, the final classification vector is formed by concatenating the two context representations:

\[ \mathbf{C} = [\mathbf{v}_{local}; \mathbf{v}_{global}] \]

where $\mathbf{v}_{local}$ and $\mathbf{v}_{global}$ are the local and global context representation vectors, respectively.

This vector is projected into the space of target categories through a fully-connected layer with weight matrix $W$ and bias $b$:

\[ \mathbf{C}_{pred} = W \mathbf{C} + b \]

The Softmax function is then applied to obtain the posterior probability distribution over categories:

\[ P(y_i | \mathbf{C}) = \frac{\exp(C_{pred,i})}{\sum_j \exp(C_{pred,j})} \]

Knowledge-Aware Neural Networks for Medical Forum Question Classification
