Identifying and Aligning Medical Claims Made on Social Media with Medical Evidence

Anthony Hughes, Xingyi Song
2024-05-18
Abstract: Evidence-based medicine is the practice of making medical decisions that adhere to the latest and best known evidence at that time. Currently, the best evidence is often found in the form of documents, such as randomized controlled trials, meta-analyses and systematic reviews. This research focuses on aligning medical claims made on social media platforms with this medical evidence. By doing so, individuals without medical expertise can more effectively assess the veracity of such medical claims. We study three core tasks: identifying medical claims, extracting medical vocabulary from these claims, and retrieving evidence relevant to those identified medical claims. We propose a novel system that can generate synthetic medical claims to aid each of these core tasks. We additionally introduce a novel dataset produced by our synthetic generator that, when applied to these tasks, demonstrates not only a more flexible and holistic approach, but also an improvement in all comparable metrics. We make our dataset, the Expansive Medical Claim Corpus (EMCC), available at
Computation and Language, Social and Information Networks
What problem does this paper attempt to address?
The main problem this paper attempts to address is aligning medical claims posted on social media with existing medical evidence. Specifically, the researchers focus on three core tasks:

1. **Identifying Medical Claims**: Recognizing which content in social media text constitutes medical claims.
2. **Extracting Medical Vocabulary**: Extracting relevant medical terms from these medical claims.
3. **Retrieving Relevant Evidence**: Retrieving relevant medical evidence for the identified medical claims.

By addressing these tasks, the research aims to help people without a medical background more effectively assess the accuracy of medical information on social media, and so make more informed health decisions. To this end, the researchers propose a new system capable of generating synthetic medical claims to assist with all three core tasks. Additionally, they introduce a new dataset, the Expansive Medical Claim Corpus (EMCC), which is produced by their synthetic generator and, according to the abstract, yields a more flexible and comprehensive approach as well as improved performance across all comparable metrics.
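
To make the three tasks concrete, the following is a minimal toy sketch of how they could chain together. It is not the authors' system: the cue-word rule, the tiny term lexicon, the Jaccard-overlap ranking, and every name in it (`CLAIM_CUES`, `MEDICAL_TERMS`, `is_medical_claim`, `extract_terms`, `retrieve_evidence`) are assumptions made purely for illustration. A real pipeline would presumably use trained models for each stage.

```python
import re

# Hypothetical toy lexicon; a real system would rely on a medical
# vocabulary or a trained entity tagger instead.
MEDICAL_TERMS = {"vitamin d", "aspirin", "covid-19", "headache", "stroke"}

# Hypothetical cue words that hint at a causal or treatment claim.
CLAIM_CUES = {"cures", "prevents", "causes", "treats", "reduces"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric/hyphen tokens."""
    return re.findall(r"[a-z0-9\-]+", text.lower())


def extract_terms(post):
    """Task 2 (toy): return lexicon terms that appear in the post, sorted."""
    normalized = " ".join(tokenize(post))
    return sorted(
        term
        for term in MEDICAL_TERMS
        if re.search(r"\b" + re.escape(term) + r"\b", normalized)
    )


def is_medical_claim(post):
    """Task 1 (toy): a post counts as a claim if it has a cue word and a term."""
    has_cue = bool(set(tokenize(post)) & CLAIM_CUES)
    return has_cue and bool(extract_terms(post))


def retrieve_evidence(claim, corpus, top_k=1):
    """Task 3 (toy): rank evidence titles by Jaccard token overlap with the claim."""
    claim_tokens = set(tokenize(claim))

    def score(doc):
        doc_tokens = set(tokenize(doc))
        union = claim_tokens | doc_tokens
        return len(claim_tokens & doc_tokens) / len(union) if union else 0.0

    return sorted(corpus, key=score, reverse=True)[:top_k]
```

For example, given a post such as "Vitamin D prevents COVID-19, trust me", the sketch flags it as a claim, extracts the two lexicon terms it contains, and ranks a made-up evidence title about vitamin D and COVID-19 above an unrelated one about aspirin. Token overlap is used here only because it is easy to inspect by hand.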