Detecting Adverse Drug Reactions from Twitter through Domain-Specific Preprocessing and BERT Ensembling

Amy Breden, Lee Moore
DOI: https://doi.org/10.48550/arXiv.2005.06634
2020-05-12
Abstract: The automation of adverse drug reaction (ADR) detection in social media would revolutionize the practice of pharmacovigilance, supporting drug regulators, the pharmaceutical industry and the general public in ensuring the safety of the drugs prescribed in daily practice. Following the published proceedings of the Social Media Mining for Health (SMM4H) Applications Workshop & Shared Task in August 2019, we aimed to develop a deep learning model to classify ADRs within tweets that contain drug mentions. Our approach involved fine-tuning $BERT_{LARGE}$ and two domain-specific BERT implementations, $BioBERT$ and $Bio + clinicalBERT$, applying a domain-specific preprocessor, and developing a max-prediction ensembling approach. Our final model achieved state-of-the-art performance on both $F_1$-score (0.6681) and recall (0.7700), outperforming all models submitted in SMM4H 2019 and during post-evaluation to date.
Computation and Language, Machine Learning, Social and Information Networks
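
The abstract names a max-prediction ensembling approach but does not spell out the rule. The following is a minimal sketch of one plausible reading, not the authors' implementation: each fine-tuned model emits a positive-class (ADR) probability per tweet, the ensemble keeps the largest of those probabilities, and a tweet is labeled as an ADR when that maximum reaches a threshold. The function name `max_prediction_ensemble`, the 0.5 threshold, and the example probabilities are all hypothetical.

```python
from typing import List, Sequence


def max_prediction_ensemble(
    model_probs: Sequence[Sequence[float]],
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> List[int]:
    """Combine per-model ADR probabilities by taking the per-tweet maximum.

    model_probs[m][i] is model m's positive-class probability for tweet i.
    Returns one 0/1 label per tweet.
    """
    if not model_probs:
        raise ValueError("at least one model's predictions are required")
    n_tweets = len(model_probs[0])
    if any(len(p) != n_tweets for p in model_probs):
        raise ValueError("every model must score the same number of tweets")

    labels = []
    # zip(*model_probs) regroups the scores so each tuple holds all
    # models' probabilities for a single tweet.
    for per_tweet in zip(*model_probs):
        labels.append(1 if max(per_tweet) >= threshold else 0)
    return labels


# Hypothetical probabilities from the three fine-tuned models for three tweets.
bert_large = [0.10, 0.62, 0.40]
biobert = [0.20, 0.30, 0.45]
bio_clinical_bert = [0.05, 0.55, 0.71]

labels = max_prediction_ensemble([bert_large, biobert, bio_clinical_bert])
print(labels)  # [0, 1, 1]
```

Under this reading, a tweet is flagged whenever any single model is confident, which tends to trade precision for recall; that is at least consistent with recall (0.7700) being the higher of the two reported metrics.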