Overview of the 8th Social Media Mining for Health Applications (#SMM4H) shared tasks at the AMIA 2023 Annual Symposium
Ari Z Klein,Juan M Banda,Yuting Guo,Ana Lucia Schmidt,Dongfang Xu,Ivan Flores Amaro,Raul Rodriguez-Esteban,Abeed Sarker,Graciela Gonzalez-Hernandez
DOI: https://doi.org/10.1093/jamia/ocae010
2024-04-03
Abstract:Objective: The aim of the Social Media Mining for Health Applications (#SMM4H) shared tasks is to take a community-driven approach to address the natural language processing and machine learning challenges inherent to utilizing social media data for health informatics. In this paper, we present the annotated corpora, a technical summary of participants' systems, and the performance results. Methods: The eighth iteration of the #SMM4H shared tasks was hosted at the AMIA 2023 Annual Symposium and consisted of 5 tasks that represented various social media platforms (Twitter and Reddit), languages (English and Spanish), methods (binary classification, multi-class classification, extraction, and normalization), and topics (COVID-19, therapies, social anxiety disorder, and adverse drug events). Results: In total, 29 teams registered, representing 17 countries. In general, the top-performing systems used deep neural network architectures based on pre-trained transformer models. In particular, the top-performing systems for the classification tasks were based on single models that were pre-trained on social media corpora. Conclusion: To facilitate future work, the datasets-a total of 61 353 posts-will remain available by request, and the CodaLab sites will remain active for a post-evaluation phase.