COVID-19: Comparative Analysis of Methods for Identifying Articles Related to Therapeutics and Vaccines without Using Labeled Data

Mihir Parmar, Ashwin Karthik Ambalavanan, Hong Guan, Rishab Banerjee, Jitesh Pabla, Murthy Devarakonda
DOI: https://doi.org/10.48550/arXiv.2101.02017
2021-01-05
Information Retrieval
Abstract: Here we propose an approach to analyzing text classification methods based on the presence or absence of task-specific terms (and their synonyms) in the text. We applied this approach to study six different transfer-learning and unsupervised methods for screening articles relevant to COVID-19 vaccines and therapeutics. The analysis revealed that while a BERT model trained on search-engine results generally performed well, it misclassified relevant abstracts that did not contain task-specific terms. We used this insight to create a more effective unsupervised ensemble.
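The term-based analysis described in the abstract can be illustrated with a minimal sketch: split the relevant abstracts by whether they contain a task-specific term, then compute recall separately for each group. The term list, the function names (`has_task_term`, `recall_by_term_presence`), and the toy data below are all assumptions made for illustration; they are not the paper's actual terms, code, or results.

```python
import re

# Hypothetical term list (assumption): the paper's actual task-specific
# terms and synonyms are not given in the abstract.
TASK_TERMS = ["vaccine", "vaccination", "therapeutic", "treatment", "antiviral"]
TERM_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, TASK_TERMS)) + r")s?\b", re.IGNORECASE
)


def has_task_term(text):
    """Return True if the text mentions any task-specific term."""
    return TERM_PATTERN.search(text) is not None


def recall_by_term_presence(abstracts, gold, predicted):
    """Recall over relevant abstracts, split by term presence.

    Returns a dict keyed by True (term present) and False (term absent);
    a value is None when that group has no relevant abstracts.
    """
    counts = {True: [0, 0], False: [0, 0]}  # [true positives, relevant total]
    for text, y, y_hat in zip(abstracts, gold, predicted):
        if not y:
            continue  # recall only considers relevant abstracts
        bucket = counts[has_task_term(text)]
        bucket[1] += 1
        if y_hat:
            bucket[0] += 1
    return {k: (tp / n if n else None) for k, (tp, n) in counts.items()}


# Toy data, invented purely for illustration.
abstracts = [
    "A new vaccine candidate elicits neutralizing antibodies.",
    "Remdesivir shortened recovery time in hospitalized patients.",
    "Antiviral treatment reduced viral load.",
    "Mobility data show reduced travel during lockdown.",
]
gold = [1, 1, 1, 0]       # 1 = relevant to vaccines or therapeutics
predicted = [1, 0, 1, 0]  # output of some hypothetical classifier

result = recall_by_term_presence(abstracts, gold, predicted)
```

In this toy case the classifier finds every relevant abstract that contains a listed term but misses the one that does not, which is the kind of gap the abstract reports for the BERT model.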