A Semantically Enriched Dataset based on Biomedical NER for the COVID19 Open Research Dataset Challenge

Hermann Kroll, Jan Pirklbauer, Johannes Ruthmann, Wolf-Tilo Balke
DOI: https://doi.org/10.48550/arXiv.2005.08823
2020-05-18
Abstract: Research into COVID-19 is a major challenge and highly relevant at the moment. New tools are required to assist medical experts in their research with relevant and valuable information. The COVID-19 Open Research Dataset Challenge (CORD-19) is a "call to action" for computer scientists to develop these innovative tools. Many of these applications are empowered by entity information, i.e., knowing which entities are mentioned within a sentence. For this paper, we have developed a pipeline built upon the latest Named Entity Recognition tools for Chemicals, Diseases, Genes and Species. We apply our pipeline to the COVID-19 research challenge and share the resulting entity mentions with the community.
Digital Libraries