A System for Worldwide COVID-19 Information Aggregation
Akiko Aizawa, Frederic Bergeron, Junjie Chen, Fei Cheng, Katsuhiko Hayashi, Kentaro Inui, Hiroyoshi Ito, Daisuke Kawahara, Masaru Kitsuregawa, Hirokazu Kiyomaru, Masaki Kobayashi, Takashi Kodama, Sadao Kurohashi, Qianying Liu, Masaki Matsubara, Yusuke Miyao, Atsuyuki Morishima, Yugo Murawaki, Kazumasa Omura, Haiyue Song, Eiichiro Sumita, Shinji Suzuki, Ribeka Tanaka, Yu Tanaka, Masashi Toyoda, Nobuhiro Ueda, Honai Ueoka, Masao Utiyama, Ying Zhong
DOI: https://doi.org/10.48550/arXiv.2008.01523
2020-07-28
Computation and Language
Abstract: The global COVID-19 pandemic has led the public to pay close attention to related news across various domains, such as sanitation, treatment, and effects on education. Meanwhile, the COVID-19 situation differs greatly among countries (e.g., in policies and in the development of the epidemic), so citizens are likely to be interested in news from foreign countries. We build a system for worldwide COVID-19 information aggregation that contains reliable articles from 10 regions in 7 languages, sorted by topic. Our dataset of reliable COVID-19-related websites, collected through crowdsourcing, ensures the quality of the articles. A neural machine translation module translates articles in other languages into Japanese and English. A BERT-based topic classifier trained on our article-topic pair dataset helps users find the information they are interested in efficiently by sorting articles into categories.
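The flow the abstract describes (keep articles from vetted websites, translate them, assign a topic, then group by topic and region) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the system itself: the `Article` fields, the topic labels and keywords, and the example URLs are assumptions. In particular, `translate_to_english` is a no-op placeholder for the neural machine translation module, and `classify_topic` is a simple keyword matcher standing in for the BERT-based classifier.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Article:
    url: str
    region: str
    language: str
    text: str


# Hypothetical topic labels and keywords (assumption, not the paper's label set).
TOPIC_KEYWORDS = {
    "treatment": ("vaccine", "treatment", "drug"),
    "education": ("school", "university", "student"),
    "sanitation": ("mask", "handwashing", "disinfect"),
}


def translate_to_english(article):
    # Placeholder for the translation module: returns the text unchanged.
    return article.text


def classify_topic(text):
    # Keyword stand-in for the BERT-based topic classifier.
    lowered = text.lower()
    for topic, keywords in TOPIC_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return topic
    return "other"


def aggregate(articles, trusted_sites):
    # Keep only articles hosted on vetted sites, then index URLs by (topic, region).
    index = defaultdict(list)
    for article in articles:
        if not any(article.url.startswith(site) for site in trusted_sites):
            continue
        topic = classify_topic(translate_to_english(article))
        index[(topic, article.region)].append(article.url)
    return dict(index)


articles = [
    Article("https://example.org/a", "jp", "en", "Schools reopen next week"),
    Article("https://untrusted.example.com/x", "us", "en", "Miracle drug found"),
    Article("https://example.org/b", "us", "en", "New vaccine trial begins"),
]
result = aggregate(articles, trusted_sites=["https://example.org/"])
```

In this sketch the site allowlist plays the role of the crowdsourced website dataset: an article from a site outside the list never reaches translation or classification.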