Eliciting Disease Data from Wikipedia Articles

Geoffrey Fairchild, Lalindra De Silva, Sara Y. Del Valle, Alberto M. Segre
DOI: https://doi.org/10.5210/ojphi.v8i1.6526
2016-03-24
Online Journal of Public Health Informatics
Abstract: Traditional disease surveillance systems suffer from several disadvantages, including reporting lags and antiquated technology, that have caused a movement towards internet-based disease surveillance systems. This study presents the use of Wikipedia article content in this sphere. We demonstrate how a named-entity recognizer can be trained to tag case, death, and hospitalization counts in the article text. We also show that the articles contain detailed, consistently updated time series data that closely align with ground truth data. We argue that Wikipedia can be used to create the first community-driven open-source emerging disease detection, monitoring, and repository system.