AI4AI: Quantitative Methods for Classifying Host Species from Avian Influenza DNA Sequence

Woo Yong Choi, Kyu Ye Song, Chan Woo Lee
DOI: https://doi.org/10.48550/arXiv.1802.09197
2018-02-26
Quantitative Methods
Abstract: Avian influenza outbreaks cause millions of dollars in damage globally each year, especially in Asian countries such as China and South Korea. The magnitude of an outbreak's impact correlates directly with the time required to fully understand the influenza virus, particularly its interspecies pathogenicity. Gaining that understanding requires laboratory tests whose resources are typically lacking in an outbreak emergency. In this study, we propose new quantitative methods that use machine learning and deep learning to classify host species from raw DNA sequence data of the influenza virus and to provide a probability for each classification. The best deep learning models achieve a top-1 classification accuracy of 47% and a top-3 classification accuracy of 82% on a dataset of 11 host species classes.
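The top-1 and top-3 figures quoted in the abstract are top-k accuracies: a prediction counts as correct when the true host class is among the k classes with the highest predicted probability. The following is a minimal sketch of that metric, not the authors' evaluation code. The function name `top_k_accuracy` and the toy probabilities (4 classes instead of the paper's 11) are illustrative assumptions.

```python
def top_k_accuracy(probs, labels, k):
    """Fraction of samples whose true label is among the k most probable classes."""
    hits = 0
    for row, label in zip(probs, labels):
        # Class indices sorted by descending predicted probability.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Made-up per-class probabilities for four samples (hypothetical data).
probs = [
    [0.10, 0.60, 0.20, 0.10],  # true class 1: counted at k=1
    [0.40, 0.30, 0.20, 0.10],  # true class 2: counted only from k=3
    [0.50, 0.25, 0.15, 0.10],  # true class 3: missed for k <= 3
    [0.05, 0.15, 0.70, 0.10],  # true class 2: counted at k=1
]
labels = [1, 2, 3, 2]

top1 = top_k_accuracy(probs, labels, k=1)
top3 = top_k_accuracy(probs, labels, k=3)
```

Because the model outputs a probability for every class, the same predictions yield both numbers; top-3 accuracy is never lower than top-1.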