The Prediction of Hepatitis E through Ensemble Learning

Tu Peng, Xiaoya Chen, Ming Wan, Lizhu Jin, Xiaofeng Wang, Xuejie Du, Hui Ge, Xu Yang
DOI: https://doi.org/10.3390/ijerph18010159
IF: 4.614
2020-12-28
International Journal of Environmental Research and Public Health
Abstract: According to the World Health Organization, about 20 million people are infected with hepatitis E every year. In 2015, there were 44,000 deaths due to hepatitis E virus (HEV) infection worldwide. Food, water and climate are key factors that affect outbreaks of hepatitis E. This paper presents an ensemble learning model for hepatitis E prediction, built by studying the correlation between historical epidemic cases of hepatitis E and environmental factors (water quality and meteorological data). The environmental factors comprise many features; those most relevant to HEV are selected and fed into an ensemble learning model composed of a Gradient Boosting Decision Tree (GBDT) and a Random Forest for training and prediction. Three indicators, root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE), are used to evaluate the ensemble learning model against the classical time series prediction model. The paper concludes that the ensemble learning model predicts more accurately than the classical model, and that prediction accuracy can be improved by exploiting water quality and meteorological factors (radiation, air pressure, precipitation).
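The three error indicators named in the abstract have standard definitions, which can be sketched in plain Python as below. This is a minimal illustration, not the authors' code: the function names and the sample values are assumptions made for the example and are not data from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared difference."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute differences."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero, so callers should filter
    or handle zero-case periods before using it.
    """
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# Illustrative values only (hypothetical case counts, not from the paper).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print("RMSE:", rmse(actual, predicted))
print("MAE: ", mae(actual, predicted))
print("MAPE:", mape(actual, predicted))
```

Lower values indicate a better fit on all three indicators. RMSE penalizes large individual errors more heavily than MAE, while MAPE expresses error relative to the size of each observed value.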
Public, Environmental & Occupational Health; Environmental Sciences