Zachary Berglund, Elma Kontor-Manu, Samuel Biano Jacundino, Yaohua Feng
a Department of Food Science, Purdue University, West Lafayette, IN, USA
b Food Engineering School, University of Campinas, São Paulo, Brazil
Abstract: Machine learning approaches are increasingly being adopted as data analysis tools for predicting behavior in scientific research. This paper uses a machine learning approach, the random forest model, to determine the top predictor variables of food safety behavioral changes during the pandemic. Data were collected among U.S. consumers on risk perception of COVID-19 and foodborne illness (FBI), food safety practice behaviors, and demographics through online surveys at ten different time points from April 2020 through May 2021, and post-pandemic in May 2022. Random forest models were used to predict 14 food safety-related behaviors. The models for predicting handwashing before cooking and handwashing after eating performed well, with F1 scores of 0.93 and 0.88, respectively. Attitude-related variables were found to be important in predicting food safety behaviors. The importance rankings of the predictor variables were found to change over time.
What problem does this paper attempt to address?
This paper uses machine learning, specifically the random forest model, to identify and predict changes in American consumers' food safety behaviors during the COVID-19 pandemic and the main factors influencing them. The authors build random forest models on data collected at different time points to determine the key variables affecting food safety behaviors and to explore how the importance of these variables changes over time. The goal is to understand what drove changes in consumers' food safety behaviors during the pandemic, and so to provide a basis for risk communication strategies in public health events.
### Research Background
- **Food safety issues**: Improper food handling in domestic kitchens puts consumers at risk of foodborne illness.
- **Impact of the pandemic**: Since the COVID-19 pandemic began, consumers' eating habits, food handling, and purchasing behaviors have changed. Many consumers have begun to cook at home more often, which may be a long-lasting change.
- **Behavior change models**: Traditional behavior change models such as the Theory of Planned Behavior are used to understand human behavior. These models are usually built on a fixed set of social-cognitive variables such as attitude, subjective norm, and perceived behavioral control.
- **Machine learning methods**: Machine learning algorithms, especially the random forest model, can be used to predict and explore food safety behaviors in large data sets, offering a more flexible, non-parametric alternative.
### Research Objectives
- **Identify key variables**: Use the random forest model to identify the most important variables affecting consumers' food safety behaviors during the pandemic.
- **Explore variable changes**: Analyze how the importance of these key variables shifts across time points as the pandemic develops.
- **Provide policy suggestions**: Inform effective food safety communication strategies for regulatory and other relevant agencies, and establish a precedent for incorporating machine learning models into the prediction of food safety behaviors.
### Methods
- **Data collection**: Data on American consumers' risk perceptions of COVID-19 and foodborne illness, food safety practices, and demographic characteristics were collected through online surveys. The surveys were conducted 10 times from April 2020 to May 2021, with at least 700 data points in each survey, totaling 7,355 data points. A post-pandemic survey followed in May 2022.
- **Data processing**: After export, the data were pre-processed in several steps: creating composite variables, encoding categorical variables, rescaling variable values, and removing highly correlated variables.
- **Model construction**: Random forest models for 14 food-safety-related behaviors were constructed using IBM SPSS Modeler, with 10 models constructed at each time point, for a total of 420 models.
- **Performance evaluation**: The F1 score was used to evaluate models with binary outcome variables, and the root-mean-square error (RMSE) was used to evaluate models with continuous outcome variables.
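The two evaluation metrics named above can be sketched in a few lines of plain Python. This is only an illustration of the standard definitions, not the authors' SPSS Modeler workflow; the function names and the toy response lists are hypothetical and are not study data.

```python
import math


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical binary outcome (1 = reported the behavior, 0 = did not).
behavior_true = [1, 1, 1, 0, 1, 0, 1, 1]
behavior_pred = [1, 1, 0, 0, 1, 1, 1, 1]
binary_f1 = f1_score(behavior_true, behavior_pred)

# Hypothetical continuous outcome (for example, a composite practice score).
score_true = [4.0, 3.0, 5.0, 2.0]
score_pred = [3.0, 3.0, 4.0, 4.0]
continuous_rmse = rmse(score_true, score_pred)

print(f"F1 = {binary_f1:.3f}, RMSE = {continuous_rmse:.3f}")
```

F1 ranges from 0 to 1 with higher being better, while RMSE is expressed in the units of the outcome with lower being better, so the two are not directly comparable across model types.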
### Results
- **Model performance**: Most models performed well, especially the models for "washing hands before cooking" and "washing hands after eating", with F1 scores of 0.93 and 0.88 respectively.
- **Key variables**: Attitude-related variables (such as risk perceptions of COVID-19 and food safety) are important factors in predicting food safety behaviors. The importance rankings of these variables change over time.
- **Variable change trends**: The importance rankings of the attitude variables shift with the number of new monthly COVID-19 cases. When the number of new monthly cases increases, the importance ranking of attitude variables related to COVID-19 decreases, while the importance ranking of attitude variables related to food safety increases.
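One way to track such shifts is to convert each time point's importance scores into ranks and compare ranks between time points. The sketch below assumes importance scores are already available as one dictionary per survey wave; the variable names, wave labels, and numbers are hypothetical placeholders rather than results from the study.

```python
def rank_variables(importances):
    """Map each variable to its rank, where rank 1 has the largest importance."""
    ordered = sorted(importances, key=importances.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def rank_change(ranks_by_wave, variable, first, last):
    """Positive result: the variable moved up the ranking between the two waves."""
    return ranks_by_wave[first][variable] - ranks_by_wave[last][variable]


# Hypothetical importance scores for two survey waves (not study results).
importance_by_wave = {
    "2020-04": {"covid_risk_perception": 0.40, "fbi_risk_perception": 0.25, "age": 0.10},
    "2020-12": {"covid_risk_perception": 0.20, "fbi_risk_perception": 0.35, "age": 0.12},
}

ranks_by_wave = {
    wave: rank_variables(scores) for wave, scores in importance_by_wave.items()
}

for variable in importance_by_wave["2020-04"]:
    change = rank_change(ranks_by_wave, variable, "2020-04", "2020-12")
    print(f"{variable}: rank change {change:+d}")
```

Comparing ranks rather than raw scores keeps the comparison meaningful when the models at different time points produce importance values on different scales.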
### Conclusions
Using random forest models, this research identified the key variables affecting consumers' food safety behaviors during the pandemic and showed how their importance changed over time. These findings can help in formulating more effective food safety communication strategies and in improving public awareness of food safety.