To understand machine-learning text analysis in order to more utilize interview text in the area of social welfare

Nam Kyoung Jo, Ki-Ho Song
DOI: https://doi.org/10.17997/swry.79.1.1
2023-12-31
The Center for Social Welfare Research, Yonsei University
Abstract: This study walks through the machine-learning analysis of text data step by step, so that colleagues in the field can understand the method conceptually and reflect on its limits and possibilities. Analyzing text data requires transforming unstructured text into a numerical representation. This means dealing with a surprisingly large number of variables (that is, features), but it also means that nothing in the process is ‘magical’ or beyond our understanding. Building a prediction model likewise draws on familiar methods such as binary logistic regression, although these are not the only methods available. However, a machine-learning model is evaluated not by inferential validity but by predictive ability, which shows that big data analysis and conventional inferential statistical analysis take fundamentally different approaches. A machine-learning model therefore requires strictly separating training data from test data, and then training and testing the model. The impressive performance of machine-learning prediction models appears to stem from their ability to analyze a great amount of information, including information that looks like ‘noise’. As a result, the output is not easy or straightforward to interpret, and it may be important to update the training data and rebuild the model frequently.
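The steps named in the abstract (turning text into numerical features, fitting a binary logistic regression, and judging the model only on held-out test data) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' procedure: the toy sentences, the stopword list, and every function name are hypothetical, and the model is fitted with plain stochastic gradient descent for brevity.

```python
import math

# Hypothetical toy data, invented for illustration (not from the paper).
# Label 1 = mentions financial hardship, 0 = does not.
TRAIN = [
    ("i cannot pay the rent this month", 1),
    ("the debt keeps growing and i cannot pay", 1),
    ("we have no money for food", 1),
    ("my income stopped and the bills are late", 1),
    ("i enjoy walking in the park", 0),
    ("my daughter visits me every weekend", 0),
    ("the community center has a nice garden", 0),
    ("we cook together and talk about the day", 0),
]
# Test data is kept strictly apart and never used during training.
TEST = [
    ("i cannot pay the bills and have no money", 1),
    ("my daughter and i enjoy the garden and talk every weekend", 0),
]
# Assumed stopword list; a real analysis would use a fuller one.
STOPWORDS = {"a", "about", "and", "are", "for", "has", "i",
             "in", "me", "my", "the", "this", "we"}

def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [t for t in text.lower().split() if t not in STOPWORDS]

def build_vocab(texts):
    """Map each distinct token to a column index (one feature per word)."""
    words = sorted({t for text in texts for t in tokenize(text)})
    return {w: i for i, w in enumerate(words)}

def vectorize(text, vocab):
    """Bag-of-words counts; words unseen in training are ignored."""
    vec = [0.0] * len(vocab)
    for t in tokenize(text):
        if t in vocab:
            vec[vocab[t]] += 1.0
    return vec

def predict_proba(x, w, b):
    """Logistic regression: probability that the label is 1."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def fit(X, y, lr=0.1, epochs=200):
    """Fit weights and intercept by stochastic gradient descent on log-loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

# The vocabulary is built from the training texts only.
vocab = build_vocab([text for text, _ in TRAIN])
X_train = [vectorize(text, vocab) for text, _ in TRAIN]
y_train = [label for _, label in TRAIN]
w, b = fit(X_train, y_train)

# Evaluation looks only at predictive ability on the held-out sentences.
predictions = [1 if predict_proba(vectorize(text, vocab), w, b) >= 0.5 else 0
               for text, _ in TEST]
accuracy = sum(p == y for p, (_, y) in zip(predictions, TEST)) / len(TEST)
print(f"features: {len(vocab)}, test accuracy: {accuracy:.2f}")
```

Even eight short sentences yield dozens of features, which hints at why real interview corpora produce the very large feature counts the abstract mentions. The fitted weights are attached to individual words, so interpreting them becomes harder as the vocabulary grows.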