A novel in-depth analysis approach for domain-specific problems based on multidomain data

Jia Zhao, Yue Zhang, Yan Ding, Qiuye Yu, Ming Hu
DOI: https://doi.org/10.1016/j.ins.2021.12.013
IF: 8.1
2022-04-01
Information Sciences
Abstract: When addressing analysis and prediction problems in a specific domain with big data processing, two problems often arise: only relationships between features within the domain itself are considered, and existing methods are not effective for training models on small sample data sets. The traditional approach usually obtains relationships among single-domain features only. Analysis and forecasting within the problem domain alone can quickly achieve good accuracy, but because the analysis is limited to that domain, it becomes increasingly difficult to improve prediction accuracy further. This paper proposes a novel data analysis approach compatible with small sample sets, called multidomain data depth analysis (MODE). In contrast to traditional approaches, MODE emphasizes multidomain data and considers the relationships among feature domains in the original data. The features in each domain are orthogonally extracted, and the feature dimensions are expanded in accordance with the characteristics of small data sets. A better prediction model can then be obtained from the expanded and strengthened features. In experiments, we apply this approach to real big data from the field of sociology, predicting annual income from census data. The experimental results show that MODE achieves better prediction performance on small multidomain samples.
computer science, information systems