MRData: A MapReduce-Based Tool for Heterogeneous Data Integration

Liutong Xu, Kai Jin, Hongqiao Tian
DOI: https://doi.org/10.1109/isme.2010.252
2010-01-01
Abstract: As data volumes grow sharply and the relationships among data sources become more intricate, integrating massive data sources and discovering latent information in the integrated data have become urgent problems. At present, industry tends to adopt distributed computing models to integrate massive data. Visualization is a critical step in data analysis and data mining for obtaining valuable, in-depth information. We design a tool called MRData for heterogeneous data integration, which has two features: 1) parallel data processing based on Hadoop, a distributed platform; and 2) visual analysis. Finally, experiments verify the efficiency of MRData.
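To make the MapReduce-based integration idea concrete, the following is a minimal single-process sketch of the map, shuffle, and reduce pattern applied to two differently shaped sources. It is not MRData's implementation and does not use Hadoop; the record layouts, the shared key `user_id`, and all function names are assumptions chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical records from two differently shaped sources (illustrative only).
CSV_ROWS = ["1,Alice", "2,Bob"]
JSON_ROWS = [
    {"user_id": 1, "city": "Beijing"},
    {"user_id": 2, "city": "Shanghai"},
]


def map_csv(row):
    """Map step for source A: emit (key, partial record)."""
    user_id, name = row.split(",")
    yield int(user_id), {"name": name}


def map_json(row):
    """Map step for source B: emit (key, partial record)."""
    yield row["user_id"], {"city": row["city"]}


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_merge(key, values):
    """Reduce step: merge all partial records that share a key."""
    merged = {"user_id": key}
    for value in values:
        merged.update(value)
    return merged


def integrate(csv_rows, json_rows):
    """Run map, shuffle, and reduce sequentially over both sources."""
    pairs = []
    for row in csv_rows:
        pairs.extend(map_csv(row))
    for row in json_rows:
        pairs.extend(map_json(row))
    groups = shuffle(pairs)
    return [reduce_merge(key, groups[key]) for key in sorted(groups)]


if __name__ == "__main__":
    for record in integrate(CSV_ROWS, JSON_ROWS):
        print(record)
```

In a real Hadoop job the map and reduce functions would run in parallel across nodes and the framework would perform the grouping; here everything runs sequentially so that the data flow is easy to follow.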