BDViewer — A Web-Based Big Data Processing and Visualization Tool

Yan Li, Junming Ma, Bo An, Donggang Cao
DOI: https://doi.org/10.1109/COMPSAC.2018.00080
2018-01-01
Abstract: The size of data sets collected and analyzed in data science is growing rapidly, making traditional big data processing solutions prohibitively expensive. When data sets are too large, distributed techniques become unavoidable even for simple embarrassingly parallel jobs. However, distributed computing remains inaccessible to many users. For example, many average users still struggle with complex cluster management and configuration tools [24] just to sum a group of numbers in a large data file. In this paper, we present BDViewer, a web-based big data processing and visualization tool. BDViewer uses JavaScript plugins to let users view, process, and visualize large data files through a web browser alone. With a single click, users can open a large data file online and view its contents immediately, regardless of file size. On the back end, BDViewer is built on a virtual private cloud system. Users' operations in the web browser are converted into map-reduce jobs and MPI tasks that are executed on the cloud. Finally, we report experiments that demonstrate BDViewer's effectiveness and ease of use.
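To make the abstract's running example concrete, the following is a minimal single-machine sketch of how summing the numbers in a large file can be phrased as a map-reduce job: the file is split into chunks, each chunk is mapped to a partial sum, and the partial sums are reduced to a total. It is not BDViewer's implementation. The function names (read_chunks, map_partial_sum, sum_file), the one-number-per-line file layout, and the chunk size are all assumptions made for illustration.

```python
import io
from functools import reduce
from itertools import islice
from typing import Iterable, Iterator, List, TextIO


def read_chunks(lines: Iterable[str], chunk_size: int) -> Iterator[List[str]]:
    """Split a stream of lines into lists of at most chunk_size lines."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def map_partial_sum(chunk: List[str]) -> float:
    """Map step: sum the numbers in one chunk, skipping blank lines.

    Assumes one number per line (a hypothetical file layout).
    """
    return sum(float(line) for line in chunk if line.strip())


def sum_file(stream: TextIO, chunk_size: int = 1000) -> float:
    """Reduce step: combine the per-chunk partial sums into one total."""
    partials = map(map_partial_sum, read_chunks(stream, chunk_size))
    return reduce(lambda a, b: a + b, partials, 0.0)


if __name__ == "__main__":
    # A tiny in-memory stand-in for a large data file.
    sample = io.StringIO("1\n2\n3\n\n4.5\n")
    print(sum_file(sample, chunk_size=2))
```

Because each chunk is summed independently, the map step is embarrassingly parallel. In a distributed setting the chunks would be handed to separate workers and only the partial sums would be sent back for the reduce step.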