VH-DSI: Speeding up Data Visualization via a Heterogeneous Distributed Storage Infrastructure

Yongqing Zhu, J. Samsudin, Haixiang Shi, Jun Wang
DOI: https://doi.org/10.1109/ICPADS.2015.88
2015-12-14
Abstract: Visualizing and analyzing large-scale datasets are both critical and challenging, as they require substantial resources for data processing and storage. While the speed of supercomputers continues to set higher standards, I/O systems have not kept pace, resulting in a significant performance bottleneck. To alleviate the I/O bottleneck for scientific visualization applications, we propose Visualization via a Heterogeneous Distributed Storage Infrastructure (VH-DSI), a solution that improves I/O speed and accelerates overall visualization performance. VH-DSI replaces the traditional parallel file system with a distributed file system to support visualization applications. VH-DSI introduces a new scheduling algorithm, HeterSche, which assigns computing tasks to data nodes while taking cluster heterogeneity and data locality into account. VH-DSI also includes a design to support POSIX I/O on the distributed file system. The performance evaluation shows that VH-DSI achieves significant performance improvement for visualization applications. Compared to traditional visualization, VH-DSI reduces response time by a factor of at least 5. HeterSche also speeds up visualization compared to other scheduling algorithms, especially for large-scale datasets.
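The abstract names two inputs to HeterSche's scheduling decisions, cluster heterogeneity and data locality, but does not describe the algorithm itself. The sketch below is therefore not HeterSche. It is a generic greedy heuristic, written only to illustrate how those two factors can be traded off against each other. All names (`Node`, `Task`, `schedule`, `remote_penalty`) and the cost model are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    speed: float             # assumed relative processing rate (work units per second)
    busy_until: float = 0.0  # time at which the node finishes its queued tasks


@dataclass
class Task:
    name: str
    work: float              # assumed amount of work in the task
    replicas: frozenset      # names of the nodes that hold the task's data block


def schedule(tasks, nodes, remote_penalty=2.0):
    """Greedy placement: each task goes to the node with the earliest
    estimated finish time. This is an illustrative heuristic, not the
    paper's algorithm.

    Heterogeneity enters through node.speed. Locality enters through
    remote_penalty, a hypothetical fixed cost (in seconds) for reading
    a block from another node.
    """
    assignment = {}
    # Place the largest tasks first so they get the best nodes.
    for task in sorted(tasks, key=lambda t: t.work, reverse=True):
        def finish_time(node, task=task):
            cost = task.work / node.speed
            if node.name not in task.replicas:
                cost += remote_penalty  # data must be fetched remotely
            return node.busy_until + cost

        best = min(nodes, key=finish_time)
        best.busy_until = finish_time(best)
        assignment[task.name] = best.name
    return assignment


if __name__ == "__main__":
    demo_nodes = [Node("fast", 4.0), Node("slow", 1.0)]
    demo_tasks = [
        Task("a", 8.0, frozenset({"fast"})),
        Task("b", 4.0, frozenset({"slow"})),
        Task("c", 2.0, frozenset({"slow"})),
    ]
    print(schedule(demo_tasks, demo_nodes))
    # {'a': 'fast', 'b': 'slow', 'c': 'fast'}
```

In this toy run, task `c` is placed on the fast node even though its data resides on the slow one, because the slow node is already busy with `b`. A scheduler that considered locality alone would have queued it behind `b`.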