Scheduling On-demand Data Streaming for Cyberinfrastructure Applications with Constraints of Storage and Bandwidth

Wen Zhang, Junwei Cao, Lianchen Liu, Cheng Wu
2007-01-01
Abstract: Cyberinfrastructure is proposed by the US National Science Foundation (NSF) as the new century’s infrastructure for scientific research and discovery. It aims to provide cyberenvironments that enable scientific applications to run on cyberresources, e.g. high performance computers, data archives, software services, distributed communities, telescopes and observatories. This work focuses on cyberinfrastructure applications with data streaming requirements. Since the volumes of data streams are usually extremely high while available bandwidth and storage are often very limited, this work implements a data streaming environment for cyberinfrastructure applications that supports on-demand data transfers and just-in-time data cleanups. The implementation includes performance sensors, predictors and a scheduler, providing real-time measurement, prediction and scheduling capabilities. GridFTP is used for data transfers, and transfer schedules are carried out by adjusting GridFTP parallelism on the fly. Experimental results show that the data streaming environment scales well with respect to storage usage and adapts to dynamically changing application data processing requirements, especially under constraints of limited storage and bandwidth.
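The scheduling idea summarized in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the Stream fields, the plan_parallelism function, the fixed per-channel throughput and the planning horizon are all hypothetical assumptions chosen for illustration. The sketch takes predicted consumption rates (standing in for the predictors), requests only the data each application would need soon (on-demand transfer), scales requests down to fit free storage, and hands out a limited number of parallel transfer channels to the streams closest to running dry.

```python
import math
from dataclasses import dataclass


@dataclass
class Stream:
    name: str
    buffered_mb: float    # unprocessed data currently held in local storage
    consume_mb_s: float   # predicted application processing rate (assumed given)


def plan_parallelism(streams, free_storage_mb, per_channel_mb_s,
                     max_channels, horizon_s=10.0):
    """Return {stream name: number of parallel channels} for the next interval.

    Hypothetical model: each channel delivers per_channel_mb_s, and at most
    max_channels channels fit in the available bandwidth.
    """
    # On-demand: request only what each application would consume within the
    # horizon beyond what it already has buffered.
    demand = {s.name: max(0.0, s.consume_mb_s * horizon_s - s.buffered_mb)
              for s in streams}

    # Storage constraint: scale all requests down so they fit in free space.
    total = sum(demand.values())
    if total > free_storage_mb and total > 0:
        scale = free_storage_mb / total
        demand = {name: mb * scale for name, mb in demand.items()}

    # Bandwidth constraint: serve streams in order of time until the buffer
    # is empty, so the most urgent stream gets channels first.
    def time_to_empty(s):
        return s.buffered_mb / s.consume_mb_s if s.consume_mb_s > 0 else math.inf

    plan = {}
    remaining = max_channels
    for s in sorted(streams, key=time_to_empty):
        needed_rate = demand[s.name] / horizon_s
        channels = min(remaining, math.ceil(needed_rate / per_channel_mb_s))
        plan[s.name] = channels
        remaining -= channels
    return plan
```

In a real deployment the returned channel counts would be mapped to the parallelism setting of each GridFTP transfer and recomputed as new measurements arrive. Data already processed by the application would be deleted before each planning round, which is what frees the storage that free_storage_mb represents here.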