Prospects for Wideband VLBI Correlation in the Cloud

Ajay Gill, Lindy Blackburn, Arash Roshanineshat, Chi-Kwan Chan, Sheperd S. Doeleman, Michael D. Johnson, Alexander W. Raymond, Jonathan Weintroub
DOI: https://doi.org/10.1088/1538-3873/ab32a8
2019-08-12
Abstract: This paper proposes a cloud architecture for the correlation of wide-bandwidth VLBI data. Cloud correlation facilitates processing of entire experiments in parallel using flexibly allocated and practically unlimited compute resources. This approach offers a potential improvement over dedicated correlation clusters, which are constrained by a fixed number of installed processor nodes and playback units. Additionally, cloud storage offers an alternative to maintaining a fleet of hard-disk drives that might be utilized intermittently. We describe benchmarks of VLBI correlation using the DiFX-2.5.2 software on the Google Cloud Platform to assess cloud-based correlation performance. The number of virtual CPUs per Virtual Machine was varied to determine the optimum configuration of cloud resources. The number of stations was varied to determine how correlation time scales with VLBI arrays of different sizes. Data transfer rates from Google Cloud Storage to the Virtual Machines performing the correlation were also measured. We also present an example cloud correlation configuration. Current cloud service and equipment pricing data are used to compile cost estimates, allowing an approximate economic comparison between cloud and cluster processing. The economic comparisons are based on cost figures that are a moving target and are highly dependent on factors such as the utilization of cluster and media, which are challenging to estimate. Our model suggests that shifting to the cloud is an alternative path for high data rate, low duty cycle wideband VLBI correlation that should continue to be explored. In the production phase of VLBI correlation, the cloud has the potential to significantly reduce data processing times and allow the processing of more science experiments in a given year for the petabyte-scale data sets increasingly common in both astronomy and geodesy VLBI applications.
Instrumentation and Methods for Astrophysics