Exoshuffle-CloudSort

Frank Sifei Luan, Stephanie Wang, Samyukta Yagati, Sean Kim, Kenneth Lien, Isaac Ong, Tony Hong, SangBin Cho, Eric Liang, Ion Stoica
DOI: https://doi.org/10.48550/arXiv.2301.03734
2023-01-10
Abstract: We present Exoshuffle-CloudSort, a sorting application running on top of Ray using the Exoshuffle architecture. Exoshuffle-CloudSort runs on Amazon EC2, with input and output data stored on Amazon S3. Using 40 i4i.4xlarge workers, Exoshuffle-CloudSort completes the 100 TB CloudSort Benchmark (Indy category) in 5378 seconds, with an average total cost of $97.
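To put the reported figures in perspective, the following is a rough back-of-the-envelope sketch in Python, not part of the paper. It derives aggregate and per-worker sort throughput and cost per terabyte from the numbers in the abstract. It assumes 100 TB means 10^14 bytes (decimal terabytes) and that all 40 workers are busy for the full 5378 seconds; the function and constant names are illustrative only.

```python
# Figures taken from the abstract.
DATA_BYTES = 100 * 10**12   # assumption: 100 TB in decimal units (10^14 bytes)
NUM_WORKERS = 40            # i4i.4xlarge workers
RUNTIME_S = 5378            # reported job completion time in seconds
TOTAL_COST_USD = 97         # reported average total cost


def aggregate_throughput_gb_s() -> float:
    """Cluster-wide sort throughput in GB/s (decimal gigabytes)."""
    return DATA_BYTES / RUNTIME_S / 10**9


def per_worker_throughput_mb_s() -> float:
    """Average sort throughput per worker in MB/s (decimal megabytes)."""
    return DATA_BYTES / RUNTIME_S / NUM_WORKERS / 10**6


def cost_per_tb_usd() -> float:
    """Average cost in USD to sort one terabyte."""
    return TOTAL_COST_USD / (DATA_BYTES / 10**12)


def worker_hours() -> float:
    """Total worker time consumed, assuming every worker runs for the whole job."""
    return NUM_WORKERS * RUNTIME_S / 3600


if __name__ == "__main__":
    print(f"aggregate throughput: {aggregate_throughput_gb_s():.2f} GB/s")
    print(f"per-worker throughput: {per_worker_throughput_mb_s():.1f} MB/s")
    print(f"cost per TB: ${cost_per_tb_usd():.2f}")
    print(f"worker-hours: {worker_hours():.1f}")
```

Under these assumptions the run sorts roughly 18.6 GB/s across the cluster, or about 465 MB/s per worker, at just under one dollar per terabyte. These are sort-rate figures only; the actual I/O volume per worker is higher, since data is read from and written back to Amazon S3 and shuffled between workers.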
Subjects: Distributed, Parallel, and Cluster Computing; Operating Systems