HPC AI500: Representative, Repeatable and Simple HPC AI Benchmarking

Zihan Jiang, Wanling Gao, Fei Tang, Xingwang Xiong, Lei Wang, Chuanxin Lan, Chunjie Luo, Hongxiao Li, Jianfeng Zhan
DOI: https://doi.org/10.48550/arXiv.2102.12848
2021-02-25
Performance
Abstract: Recent years have witnessed a trend of applying large-scale distributed deep learning algorithms (HPC AI) in both business and scientific computing, with the goal of reducing the training time needed to reach state-of-the-art quality. HPC AI benchmarks accelerate this process. Unfortunately, benchmarking HPC AI systems at scale raises serious challenges. This paper presents a representative, repeatable, and simple HPC AI benchmarking methodology. From the seventeen AI workloads of AIBench Training, by far the most comprehensive AI training benchmark suite, we choose two representative and repeatable AI workloads. The selected HPC AI benchmarks cover both business and scientific computing: Image Classification and Extreme Weather Analytics. To rank HPC AI systems, we present a new metric named Valid FLOPS, which emphasizes both throughput performance and a target quality. The specification, source code, datasets, and HPC AI500 ranking numbers are publicly available from https://www.benchcouncil.org/HPCAI500/.