AI Matrix: A Deep Learning Benchmark for Alibaba Data Centers

Wei Zhang, Wei Wei, Lingjie Xu, Lingling Jin, Cheng Li
DOI: https://doi.org/10.48550/arXiv.1909.10562
2019-09-24
Abstract: Alibaba has China's largest e-commerce platform. To support its diverse businesses, Alibaba has its own large-scale data centers providing the computing foundation for a wide variety of software applications. Among these applications, deep learning (DL) has been playing an important role in delivering services like image recognition, object detection, text recognition, recommendation, and language processing. To build more efficient data centers that deliver higher performance for these DL applications, it is important to understand their computational needs and use that information to guide the design of future computing infrastructure. An effective way to achieve this is through benchmarks that can fully represent Alibaba's DL applications.
Machine Learning, Performance
What problem does this paper attempt to address?
This paper addresses the mismatch between the computational requirements of deep learning (DL) applications and the infrastructure design of Alibaba's data centers. Specifically, existing deep-learning benchmarks such as MLPerf, DeepBench, and DAWNBench cannot fully represent the deep-learning workloads specific to Alibaba's e-commerce environment. These tools are either too general, rely on model collections that are obsolete and narrow, or focus on performance testing for specific tasks, so they do not comprehensively reflect the characteristics of Alibaba's DL applications. To close this gap, the paper introduces AI Matrix, Alibaba's in-house deep-learning benchmark suite. AI Matrix covers the typical DL applications that account for more than 90% of GPU usage in Alibaba's data centers, grouped into three major categories: computer vision, recommendation systems, and language processing. Because of this high coverage and its close similarity to the actual applications, AI Matrix can fully represent the DL workloads in Alibaba's data centers and can therefore guide future hardware selection, the identification of performance bottlenecks, and the optimization of software and hardware systems. In addition, most of the benchmarks in AI Matrix (17 out of 20) are open to the public, to benefit hardware vendors, industry, and research institutions.