Optimizing NVMe Storage for Large-scale Deployment: Key Technologies and Strategies in Alibaba Cloud

Yiquan Chen, Yuan Xie, Yijing Wang, Jiexiong Xu, Zhen Jin, Anyu Li, Xiaoyan Fu, Qiang Liu, Wenzhi Chen
DOI: https://doi.org/10.1109/mm.2024.3426514
IF: 2.8212
2024-01-01
IEEE Micro
Abstract: Non-Volatile Memory Express (NVMe) storage offers very high performance and has been widely adopted in cloud data centers, where its importance continues to grow. Despite this adoption, large-scale deployments still face numerous challenges, such as optimizing performance, managing cost-effectiveness, and ensuring reliability. At Alibaba Cloud, we have designed numerous software-hardware co-design techniques tailored for NVMe storage deployment. These include hardware-assisted NVMe virtualization techniques developed explicitly for bare-metal instances and virtual machines (VMs): BM-Store is a novel hardware-assisted virtualization architecture for bare-metal instances, and the Cloud Infrastructure Processing Unit incorporates an embedded virtualization acceleration unit for VMs. LightPool, an NVMe over Fabrics-based high-performance storage pool architecture, enhances resource utilization for cloud-native distributed databases. Additionally, we discuss the technical challenges and opportunities that NVMe storage presents for serverless computing and artificial intelligence.