Ditto: Elastic Confidential VMs with Secure and Dynamic CPU Scaling

Shixuan Zhao, Mengyuan Li, Mengjia Yan, Zhiqiang Lin
2024-09-24
Abstract: Confidential Virtual Machines (CVMs) are a type of VM-based Trusted Execution Environments (TEEs) designed to enhance the security of cloud-based VMs, safeguarding them even from malicious hypervisors. Although CVMs have been widely adopted by major cloud service providers, current CVM designs face significant challenges in runtime resource management due to their fixed capacities and lack of transparency. These limitations hamper efficient cloud resource management, leading to increased operational costs and reduced agility in responding to fluctuating workloads. This paper introduces a dynamic CPU resource management approach, featuring the novel concept of "Elastic CVM." This approach allows for hypervisor-assisted runtime adjustment of CPU resources using a specialized vCPU type, termed Worker vCPU. This new approach enhances CPU resource adaptability and operational efficiency without compromising security. Additionally, we introduce a Worker vCPU Abstraction Layer to simplify Worker vCPU deployment and management. To demonstrate the effectiveness of our approach, we have designed and implemented a serverless computing prototype platform, called Ditto. We show that Ditto significantly improves performance and efficiency through finer-grained resource management. The concept of "Elastic CVM" and the Worker vCPU design not only optimize cloud resource utilization but also pave the way for more flexible and cost-effective confidential computing environments.
Cryptography and Security
What problem does this paper attempt to address?
The problem this paper addresses is how to achieve dynamic and secure resource management in Confidential Virtual Machines (CVMs) so that they can adapt to rapidly changing workloads. Existing CVM designs face significant challenges in runtime resource management because of their fixed capacity and lack of transparency. The result is inefficient cloud resource management, higher operating costs, and a limited ability to handle fluctuating workloads.

### Problem Background

Confidential Virtual Machines (CVMs) are a type of VM-based Trusted Execution Environment (TEE). They aim to enhance the security of virtual machines in the cloud and to protect the confidentiality and integrity of data even against a malicious hypervisor. Although CVMs have been widely adopted by major cloud service providers, current designs have significant limitations in runtime resource management:

1. **Fixed Capacity**: The capacity of a CVM is usually set at initialization and remains unchanged during operation.
2. **Lack of Transparency**: A CVM is deliberately kept as a "black box" at runtime, which makes resource management more complex.
3. **Slow Remote Attestation**: Starting a new CVM requires complex remote attestation, which makes rapid scale-out difficult.
4. **vCPU State Protection**: The state and number of vCPUs are strictly protected by hardware at runtime, which increases management latency and complexity.
5. **Slow Live Migration**: Because CPU hardware verification and inspection are required, live migration of CVMs is more costly and time-consuming.

These limitations make it difficult for CVMs to meet the agile resource management requirements of modern cloud computing, especially when workload intensity fluctuates widely.
### Solution

To overcome these problems, the paper proposes the concept of "Elastic CVM" and introduces a new type of vCPU called "Worker vCPU". The solution has three parts:

1. **Elastic CVM**: The CPU resources allocated to a CVM are adjusted dynamically through coordination between the CVM and the hypervisor, which improves the adaptability and efficiency of the CVM.
2. **Worker vCPU Design**: A Worker vCPU can remain dormant during low workloads, saving physical CPU resources. When the workload increases, it can be activated quickly, so resources are managed efficiently without sacrificing security.
3. **Worker vCPU Abstraction Layer**: This layer simplifies the deployment and management of Worker vCPUs. It provides clear principles for selecting suitable applications and the software components needed to deploy and manage Worker vCPUs securely and efficiently.

In addition, the paper develops a serverless computing prototype platform named Ditto. It shows how the Worker vCPU design enables rapid auto-scaling in a serverless environment, significantly improving performance and resource utilization.

### Summary

The core problem of this paper is how to achieve dynamic and secure resource management in confidential virtual machines so that they can adapt to rapidly changing workloads. By introducing the Elastic CVM and the Worker vCPU design, the paper addresses key limitations of existing CVM designs and improves the efficiency and flexibility of cloud resource management.
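
The Worker vCPU behavior described in the Solution section (dormant under low load, activated with hypervisor assistance when load rises) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the paper's implementation: the class names `Hypervisor` and `ElasticCVM`, the method `scale_to`, and the policy of one Worker vCPU per pending task are all hypothetical. The sketch also assumes that the number of Worker vCPUs is fixed at launch and that only their dormant/active state changes at runtime.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    DORMANT = "dormant"  # occupies no physical CPU
    ACTIVE = "active"    # currently backed by a physical CPU


@dataclass
class Hypervisor:
    """Hypothetical stand-in for the host: it only tracks free physical CPUs."""
    free_pcpus: int

    def grant(self, wanted: int) -> int:
        # Hand out as many physical CPUs as requested, up to what is free.
        granted = min(wanted, self.free_pcpus)
        self.free_pcpus -= granted
        return granted

    def reclaim(self, count: int) -> None:
        # Physical CPUs released by dormant workers return to the pool.
        self.free_pcpus += count


@dataclass
class ElasticCVM:
    """Toy CVM: Worker vCPU count is fixed at launch (assumption)."""
    hypervisor: Hypervisor
    num_workers: int
    workers: list = field(init=False)

    def __post_init__(self) -> None:
        self.workers = [State.DORMANT] * self.num_workers

    def active(self) -> int:
        return self.workers.count(State.ACTIVE)

    def scale_to(self, pending_tasks: int) -> int:
        """Assumed policy: one active Worker vCPU per pending task."""
        target = min(pending_tasks, self.num_workers)
        current = self.active()
        if target > current:
            granted = self.hypervisor.grant(target - current)
            self._flip(State.DORMANT, State.ACTIVE, granted)
        elif target < current:
            released = current - target
            self._flip(State.ACTIVE, State.DORMANT, released)
            self.hypervisor.reclaim(released)
        return self.active()

    def _flip(self, old: State, new: State, count: int) -> None:
        # Move up to `count` workers from state `old` to state `new`.
        for i, state in enumerate(self.workers):
            if count == 0:
                break
            if state is old:
                self.workers[i] = new
                count -= 1


hv = Hypervisor(free_pcpus=3)
cvm = ElasticCVM(hv, num_workers=4)
after_light_load = cvm.scale_to(2)   # two workers wake up
after_burst = cvm.scale_to(10)       # capped by workers and by free pCPUs
after_idle = cvm.scale_to(0)         # all workers go dormant again
```

In this model, scaling never creates or destroys a vCPU. It only changes which of the pre-provisioned Worker vCPUs hold a physical CPU, and the host can refuse part of a request when it has no capacity left.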