QTLS: High-Performance TLS Asynchronous Offload Framework with Intel® QuickAssist Technology

Xiaokang Hu, Changzheng Wei, Jian Li, Brian Will, Ping Yu, Lu Gong, Haibing Guan
DOI: https://doi.org/10.1145/3293883.3295705
2019-01-01
Abstract: Hardware accelerators are a promising solution to optimize the Total Cost of Ownership (TCO) of cloud datacenters. This paper targets the costly Transport Layer Security (TLS) protocol and investigates TLS acceleration for widely deployed event-driven TLS servers and terminators. Our study reveals an important fact: straightforward offloading of TLS-related crypto operations suffers from frequent, long-lasting blocking in the offload I/O, which leaves both CPU and accelerator resources underutilized. To achieve efficient TLS acceleration for the event-driven web architecture, we propose QTLS, a high-performance TLS asynchronous offload framework based on Intel® QuickAssist Technology (QAT). QTLS re-engineers the TLS software stack and divides TLS offloading into four phases to eliminate blocking. Multiple crypto operations from different TLS connections can then be offloaded concurrently in one process/thread, which boosts performance. Moreover, QTLS includes a heuristic polling scheme to retrieve accelerator responses efficiently and promptly, and a kernel-bypass notification scheme to avoid expensive switches between user mode and kernel mode when delivering async events. A comprehensive evaluation shows that QTLS can provide up to 9x the connections per second (CPS) with TLS-RSA (2048-bit), 2x the secure data transfer throughput, and an 85% reduction in average response time compared to the software baseline.
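The contrast between blocking and asynchronous offload can be illustrated with a small toy model. The sketch below is not the QTLS implementation and does not use any QAT API: `FakeAccelerator`, `handshake`, and `run` are hypothetical names, the accelerator is simulated with a tick counter, and the "crypto" is a string reversal. It only shows the general idea that one thread can submit operations for several connections, pause each connection at its offload point, and resume it when a poll finds the response.

```python
from collections import deque


class FakeAccelerator:
    """Hypothetical stand-in for a crypto accelerator (not a QAT API).

    Every request completes `latency` ticks after it is submitted.
    """

    def __init__(self, latency=3):
        self.latency = latency
        self.now = 0
        self.inflight = deque()  # entries: (ready_tick, conn_id, result)

    def submit(self, conn_id, payload):
        # Non-blocking: queue the request and return immediately.
        # Reversing the string is a placeholder for real crypto work.
        self.inflight.append((self.now + self.latency, conn_id, payload[::-1]))

    def tick(self):
        self.now += 1

    def poll(self):
        # Collect every response that is ready at the current tick.
        done = []
        while self.inflight and self.inflight[0][0] <= self.now:
            _, conn_id, result = self.inflight.popleft()
            done.append((conn_id, result))
        return done


def handshake(conn_id, accel):
    """Per-connection handler written as a generator.

    The steps here are illustrative and are not the paper's four phases.
    """
    accel.submit(conn_id, f"secret-{conn_id}")  # submit the crypto request
    result = yield  # pause here; the loop is free to serve other connections
    return f"conn {conn_id} done with {result}"  # finish after the response


def run(num_conns=4, latency=3):
    accel = FakeAccelerator(latency)
    paused = {}
    finished = []
    for cid in range(num_conns):
        gen = handshake(cid, accel)
        next(gen)  # advance to the offload point
        paused[cid] = gen
    ticks = 0
    while paused:
        accel.tick()
        ticks += 1
        for cid, result in accel.poll():
            try:
                paused.pop(cid).send(result)  # resume the paused connection
            except StopIteration as stop:
                finished.append(stop.value)
    return finished, ticks


if __name__ == "__main__":
    results, elapsed = run(num_conns=4, latency=3)
    print(results, elapsed)
```

In this model, four connections with a simulated latency of 3 ticks all complete after 3 ticks, because their requests are in flight at the same time. A blocking design that waited for each response before submitting the next request would need 12 ticks for the same work.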