Leveraging Page-Level Compression in MySQL - A Practice at Baidu

Jingwei Ma, Boxue Yin, Zhi Kong, Yuxiang Ma, Chang Chen, Long Wang, Gang Wang, Xiaoguang Liu
DOI: https://doi.org/10.1109/TrustCom.2016.0179
2016-01-01
Abstract: With large-scale data sets, disk I/O remains one of the bottlenecks in a DBMS. Meanwhile, CPU resources are not fully utilized. Compression is therefore introduced to make use of the spare computing resources and to substantially reduce storage overhead. Commonly used compression algorithms can also improve performance when the database runs on HDD. On SSD, however, both read and write performance can be negatively affected by slow compression and decompression. By quantitatively analyzing the impact of compression, we propose a balanced compression solution for SSD: read performance is accelerated by using a compression algorithm (lz4hc) with extremely high decompression speed, and an asynchronous compression mechanism reduces write latency by moving compression to the background. We test the performance on a real data set collected from online database systems at Baidu. The results show that read performance on SSD improves by 25% compared with the uncompressed database and by 36% compared with the commonly used zlib compression. Meanwhile, write performance is up to 20% and 33% better than synchronous compression with lz4hc and zlib, respectively.
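
The asynchronous compression idea in the abstract can be illustrated with a minimal sketch. This is not the authors' MySQL implementation: the class `AsyncPageStore` and its methods are hypothetical names, pages live in an in-memory dictionary rather than on SSD, and the standard-library `zlib` module stands in for lz4hc only because it ships with Python. The sketch shows a write returning before compression happens, with a background thread compressing the page afterwards.

```python
import queue
import threading
import zlib


class AsyncPageStore:
    """Toy page store (hypothetical): writes return at once, compression runs later."""

    def __init__(self):
        self._pages = {}                  # page_id -> (is_compressed, payload)
        self._lock = threading.Lock()
        self._pending = queue.Queue()     # page ids waiting for background compression
        worker = threading.Thread(target=self._compress_loop, daemon=True)
        worker.start()

    def write(self, page_id, data):
        # Store the raw page and return; compression is off the write path.
        with self._lock:
            self._pages[page_id] = (False, data)
        self._pending.put(page_id)

    def read(self, page_id):
        # Decompress only if the background worker has already packed the page.
        with self._lock:
            compressed, payload = self._pages[page_id]
        return zlib.decompress(payload) if compressed else payload

    def is_compressed(self, page_id):
        with self._lock:
            return self._pages[page_id][0]

    def flush(self):
        # Block until every queued page has been processed by the worker.
        self._pending.join()

    def _compress_loop(self):
        while True:
            page_id = self._pending.get()
            try:
                with self._lock:
                    compressed, payload = self._pages[page_id]
                if not compressed:
                    packed = zlib.compress(payload)  # zlib is a stand-in for lz4hc
                    with self._lock:
                        # Swap in the packed page only if it was not rewritten meanwhile.
                        if self._pages[page_id] == (False, payload):
                            self._pages[page_id] = (True, packed)
            finally:
                self._pending.task_done()


store = AsyncPageStore()
page = b"baidu" * 4000
store.write(1, page)          # returns without waiting for compression
store.flush()                 # wait for the background worker (for demonstration only)
```

In this sketch a reader always gets the original bytes back, whether or not the background worker has reached the page yet; the trade-off is that pages temporarily occupy their uncompressed size until the worker catches up.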