Finding the Optimal Execution Scheme of External Mergesort on Solid State Drives

Chen Yubiao, Li Jianzhong, Gao Hong
DOI: https://doi.org/10.1007/s11280-021-00872-9
2021-01-01
World Wide Web
Abstract: Flash-based solid-state drives (SSDs) are gradually replacing mechanical hard disk drives (HDDs) as mainstream storage. Unlike HDDs, SSDs have rich internal parallelism, which gives them performance characteristics that HDDs lack. External mergesort, the classical external sorting algorithm adopted in many systems and algorithms, has an important impact on overall performance, so improving its efficiency is of great significance. However, relatively little research has addressed optimizing the raw external mergesort algorithm on SSDs. Based on the characteristics of SSDs, this paper proposes the SortDecision algorithm, which calculates the optimal execution scheme for external mergesort: the merging way, the read buffer size, and the write buffer size, which together determine the execution process. Running external mergesort with this optimal scheme yields better efficiency. In SortDecision, the external mergesort problem on SSDs is formalized and transformed into a piecewise convex optimization problem; the optimal scheme is then obtained by enumerating the solutions of the convex subproblems. Experimental results show that external mergesort guided by SortDecision achieves a speedup of 1 to 6.7 times over the traditional external mergesort algorithm when memory is limited. The richer the internal parallelism of the SSD, the greater the acceleration from SortDecision.
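To make the idea of an "execution scheme" concrete, the sketch below enumerates candidate merging ways and buffer splits under a fixed memory budget and keeps the cheapest one. It is only an illustration and not the paper's SortDecision algorithm: the cost model (a fixed latency per I/O request plus a per-page transfer time) and all names (`passes_needed`, `io_time`, `best_scheme`, `latency`, `per_page`) are assumptions introduced here, and the model ignores SSD internal parallelism, which the paper's formulation is built around.

```python
import math


def passes_needed(num_runs: int, fan_in: int) -> int:
    """Number of merge passes needed to reduce num_runs sorted runs to one."""
    passes = 0
    while num_runs > 1:
        num_runs = math.ceil(num_runs / fan_in)
        passes += 1
    return passes


def io_time(total_pages: int, buffer_pages: int,
            latency: float, per_page: float) -> float:
    """Toy cost (assumption): each request moves buffer_pages pages
    and pays one fixed latency, plus a per-page transfer time."""
    requests = math.ceil(total_pages / buffer_pages)
    return requests * latency + total_pages * per_page


def best_scheme(total_pages: int, num_runs: int, memory_pages: int,
                latency: float = 100.0, per_page: float = 10.0):
    """Brute-force search over (merging way, read buffer, write buffer).

    Memory is split into fan_in read buffers of r pages each and one
    write buffer of w pages, so fan_in * r + w == memory_pages.
    Returns (cost, fan_in, r, w), or None if no scheme is feasible.
    """
    best = None
    for fan_in in range(2, num_runs + 1):
        passes = passes_needed(num_runs, fan_in)
        for r in range(1, memory_pages // fan_in + 1):
            w = memory_pages - fan_in * r
            if w < 1:
                break  # no room left for the write buffer
            # Every pass reads and writes the whole data set once.
            cost = passes * (io_time(total_pages, r, latency, per_page)
                             + io_time(total_pages, w, latency, per_page))
            if best is None or cost < best[0]:
                best = (cost, fan_in, r, w)
    return best


if __name__ == "__main__":
    print(best_scheme(total_pages=1000, num_runs=16, memory_pages=64))
```

The search shows the trade-off the abstract describes: a larger merging way reduces the number of passes but leaves less memory per buffer, which increases the number of I/O requests in each pass. The paper replaces this kind of exhaustive enumeration with a piecewise convex formulation whose subproblems are solved individually.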