Reclaiming memory for lock-free data structures: there has to be a better way
Trevor Brown
DOI: https://doi.org/10.48550/arXiv.1712.01044
2017-12-04
Distributed, Parallel, and Cluster Computing
Abstract: Memory reclamation for lock-based data structures is typically easy. However, it is a significant challenge for lock-free data structures. Automatic techniques such as garbage collection are inefficient or use locks, and non-automatic techniques either have high overhead or do not work for many data structures. For example, subtle problems can arise when hazard pointers, one of the most common non-automatic techniques, are applied to many lock-free data structures. Epoch-based reclamation (EBR), which is by far the most efficient non-automatic technique, allows the number of unreclaimed objects to grow without bound, because one crashed process can prevent all other processes from reclaiming memory. We develop a more efficient, distributed variant of EBR that solves this problem. It is based on signaling, which is provided by many operating systems, such as Linux and UNIX. Our new scheme takes $O(1)$ amortized steps per high-level operation on the data structure and $O(1)$ steps in the worst case each time an object is removed from the data structure. At any point, $O(mn^2)$ objects are waiting to be freed, where $n$ is the number of processes and $m$ is a small constant for most data structures. Experiments show that our scheme has very low overhead: on average 10%, and at worst 28%, for a balanced binary search tree over many thread counts, operation mixes and contention levels. Our scheme also outperforms a highly tuned implementation of hazard pointers by an average of 75%. Typically, memory reclamation is tightly woven into lock-free data structure code. To improve modularity and facilitate the comparison of different memory reclamation schemes, we also introduce a highly flexible abstraction. It allows a programmer to easily interchange schemes for reclamation, object pooling, allocation and deallocation with virtually no overhead, by changing a single line of code.
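The unbounded-garbage problem that the abstract attributes to EBR can be illustrated with a small simulation. The sketch below is not the scheme developed in the paper and involves no signaling; it is a single-threaded, deterministic model of textbook epoch bookkeeping, and every name in it (`EpochSim`, `start_op`, `end_op`, `retire`, `try_advance`) is hypothetical. It assumes the common rule that the global epoch may advance only when every process currently inside an operation has announced the current epoch, and that an object retired in epoch $e$ may be freed once the epoch reaches $e + 2$.

```python
class EpochSim:
    """Single-threaded model of classic EBR bookkeeping (illustrative only)."""

    def __init__(self, num_processes):
        self.epoch = 0
        # announced[p] is the epoch process p saw when it started its
        # current operation, or None if p is not inside an operation.
        self.announced = [None] * num_processes
        self.limbo = {}   # epoch -> objects retired during that epoch
        self.freed = []   # objects that have been reclaimed

    def start_op(self, pid):
        self.announced[pid] = self.epoch

    def end_op(self, pid):
        self.announced[pid] = None

    def retire(self, obj):
        # The object was unlinked from the data structure but may still be
        # referenced by operations that started earlier.
        self.limbo.setdefault(self.epoch, []).append(obj)

    def try_advance(self):
        # One process stuck in an old epoch blocks the advance for everyone.
        if any(e is not None and e != self.epoch for e in self.announced):
            return False
        self.epoch += 1
        # Objects retired at least two epochs ago can no longer be reached.
        for old in [e for e in self.limbo if e <= self.epoch - 2]:
            self.freed.extend(self.limbo.pop(old))
        return True

    def unreclaimed(self):
        return sum(len(objs) for objs in self.limbo.values())


sim = EpochSim(num_processes=2)
sim.start_op(0)                 # process 0 starts an operation, then stalls

for obj in range(100):          # process 1 keeps working and retiring objects
    sim.start_op(1)
    sim.retire(obj)
    sim.end_op(1)
    sim.try_advance()

stalled = sim.unreclaimed()     # nothing was freed while process 0 was stuck

sim.end_op(0)                   # process 0 finally finishes its operation
sim.try_advance()
sim.try_advance()
recovered = sim.unreclaimed()   # the backlog drains after two advances

print(stalled, recovered)
```

In this model the backlog grows linearly with the number of retirements for as long as process 0 stays inside its operation, and it would never drain if process 0 had crashed. That is the behavior the abstract describes as the motivation for bounding the number of objects waiting to be freed.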