Abstract: Range locks are a synchronization construct designed to give multiple threads (or processes) concurrent access to disjoint parts of a shared resource. Originally conceived in the file system context, range locks are attracting growing interest in the Linux kernel community, which seeks to alleviate bottlenecks in the virtual memory management subsystem. The existing implementation of range locks in the kernel, however, uses an internal spin lock to protect the underlying tree structure that keeps track of acquired and requested ranges. This spin lock itself becomes a point of contention when the range lock is acquired frequently. Furthermore, where and exactly how specific (refined) ranges can be locked remains an open question. In this paper, we make two independent but related contributions. First, we propose an alternative approach for building range locks based on linked lists. The lists are easy to maintain in a lock-less fashion, and in fact our range locks do not use any internal locks in the common case. Second, we show how the range of the lock can be refined in the mprotect operation through a speculative mechanism. This refinement, in turn, allows concurrent execution of mprotect operations on non-overlapping memory regions. We implement our new algorithms and demonstrate their effectiveness in user space and kernel space, achieving up to 9$\times$ speedup compared to the stock version of the Linux kernel. Beyond the virtual memory management subsystem, we discuss other applications of range locks in parallel software. As a concrete example, we show how range locks can be used to facilitate the design of scalable concurrent data structures, such as skip lists.
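To make the range-lock semantics concrete, the following is a minimal sketch in Python, not the algorithm proposed in the paper. It assumes half-open integer ranges `[start, end)` and keeps the held ranges in a plain list guarded by a single condition variable, so it deliberately has the kind of internal lock that the paper's lock-less linked-list design avoids. The class name `RangeLock` and its methods are hypothetical names chosen for illustration.

```python
import threading


class RangeLock:
    """Illustrative range lock over half-open ranges [start, end).

    Assumption: one internal condition variable guards the list of held
    ranges. This shows the semantics only; it is NOT the lock-less
    list-based design described in the abstract.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._held = []  # (start, end) tuples currently locked

    @staticmethod
    def _overlaps(a, b):
        # Two half-open ranges overlap when each starts before the other ends.
        return a[0] < b[1] and b[0] < a[1]

    def _conflicts(self, rng):
        return any(self._overlaps(rng, h) for h in self._held)

    def try_acquire(self, start, end):
        """Lock [start, end) if no held range overlaps it; never blocks."""
        if start >= end:
            raise ValueError("range must be non-empty")
        rng = (start, end)
        with self._cond:
            if self._conflicts(rng):
                return False
            self._held.append(rng)
            return True

    def acquire(self, start, end):
        """Block until [start, end) overlaps no held range, then lock it."""
        if start >= end:
            raise ValueError("range must be non-empty")
        rng = (start, end)
        with self._cond:
            while self._conflicts(rng):
                self._cond.wait()
            self._held.append(rng)

    def release(self, start, end):
        """Unlock a previously acquired range and wake any waiters."""
        with self._cond:
            self._held.remove((start, end))
            self._cond.notify_all()
```

In this sketch, two callers that lock `[0, 10)` and `[10, 20)` proceed concurrently, while a caller requesting `[5, 15)` has to wait until both have been released.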

Exploiting Fine-Grained Parallelism in Graph Traversal Algorithms Via Lock Virtualization on Multi-Core Architecture

Vlock: Lock Virtualization Mechanism for Exploiting Fine-Grained Parallelism in Graph Traversal Algorithms

Fine-Grained Parallel Betweenness Centrality Algorithm Without Lock Synchronization

Tuning the granularity of parallelism for distributed graph processing

Comparison of Lock Thrashing Avoidance Methods and Its Performance Implications for Lock Design

Understanding Parallelism in Graph Traversal on Multi-Core Clusters

Reducing Scalability Collapse Via Requester-Based Locking on Multicore Systems

A Topology-Aware Framework for Graph Traversals

Parallelizing Sequential Network Applications with Customized Lock-Free Data Structures

A Topology-Adaptive Strategy for Graph Traversing

A multithreaded parallel Delaunay triangulation algorithm based on lock-free atomic operations

Lock-Visor: An Efficient Transitory Co-scheduling for MP Guest

Inferring Lockstep Behavior from Connectivity Pattern in Large Graphs

Real-Time Scheduling of Parallel Task Graphs With Critical Sections Across Different Vertices

Plock: A Fast Lock for Architectures with Explicit Inter-core Message Passing

Protecting Synchronization Mechanisms of Parallel Big Data Kernels via Logging

ANOLE: A Profiling-Driven Adaptive Lock Waiter Detection Scheme for Efficient MP-guest Scheduling

Requester-Based Spin Lock: A Scalable and Energy Efficient Locking Scheme on Multicore Systems

On the Analysis of Parallel Real-Time Tasks With Spin Locks

A Scheduling Method for Avoiding Kernel Lock Thrashing on Multi-cores

Scalable Range Locks for Scalable Address Spaces and Beyond