Abstract: Lock-free concurrent algorithms guarantee that some concurrent operation will always make progress in a finite number of steps. Yet programmers prefer to treat concurrent code as if it were wait-free, guaranteeing that all operations always make progress. Unfortunately, designing wait-free algorithms is generally a very complex task, and the resulting algorithms are not always efficient. While obtaining efficient wait-free algorithms has been a long-time goal for the theory community, most non-blocking commercial code is only lock-free. This paper suggests a simple solution to this problem. We show that, for a large class of lock-free algorithms, under scheduling conditions which approximate those found in commercial hardware architectures, lock-free algorithms behave as if they are wait-free. In other words, programmers can keep on designing simple lock-free algorithms instead of complex wait-free ones, and in practice, they will get wait-free progress. Our main contribution is a new way of analyzing a general class of lock-free algorithms under a stochastic scheduler. Our analysis relates the individual performance of processes with the global performance of the system using Markov chain lifting between a complex per-process chain and a simpler system progress chain. We show that lock-free algorithms are not only wait-free with probability 1, but that in fact a general subset of lock-free algorithms can be closely bounded in terms of the average number of steps required until an operation completes. To the best of our knowledge, this is the first attempt to analyze progress conditions, typically stated in relation to a worst-case adversary, in a stochastic model capturing their expected asymptotic behavior.
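The distinction between system progress (lock-free) and per-process progress (wait-free) under a stochastic scheduler can be illustrated with a toy simulation. The sketch below is not the paper's model or analysis. It assumes a scheduler that picks the next process uniformly at random at every step, and a hypothetical lock-free counter in which each process alternates a read step and a compare-and-swap step. The function name `simulate` and its parameters are illustrative only.

```python
import random


def simulate(n_procs, total_steps, seed=0):
    """Toy model: read-then-CAS counter under a uniform random scheduler.

    Assumption (not from the paper): one scheduled step is either a read of
    the shared counter or a compare-and-swap attempt by the chosen process.
    """
    rng = random.Random(seed)
    counter = 0
    # Value each process last read; None means its next step is a read.
    last_read = [None] * n_procs
    completed = [0] * n_procs

    for _ in range(total_steps):
        p = rng.randrange(n_procs)  # scheduler picks a process uniformly
        if last_read[p] is None:
            last_read[p] = counter  # read step
        else:
            if last_read[p] == counter:
                # CAS succeeds: nobody changed the counter since the read.
                counter += 1
                completed[p] += 1
            # Success or failure, the process starts over with a fresh read.
            last_read[p] = None

    return counter, completed


if __name__ == "__main__":
    total, per_process = simulate(n_procs=4, total_steps=10_000)
    print("operations completed by the system:", total)
    print("operations completed per process:", per_process)
```

Under an adversarial scheduler, a single process in this loop could fail its compare-and-swap forever. Under the random scheduler assumed here, every process is expected to complete operations, which is the informal sense in which a lock-free algorithm can behave as if it were wait-free.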

Vlock: Lock Virtualization Mechanism for Exploiting Fine-Grained Parallelism in Graph Traversal Algorithms

Exploiting Fine-Grained Parallelism in Graph Traversal Algorithms Via Lock Virtualization on Multi-Core Architecture

Fine-Grained Parallel Betweenness Centrality Algorithm Without Lock Synchronization

Plock: A Fast Lock for Architectures with Explicit Inter-core Message Passing

Lock-Visor: An Efficient Transitory Co-scheduling for MP Guest

Reducing Scalability Collapse Via Requester-Based Locking on Multicore Systems

Comparison of Lock Thrashing Avoidance Methods and Its Performance Implications for Lock Design

On the Analysis of Parallel Real-Time Tasks With Spin Locks

Scalable Range Locks for Scalable Address Spaces and Beyond

ANOLE: A Profiling-Driven Adaptive Lock Waiter Detection Scheme for Efficient MP-guest Scheduling

Tuning the granularity of parallelism for distributed graph processing

Real-Time Scheduling of Parallel Task Graphs With Critical Sections Across Different Vertices

A Unified Blocking Analysis for Parallel Tasks With Spin Locks Under Global Fixed Priority Scheduling

A multithreaded parallel Delaunay triangulation algorithm based on lock-free atomic operations

Parallelizing Sequential Network Applications with Customized Lock-Free Data Structures

Are Lock-Free Concurrent Algorithms Practically Wait-Free?

Protecting Synchronization Mechanisms of Parallel Big Data Kernels via Logging

Requester-Based Spin Lock: A Scalable and Energy Efficient Locking Scheme on Multicore Systems

HaLock: Hardware-assisted lock contention detection in multithreaded applications

Libfork: portable continuation-stealing with stackless coroutines

Protecting Locks Against Unbalanced Unlock()