A new thread-level speculative automatic parallelization model and library based on duplicate code execution

Millán A. Martínez, Basilio B. Fraguela, José C. Cabaleiro, Francisco F. Rivera

DOI: https://doi.org/10.1007/s11227-024-05987-0

IF: 3.3

2024-03-13

The Journal of Supercomputing

Abstract: Loop-efficient automatic parallelization has become increasingly relevant due to the growing number of cores in current processors and the programming effort needed to parallelize codes in these systems efficiently. However, automatic tools fail to extract all the available parallelism in irregular loops with indirections, race conditions or potential data dependency violations, among many other possible causes. One of the successful ways to automatically parallelize these loops is the use of speculative parallelization techniques. This paper presents a new model and the corresponding C++ library that supports the speculative automatic parallelization of loops in shared memory systems, seeking competitive performance and scalability while keeping user effort to a minimum. The primary speculative strategy consists of redundantly executing chunks of loop iterations in a duplicate fashion. Namely, each chunk is executed speculatively in parallel to obtain results as soon as possible and sequentially in a different thread to validate the speculative results. The implementation uses C++11 threads and it makes intensive use of templates and advanced multithreading techniques. An evaluation based on various benchmarks confirms that our proposal provides a competitive level of performance and scalability.

computer science, theory & methods; engineering, electrical & electronic; hardware & architecture

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

This paper addresses the difficulty of automatically parallelizing code on current multi-core processors. It proposes a new thread-level speculative automatic parallelization model based on duplicate code execution, together with a corresponding C++ library (called SpecLib), aiming for competitive parallel performance and scalability while minimizing the user's programming effort.

**Main Issues:**

1. **Parallelization Difficulty**: As the number of processor cores increases, manually parallelizing code becomes increasingly complex and error-prone.
2. **Insufficient Existing Tools**: Existing automatic parallelization tools fail to extract all the available parallelism from irregular loops with indirections, race conditions, or potential data dependency violations.
3. **Performance Optimization Needs**: An efficient method is needed to parallelize these loops automatically while guaranteeing that the results are valid and consistent.

The paper addresses these issues with a speculative parallelization technique that executes each chunk of loop iterations twice: speculatively in parallel, to obtain results as soon as possible, and sequentially in a different thread, to validate the speculative results. This duplicate execution delivers results quickly while preserving correctness. The method can be applied to almost any loop, even one that cannot be analyzed at compile time or that has potential dependencies.
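The duplicate-execution idea can be illustrated with a minimal C++11 sketch. It is not SpecLib's API or the paper's actual algorithm: the names `Loop`, `run_sequential`, `run_logged`, `run_speculative`, and `run_duplicate` are hypothetical, the loop body is an invented irregular update, and the way a chunk is speculated here (two halves that each work on a private copy and log their writes) is an assumption made for illustration. On a failed validation the sketch simply finishes the loop sequentially, whereas a real runtime would presumably resume speculation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <utility>
#include <vector>

using State = std::vector<long>;
using WriteLog = std::vector<std::pair<std::size_t, long>>;

// Hypothetical irregular loop: iteration i reads data[src[i]], writes data[dst[i]].
struct Loop {
    std::vector<std::size_t> src, dst;
    std::size_t size() const { return src.size(); }
};

// Reference semantics: iterations [b, e) executed in order on a copy of s.
State run_sequential(State s, const Loop& lp, std::size_t b, std::size_t e) {
    for (std::size_t i = b; i < e; ++i) s[lp.dst[i]] = s[lp.src[i]] + 1;
    return s;
}

// Same iterations on a private copy, returning only the ordered list of writes.
WriteLog run_logged(State s, const Loop& lp, std::size_t b, std::size_t e) {
    WriteLog log;
    for (std::size_t i = b; i < e; ++i) {
        s[lp.dst[i]] = s[lp.src[i]] + 1;
        log.emplace_back(lp.dst[i], s[lp.dst[i]]);
    }
    return log;
}

// Speculative copy of a chunk: both halves start from the chunk's input and
// run concurrently, so the second half never sees the first half's writes.
State run_speculative(const State& in, const Loop& lp, std::size_t b, std::size_t e) {
    const std::size_t mid = b + (e - b) / 2;
    auto first = std::async(std::launch::async,
                            [&in, &lp, b, mid] { return run_logged(in, lp, b, mid); });
    const WriteLog second = run_logged(in, lp, mid, e);
    State out = in;
    for (const auto& w : first.get()) out[w.first] = w.second;  // merge in program order
    for (const auto& w : second) out[w.first] = w.second;
    return out;
}

// Each chunk runs twice: a speculative copy whose result immediately feeds the
// next chunk, and a sequential copy in another thread that validates it.
State run_duplicate(State init, const Loop& lp, std::size_t chunk) {
    assert(chunk > 0);
    std::vector<State> spec_out;
    std::vector<std::future<State>> checks;
    std::vector<std::size_t> ends;
    State cur = std::move(init);
    for (std::size_t b = 0; b < lp.size(); b += chunk) {
        const std::size_t e = std::min(b + chunk, lp.size());
        checks.push_back(std::async(std::launch::async,
            [cur, &lp, b, e] { return run_sequential(cur, lp, b, e); }));
        cur = run_speculative(cur, lp, b, e);
        spec_out.push_back(cur);
        ends.push_back(e);
    }
    // In-order validation: a chunk's input is trusted only once every earlier
    // chunk has matched, so the first mismatch yields a known-correct state.
    for (std::size_t k = 0; k < checks.size(); ++k) {
        State truth = checks[k].get();
        if (truth != spec_out[k])
            return run_sequential(std::move(truth), lp, ends[k], lp.size());
    }
    return cur;
}
```

When no iteration in the second half of a chunk reads a location written by the first half, every validation succeeds and the sequential copies of different chunks overlap in time, which is where the speedup in this sketch would come from. A loop with a dependence chain across halves fails validation on the first chunk and falls back to the sequential result, so the output always matches `run_sequential` over the whole iteration range.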

