Abstract: Data races are among the most important concurrency anomalies in multi-threaded programs. Emerging constraint-based techniques have been applied to race detection and can find every race that any other sound race detector can find. However, this constraint-based approach has serious limitations in helping programmers analyze and understand data races. First, it may report a large number of false positives because it does not recognize dataflow propagation in the program. Second, whenever a race is exposed during constraint solving, it recommends a wide range of thread context switches to schedule the reported race (including false ones). This ad hoc recommendation imposes too many context switches, which complicates data race analysis. To address these two limitations of state-of-the-art constraint-based race detection, this paper proposes DFTracker, an improved constraint-based race detector that recommends a schedule with minimal thread context switches for each data race. Specifically, we reduce false positives by analyzing and tracking the dataflow in the program, which in turn reduces the unnecessary analysis of false race schedules. We further propose a novel algorithm that recommends an effective race schedule with minimal thread context switches for each data race. Our experimental results on real applications demonstrate that 1) without removing any true data race, DFTracker prunes false positives by 68% in comparison with the state-of-the-art constraint-based race detector; and 2) DFTracker recommends only 2.6–8.3 (4.7 on average) thread context switches per data race in real-world applications, which is 81.6% fewer context switches per data race than the state-of-the-art constraint-based race detector. DFTracker can therefore serve as an effective tool for helping programmers understand data races.

Minimal Context-Switching Data Race Detection with Dataflow Tracking
