Phases, Modalities, Temporal and Spatial Locality: Domain Specific ML Prefetcher for Accelerating Graph Analytics

Pengmiao Zhang, Rajgopal Kannan, Viktor K. Prasanna
2023-09-25
Abstract: Memory performance is a bottleneck in graph analytics acceleration. Existing Machine Learning (ML) prefetchers struggle with phase transitions and irregular memory accesses in graph processing. We propose MPGraph, an ML-based Prefetcher for Graph analytics using domain-specific models. MPGraph introduces three novel optimizations: soft detection for phase transitions, phase-specific multi-modality models for access delta and page predictions, and chain spatio-temporal prefetching (CSTP) for prefetch control. Our transition detector achieves 34.17-82.15% higher precision compared with Kolmogorov-Smirnov Windowing and decision tree. Our predictors achieve 6.80-16.02% higher F1-score for delta and 11.68-15.41% higher accuracy-at-10 for page prediction compared with LSTM and vanilla attention models. Using CSTP, MPGraph achieves 12.52-21.23% IPC improvement, outperforming state-of-the-art non-ML prefetcher BO by 7.58-12.03% and ML-based prefetchers Voyager and TransFetch by 3.27-4.58%. For practical implementation, we demonstrate that MPGraph, using compressed models with reduced latency, shows significantly superior accuracy and coverage compared with BO, leading to 3.58% higher IPC improvement.
Machine Learning, Hardware Architecture
What problem does this paper attempt to address?
The main problem this paper addresses is **the memory performance bottleneck in graph analytics**, in particular the inefficient memory utilization caused by irregular memory access patterns and phase transitions. Existing machine-learning (ML) prefetchers handle these challenges poorly in graph processing, so a new method is needed to improve the memory performance of graph analytics.

### Specific problem description:

1. **Complexity of memory access patterns**: Graph analytics applications exhibit different memory access patterns in different phases, which makes it difficult to train a single general-purpose ML model.
2. **Randomness and irregularity of parallel execution**: Parallel execution on multi-core systems introduces randomness and irregularity, reducing the effectiveness of prefetchers that rely on temporal locality.
3. **Widespread jumps in cross-page access**: Connected nodes are stored across multiple pages, which leads to frequent page jumps and makes prefetchers that rely on spatial locality less effective.

### Solution:

To solve these problems, the authors propose **MPGraph**, a machine-learning-based prefetcher designed specifically for graph analytics. MPGraph addresses the challenges through three optimizations:

1. **Phase optimization**: Introduce a soft-detection mechanism to identify phase transitions, and train a dedicated multi-modal model for each phase to improve prediction accuracy.
2. **Modality optimization**: Propose a network based on multi-modal attention fusion (AMMA) that combines the address input with the program counter (PC) input and uses the attention mechanism for memory access prediction.
3. **Locality optimization**: Propose a chain spatio-temporal prefetching (CSTP) strategy that combines temporal and spatial locality by predicting future memory pages and multiple access deltas within each page.
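The chained idea behind CSTP can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 4 KiB page size, the 64-byte block size, the function names, the `chain_length` parameter, and the choice to carry the current block offset over into each predicted page are all assumptions made for illustration. The two predictors are passed in as plain callables standing in for the delta model (spatial) and the page model (temporal).

```python
from typing import Callable, List, Tuple

PAGE_BITS = 12    # assumption: 4 KiB pages
BLOCK_BITS = 6    # assumption: 64-byte cache blocks
BLOCKS_PER_PAGE = 1 << (PAGE_BITS - BLOCK_BITS)  # 64 blocks per page


def split_address(addr: int) -> Tuple[int, int]:
    """Split a byte address into (page number, block offset within the page)."""
    return addr >> PAGE_BITS, (addr >> BLOCK_BITS) & (BLOCKS_PER_PAGE - 1)


def chain_prefetch(
    addr: int,
    predict_deltas: Callable[[int, int], List[int]],  # hypothetical spatial model
    predict_next_page: Callable[[int], int],          # hypothetical temporal model
    chain_length: int = 2,                            # hypothetical chain depth
) -> List[int]:
    """Return block-aligned prefetch addresses for one demand access."""
    page, offset = split_address(addr)
    prefetches: List[int] = []
    for _ in range(chain_length):
        # Spatial step: apply each predicted delta inside the current page.
        for delta in predict_deltas(page, offset):
            target = offset + delta
            if delta != 0 and 0 <= target < BLOCKS_PER_PAGE:
                candidate = (page << PAGE_BITS) | (target << BLOCK_BITS)
                if candidate not in prefetches:
                    prefetches.append(candidate)
        # Temporal step: follow the chain to the predicted next page.
        # Reusing the same offset there is an illustrative simplification.
        page = predict_next_page(page)
    return prefetches
```

With stub predictors such as `lambda page, offset: [1, 2]` and `lambda page: page + 5`, a demand access to block 3 of page 1 yields two prefetches in page 1 followed by two in page 6. Deltas that would cross a page boundary are dropped, which keeps the spatial step confined to a page and leaves cross-page movement to the temporal step.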
### Main contributions:

- **Domain-specific ML model development method**: Analyzes features of the architecture and the computational context and applies them to develop prefetchers for graph analytics.
- **MPGraph**: The first ML prefetcher designed specifically for graph analytics, optimizing for phases, modalities, and locality.
- **High-precision phase-transition detector**: Uses a soft-detection scheme that improves detection precision by 34.17% to 82.15% compared with Kolmogorov-Smirnov Windowing and decision tree.
- **AMMA network**: A multi-modal attention network whose F1-score for delta prediction is 6.80% to 16.02% higher, and whose accuracy-at-10 for page prediction is 11.68% to 15.41% higher, than LSTM and vanilla attention models.
- **CSTP prefetching strategy**: Combines spatial delta prediction with temporal page prediction, so that MPGraph outperforms both the non-ML prefetcher BO and the ML-based prefetchers Voyager and TransFetch in IPC improvement.
- **Practical optimization techniques**: Reduce storage and inference latency through model compression; even the most compressed model significantly outperforms the best non-ML prefetcher.

### Summary:

This paper targets the memory performance bottleneck in graph analytics by developing an ML prefetcher designed specifically for this domain. By introducing phase-specific multi-modal models and a chain spatio-temporal prefetching strategy, MPGraph significantly improves memory access prediction and prefetching performance for graph analytics applications.
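The F1-score and accuracy-at-10 figures quoted above are not defined in this summary. As a rough guide only, the sketch below uses common textbook forms: a top-k hit rate for page prediction and a micro-averaged F1 over predicted delta sets. Both definitions, and the function names, are assumptions; the paper may compute these metrics differently.

```python
from typing import Sequence, Set


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[int]],
    targets: Sequence[int],
    k: int = 10,
) -> float:
    """Fraction of samples whose target page is among the top-k ranked predictions."""
    if not targets:
        return 0.0
    hits = sum(
        1 for ranked, target in zip(ranked_predictions, targets)
        if target in ranked[:k]
    )
    return hits / len(targets)


def micro_f1(predicted: Sequence[Set[int]], actual: Sequence[Set[int]]) -> float:
    """Micro-averaged F1 over per-sample sets of predicted and actual deltas."""
    true_pos = sum(len(p & a) for p, a in zip(predicted, actual))
    false_pos = sum(len(p - a) for p, a in zip(predicted, actual))
    false_neg = sum(len(a - p) for p, a in zip(predicted, actual))
    denominator = 2 * true_pos + false_pos + false_neg
    return 2 * true_pos / denominator if denominator else 0.0
```

Treating delta prediction as a multi-label problem (a set of deltas per access) is what makes F1 a natural fit here, while page prediction selects from a large vocabulary of pages, where a top-k criterion is more forgiving than exact-match accuracy.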