Graph Partitioning in the Implementation and Performance Optimization of a Hybrid Memory System

Qi Li, Jiang Zhong, Xue Li
DOI: https://doi.org/10.11897/SP.J.1016.2019.02481
2019-11-01
Abstract: The tremendous scale of modern graph datasets has rapidly increased the demand for efficient graph analysis algorithms. A standard approach distributes the graph over a cluster of compute nodes, but computing on a distributed graph is expensive when large amounts of data have to be moved. Graph partitioning is a precondition for distributed graph computing frameworks and a key factor in their performance. Streaming graph partitioning is more efficient than offline partitioning and has been developed continuously in recent years. Because of limited memory capacity, a single commodity computer has difficulty partitioning and optimizing massive graphs. Existing methods mainly use distributed clusters to partition such large graphs; although distributed computational resources have become more accessible, developing distributed graph partitioning algorithms remains challenging, especially for non-experts. Non-volatile memory (NVM) offers low power consumption, high density, low latency, and byte addressability, which makes it an important means of building high-performance storage systems and improving the performance of big data systems. NVM also has disadvantages: writes consume more power than reads, write latency is higher than read latency, and the number of writes NVM can endure is limited. This read/write asymmetry deserves particular attention when using hybrid NVM and DRAM memory. In this work, we explore partitioning large graphs on a single compute node with hybrid NVM and DRAM memory. We propose a management strategy for dynamically cached data based on an adjacent edge structure (AeFdy). This strategy converts the cached data structure from the adjacent vertex structure used in the original streaming algorithms to an adjacent edge structure.
Experimental results on 7 real-world graphs show that the new method partitions, on average, 4.9 times faster than the original method. At the same time, the strategy evaluates data pages in NVM and DRAM with different models, according to the characteristics of the streaming algorithm based on the adjacent edge structure, and places data pages in the appropriate memory medium to reduce the number of migration operations and improve system performance. For demonstration, the AeFdy strategy is simulated in the Linux kernel. Compared with existing hybrid memory management strategies such as Linux Swap, M-CLOCK, and Dr.Swap, AeFdy improves system performance by 128.5%, 87.4%, and 50.4%, respectively.
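To make the notion of streaming graph partitioning concrete, the following is a minimal sketch of one well-known streaming heuristic, Linear Deterministic Greedy (LDG). It is not the AeFdy strategy and is not taken from the paper; the abstract does not say which streaming algorithms were used. The function names `ldg_partition` and `edge_cut`, the `capacity` parameter, and the toy graph are all assumptions made for illustration. Each vertex arrives once together with its neighbor list and is placed in the partition that already holds the most of its neighbors, damped by how full that partition is.

```python
def ldg_partition(stream, num_parts, capacity):
    """Assign each streamed vertex to a partition (illustrative LDG sketch).

    stream: iterable of (vertex, neighbor_list) pairs, seen once in order.
    capacity: assumed maximum number of vertices per partition.
    """
    assignment = {}
    sizes = [0] * num_parts
    for vertex, neighbors in stream:
        # Count neighbors that have already been placed, per partition.
        placed = [0] * num_parts
        for n in neighbors:
            p = assignment.get(n)
            if p is not None:
                placed[p] += 1

        # Neighbor count weighted by the remaining free fraction.
        def score(p):
            return placed[p] * (1.0 - sizes[p] / capacity)

        # Ties go to the least loaded partition, then the lowest index.
        best = max(range(num_parts), key=lambda p: (score(p), -sizes[p], -p))
        assignment[vertex] = best
        sizes[best] += 1
    return assignment


def edge_cut(edges, assignment):
    """Number of edges whose endpoints land in different partitions."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adjacency = {v: [] for v in range(6)}
for u, v in edges:
    adjacency[u].append(v)
    adjacency[v].append(u)

stream = [(v, adjacency[v]) for v in range(6)]
result = ldg_partition(stream, num_parts=2, capacity=3)
cut = edge_cut(edges, result)
```

On this toy input the heuristic keeps each triangle in its own partition, so only the bridging edge is cut. The per-vertex state it needs (the assignments of earlier vertices) is the kind of cached data whose layout and placement in memory affect partitioning time on a single machine.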
Computer Science