Assessing Intel OneAPI capabilities and cloud-performance for heterogeneous computing

Silvia R. Alcaraz, Ruben Laso, Oscar G. Lorenzo, David L. Vilariño, Tomás F. Pena, Francisco F. Rivera
DOI: https://doi.org/10.1007/s11227-024-05958-5
IF: 3.3
2024-03-05
The Journal of Supercomputing
Abstract: This work presents a performance-oriented study of a heterogeneous application developed with Intel OneAPI to solve two well-known diffusion problems: heat diffusion and image denoising. We have explored CPU+iGPU and CPU+FPGA schemes, applying dynamic load balancing and conducting experiments on Intel DevCloud. The results demonstrate that the CPU+iGPU scheme outperforms the execution times achieved by the fastest device when the problem is sufficiently computationally demanding. We also found that the performance of the CPU+FPGA scheme is heavily affected by bandwidth limitations and specific strategies to manage memory efficiently are required. Moreover, it was demonstrated that dynamic workload balancing is crucial due to possible performance fluctuations in any of the implicated devices. In conclusion, Intel OneAPI provides a helpful tool for multi-platform development using a unique high-level language, DPC++. However, developing specific code for each platform is necessary to achieve optimal performance.
computer science, theory & methods; engineering, electrical & electronic; hardware & architecture
What problem does this paper attempt to address?
The paper evaluates the performance of heterogeneous computing with Intel OneAPI in a cloud environment. Specifically, the authors developed a heterogeneous application to address two well-known diffusion problems: heat diffusion and image denoising. They explored two schemes, CPU+iGPU (integrated graphics processor) and CPU+FPGA (field-programmable gate array), and conducted experiments on Intel DevCloud. The main research contents include:

1. **Performance Evaluation of Heterogeneous Computing Schemes**: The paper reports performance test results for the two heterogeneous computing schemes, CPU+iGPU and CPU+FPGA. Dynamic load balancing strategies allocate tasks among the devices with the aim of achieving optimal performance.
2. **Evaluation of Intel OneAPI Tools**: The researchers developed the application with Intel OneAPI tools, which allow multi-platform code to be written in a single high-level language (DPC++). They evaluated the tools' ease of use, performance portability, and throughput in practical applications.
3. **Case Studies**: The paper uses two diffusion problems as case studies, namely heat diffusion and image denoising. Solving these problems allowed the researchers to analyze performance characteristics in depth under different hardware configurations.
4. **Application of Dynamic Load Balancing Strategies**: To optimize overall performance, the study adopted dynamic load balancing strategies. These strategies automatically adjust task allocation among the devices at run time, so that each device's strengths are better used.
5. **Experimental Environment Setup**: All experiments were conducted on Intel DevCloud, a cloud computing environment provided by Intel, which includes various models of CPU, GPU, and FPGA devices.

The main contribution of the paper lies in the comprehensive evaluation of Intel OneAPI tools and the comparison of the performance of the two heterogeneous computing schemes, CPU+iGPU and CPU+FPGA, in solving specific problems. Additionally, the study emphasizes the importance of dynamic load balancing in improving the overall performance of heterogeneous systems.
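
To make the idea of dynamic load balancing on a diffusion problem more concrete, the following is a minimal sketch and not the authors' DPC++ implementation. It assumes an explicit 5-point update for heat diffusion and uses two Python threads as stand-ins for a fast and a slow device; every name (`make_grid`, `diffuse_rows`, `balanced_step`), the chunk size, the coefficient `alpha`, and the artificial delays are hypothetical choices made for illustration only. Each "device" pulls the next block of rows from a shared queue as soon as it finishes the previous one, so a device that slows down simply ends up processing fewer rows.

```python
import queue
import threading
import time


def make_grid(rows, cols, hot=100.0):
    """Build a cold grid with a single hot cell in the middle (illustrative input)."""
    grid = [[0.0] * cols for _ in range(rows)]
    grid[rows // 2][cols // 2] = hot
    return grid


def diffuse_rows(src, dst, row_start, row_end, alpha):
    """Apply one explicit 5-point diffusion update to interior rows [row_start, row_end)."""
    cols = len(src[0])
    for i in range(row_start, row_end):
        for j in range(1, cols - 1):
            neighbours = src[i - 1][j] + src[i + 1][j] + src[i][j - 1] + src[i][j + 1]
            dst[i][j] = src[i][j] + alpha * (neighbours - 4.0 * src[i][j])


def balanced_step(src, alpha=0.1, chunk_rows=4, delays=(0.0, 0.002)):
    """Run one time step, letting each simulated device pull row chunks on demand.

    `delays` holds one artificial per-chunk delay per simulated device; it is an
    assumption used only to imitate devices of different speed.
    Returns the updated grid and the number of rows each device processed.
    """
    rows = len(src)
    dst = [row[:] for row in src]  # boundary cells keep their old values

    # Shared work queue: one entry per block of interior rows.
    work = queue.Queue()
    for start in range(1, rows - 1, chunk_rows):
        work.put((start, min(start + chunk_rows, rows - 1)))

    rows_done = [0] * len(delays)

    def worker(device_id, delay):
        while True:
            try:
                start, end = work.get_nowait()
            except queue.Empty:
                return  # no work left for this device
            time.sleep(delay)  # stand-in for a slower device
            diffuse_rows(src, dst, start, end, alpha)  # chunks touch disjoint rows
            rows_done[device_id] += end - start

    threads = [
        threading.Thread(target=worker, args=(device_id, delay))
        for device_id, delay in enumerate(delays)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return dst, rows_done
```

Calling `balanced_step(make_grid(34, 16))` returns the grid after one time step together with a per-device row count. Because the split is decided at run time rather than fixed in advance, the row counts adapt if one worker's delay changes between steps, which is the property the summary attributes to dynamic load balancing. A real OneAPI version would submit each chunk to a device queue instead of a thread, and would also have to account for data transfers, which this sketch ignores.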