Assessing opportunities of SYCL for biological sequence alignment on GPU-based systems

Manuel Costanzo, Enzo Rucci, Carlos García-Sanchez, Marcelo Naiouf, Manuel Prieto-Matías
DOI: https://doi.org/10.1007/s11227-024-05907-2
IF: 3.3
2024-02-19
The Journal of Supercomputing
Abstract: Bioinformatics and computational biology are two fields that have been exploiting GPUs for more than two decades, with CUDA being the most used programming language for them. However, as CUDA is an NVIDIA proprietary language, it implies a strong portability restriction to a wide range of heterogeneous architectures, like AMD or Intel GPUs. To face this issue, the Khronos Group has recently proposed the SYCL standard, which is an open, royalty-free, cross-platform abstraction layer that enables the programming of a heterogeneous system to be written using standard, single-source C++ code. Over the past few years, several implementations of this SYCL standard have emerged, oneAPI being the one from Intel. This paper presents the migration process of the SW# suite, a biological sequence alignment tool developed in CUDA, to SYCL using Intel's oneAPI ecosystem. The experimental results show that SW# was completely migrated with a small programmer intervention in terms of hand-coding. In addition, it was possible to port the migrated code between different architectures (considering multiple vendor GPUs and also CPUs), with no noticeable performance degradation on five different NVIDIA GPUs. Moreover, performance remained stable when switching to another SYCL implementation. As a consequence, SYCL and its implementations can offer attractive opportunities for the bioinformatics community, especially considering the vast existence of CUDA-based legacy codes.
computer science, theory & methods,engineering, electrical & electronic, hardware & architecture
What problem does this paper attempt to address?
The main problems that this paper attempts to solve are the portability and performance issues of CUDA code on different hardware architectures. Specifically, the paper explores the SYCL standard and its implementations (especially Intel's oneAPI) by migrating the bio-sequence alignment tool SW# from CUDA to SYCL. The paper focuses on how to convert CUDA code to SYCL code automatically or semi-automatically with the SYCLomatic tool, and evaluates the performance and portability of the converted code on different hardware platforms.

### Paper Background

1. **Bio-sequence Alignment**: This is a fundamental operation in bioinformatics and computational biology, aiming to identify the structural, functional, and evolutionary relationships between sequences by aligning them.
2. **SW# Suite**: SW# is a software package for bio-sequence alignment, which supports the alignment of protein and DNA sequences and can perform pairwise alignment and database similarity searches.
3. **Hardware Accelerators**: To improve performance and energy efficiency, modern high-performance computing systems widely use hardware accelerators, such as GPUs. CUDA is NVIDIA's proprietary programming language, widely used in GPU programming, but its proprietary nature limits the portability of the code to other hardware.
4. **SYCL and oneAPI**: SYCL is a cross-platform programming model based on C++, aiming to solve the portability problem of CUDA. Intel's oneAPI provides a set of tools and libraries to support SYCL programming, including the SYCLomatic tool for converting CUDA code to SYCL code.

### Research Objectives

1. **Migration Process**: Migrate SW# from CUDA to SYCL and evaluate the efficiency of the SYCLomatic tool.
2. **Performance Evaluation**: Test the performance of the migrated code on different hardware platforms (including NVIDIA, AMD, and Intel GPUs, as well as CPUs).
3. **Portability Analysis**: Verify the portability of the migrated code on different hardware platforms, ensuring that the code runs correctly and with stable performance on different architectures.

### Main Contributions

1. **Complete Migration**: Achieve the complete migration of SW#, not limited to specific functional modules.
2. **Tool Evaluation**: Analyze in detail the performance of the SYCLomatic tool in the migration process, including the parts that need to be manually modified.
3. **Performance and Portability**: Conduct extensive tests on multiple hardware platforms to verify the performance and portability of the SYCL code.

### Conclusion

The paper shows that SYCL and oneAPI can provide attractive opportunities for the bioinformatics community, especially for the large amount of existing CUDA legacy code. With the SYCLomatic tool, CUDA code can be effectively migrated to SYCL while maintaining good performance and portability on different hardware platforms. Although some manual adjustments are required during the migration process, the SYCLomatic tool is, overall, relatively efficient and can significantly reduce development and maintenance costs.
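
### Illustrative Sketch: Local Alignment Scoring

To make the bio-sequence alignment operation more concrete, the following is a minimal CPU-only sketch in standard C++ of the Smith-Waterman local alignment recurrence, which is the algorithm the SW# name refers to. It is not taken from SW# or from the paper and contains no CUDA or SYCL code. The function name `sw_score` and the scoring values (match = 2, mismatch = -1, linear gap = -1) are assumptions chosen purely for illustration; real tools typically use substitution matrices and affine gap penalties.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of SW#): returns the best local-alignment
// score between sequences a and b using a linear gap penalty.
// The default scoring values are illustrative assumptions.
int sw_score(const std::string& a, const std::string& b,
             int match = 2, int mismatch = -1, int gap = -1) {
    // Only two rows of the dynamic-programming matrix are kept in memory.
    std::vector<int> prev(b.size() + 1, 0);
    std::vector<int> curr(b.size() + 1, 0);
    int best = 0;

    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = 0;  // the first column is always zero for local alignment
        for (std::size_t j = 1; j <= b.size(); ++j) {
            // Align a[i-1] with b[j-1] (match or mismatch).
            int diag = prev[j - 1] + (a[i - 1] == b[j - 1] ? match : mismatch);
            // Gap in b (move down) or gap in a (move right).
            int up = prev[j] + gap;
            int left = curr[j - 1] + gap;
            // Clamping at zero lets a local alignment restart anywhere.
            curr[j] = std::max({0, diag, up, left});
            best = std::max(best, curr[j]);
        }
        std::swap(prev, curr);
    }
    return best;
}
```

Under these assumed scores, `sw_score("ACGT", "ACGT")` yields 8 (four matches), and `sw_score("ACGT", "ACT")` yields 5 (three matches and one gap). Each cell depends only on its upper, left, and upper-left neighbours, so cells on the same anti-diagonal are independent of each other, which is the property that makes this kind of computation a natural fit for GPUs.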