Abstract: We present a unified programming model for heterogeneous computing systems. Such systems integrate multiple computing accelerators and memory units to deliver higher performance than CPU-centric systems. Although heterogeneous systems have been adopted by modern workloads such as machine learning, programming remains a critical limiting factor. Conventional heterogeneous programming techniques either impose heavy modifications to the code base or require rewriting the program in a different language. Such programming complexity stems from the lack of a unified abstraction layer for computing and data exchange, which forces each programming model to define its own abstractions. However, with emerging cache-coherent interconnects such as Compute Express Link (CXL), we see an opportunity to standardize such architecture heterogeneity and provide a unified programming model. We present CodeFlow, a language runtime system for heterogeneous computing. CodeFlow abstracts architecture computation in the programming language runtime and utilizes CXL as a unified data exchange protocol. Workloads written in high-level languages such as C++ and Rust can be compiled to CodeFlow, which schedules different parts of the workload to suitable accelerators without requiring the developer to implement code or call APIs for specific accelerators. CodeFlow reduces programmers' effort in utilizing heterogeneous systems and improves workload performance.

What problem does this paper attempt to address?

The paper addresses the high programming complexity and poor scalability of heterogeneous computing systems. These systems integrate multiple computing accelerators and memory units and can outperform traditional CPU-centric systems, but their programming models are limiting. Conventional heterogeneous programming techniques usually require heavy modifications to the code base or a rewrite of the program in a different language. The root cause is the lack of a unified abstraction layer for computing and data exchange: each programming model has to define its own abstractions, which increases programming complexity.

To solve these problems, the paper proposes a unified programming model named CodeFlow. CodeFlow simplifies the programming of heterogeneous systems in the following ways:

1. **Unified programming model**: CodeFlow uses the WebAssembly System Interface (WASI) to build a language runtime system that sits between high-level languages and low-level heterogeneous architectures. Programs can be written in a single language and compiled into a single runtime representation, without explicitly implementing code for different architectures or using multiple toolchains.
2. **Coherent memory sharing through CXL**: At the low level, CodeFlow uses Compute Express Link (CXL) to provide coherent memory sharing between heterogeneous accelerators, which reduces the explicit library calls otherwise required for cross-device data movement.
3. **Multithreading programming model**: CodeFlow allows heterogeneous code to be written as ordinary multithreaded code, relying on the language's native support for standard multithreading mechanisms such as synchronization and memory sharing. The runtime system schedules different threads onto different accelerators, handles memory sharing through CXL, and just-in-time compiles the code for each accelerator architecture.
4. **Compatibility with existing multithreaded programs**: Traditional multithreaded programs originally designed for multi-CPU systems can take advantage of heterogeneous systems by being recompiled with the CodeFlow toolchain.

Through these methods, CodeFlow simplifies heterogeneous programming and lets developers exploit the performance of heterogeneous systems more easily.
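The multithreading model described above implies that the source program contains no accelerator-specific calls. As a rough illustration (not code from the paper), the following plain Rust program uses only standard-library threads, a shared read-only buffer, and a mutex. The function `parallel_sum` and its parameters are hypothetical names chosen for this sketch. Whether and how CodeFlow would place each thread on an accelerator is decided by its runtime and is not shown here.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Sums `data` by splitting it into `workers` chunks, one thread per chunk.
/// Hypothetical example: nothing here is a CodeFlow API, only std threading.
fn parallel_sum(data: Arc<Vec<u64>>, workers: usize) -> u64 {
    let workers = workers.max(1);
    // Shared accumulator protected by a mutex (ordinary synchronization).
    let total = Arc::new(Mutex::new(0u64));
    // Ceiling division so every element falls into some chunk.
    let chunk = (data.len() + workers - 1) / workers;
    let mut handles = Vec::with_capacity(workers);

    for w in 0..workers {
        let data = Arc::clone(&data);
        let total = Arc::clone(&total);
        handles.push(thread::spawn(move || {
            // Clamp the range so trailing workers may get an empty slice.
            let start = (w * chunk).min(data.len());
            let end = (start + chunk).min(data.len());
            let partial: u64 = data[start..end].iter().sum();
            *total.lock().unwrap() += partial;
        }));
    }

    // Wait for all workers before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }

    let result = *total.lock().unwrap();
    result
}

fn main() {
    let data = Arc::new((1..=1000u64).collect::<Vec<u64>>());
    let sum = parallel_sum(Arc::clone(&data), 4);
    println!("sum = {}", sum); // prints "sum = 500500"
}
```

Under the model summarized above, a program of this shape would be recompiled with the CodeFlow toolchain rather than rewritten. The threads and the shared buffer are the only units the programmer expresses.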

Fork is All You Need in Heterogeneous Systems

Concurrent CPU-GPU Task Programming using Modern C++

HeteroFlow: An Accelerator Programming Model with Decoupled Data Placement for Software-Defined FPGAs

A Unified Programming Model for Heterogeneous Computing with CPU and Accelerator Technologies

HeteroPP: A directive‐based heterogeneous cooperative parallel programming framework

Taskflow: A Lightweight Parallel and Heterogeneous Task Graph Computing System

A Task Parallel Programming Framework Based on Heterogeneous Computing Platforms

Parallel Model Research on the Heterogeneous Computer System

Programming Framework for Node Heterogeneous GPU Cluster

High-performance computing: Transitioning from Instruction-Level Parallelism to heterogeneous hybrid architectures

A Study of Heterogeneous Computing Design Method based on Virtualization Technology

Taskflow: A General-Purpose Parallel and Heterogeneous Task Programming System

Simplified High Level Parallelism Expression on Heterogeneous Systems Through Data Partition Pattern Description.

Performance on HPC Platforms Is Possible Without C++

HPC Alongside User-space Kubernetes

Optimization of Lattice Boltzmann Simulations on Heterogeneous Computers

Modeling For Heterogeneous Bulk Synchronous Parallel Computing

Heterogeneous Computing on Mobile GPU-FPGA Cooperation Platform

Unified Programming Models for Heterogeneous High-Performance Computers.

OpenH: A Novel Programming Model and API for Developing Portable Parallel Programs on Heterogeneous Hybrid Servers

NoT: a High-Level No-Threading Parallel Programming Method for Heterogeneous Systems