Abstract: With the increasing demand for computing capability under limited resource and power budgets, it is crucial to deploy applications to customized accelerators such as FPGAs. However, FPGA programming is non-trivial. Although existing high-level synthesis (HLS) tools improve productivity to a certain extent, they are limited in scope and in their capability to support FPGA-oriented optimizations. This paper focuses on FPGA-based accelerators and proposes POM, an optimizing framework built on multi-level intermediate representation (MLIR). POM has several features that demonstrate its scope and capability for performance optimization. First, most HLS tools depend exclusively on a single-level IR to perform all optimizations, which introduces excessive information into the IR and makes debugging an arduous task. In contrast, POM introduces three layers of IR so that each operation is performed at a suitable abstraction level, which streamlines implementation and debugging and makes the framework more flexible, extensible, and systematic. Second, POM integrates the polyhedral model into MLIR, enabling advanced dependence analysis and various FPGA-oriented loop transformations. Because nested loops are represented with integer sets and maps, loop transformations can be carried out conveniently by manipulating polyhedral semantics. Finally, to further reduce design effort, POM has a user-friendly programming interface (DSL) that allows a concise description of computation and includes a rich collection of scheduling primitives. An automatic design space exploration (DSE) engine is provided to search efficiently for high-performance optimization schemes and to generate optimized accelerators automatically. Experimental results show that POM achieves a $6.46\times$ average speedup on typical benchmark suites and a $6.06\times$ average speedup on real-world applications compared to the state of the art.
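The polyhedral idea mentioned in the abstract (loop nests as integer sets, transformations as maps) can be illustrated with a minimal Python sketch. This is not POM's implementation or API: the helper names (`iteration_domain`, `interchange`, `skew`, `is_legal`) and the dependence vector `(1, -1)` are hypothetical and chosen only for illustration. The sketch enumerates a small 2-D iteration domain, treats each transformation as an affine map on iteration points, and checks legality by confirming that every dependence source still executes before its target in lexicographic order.

```python
from itertools import product

def iteration_domain(n, m):
    """Integer set {(i, j) : 0 <= i < n, 0 <= j < m} of a 2-D loop nest."""
    return [(i, j) for i, j in product(range(n), range(m))]

def interchange(point):
    """Affine map (i, j) -> (j, i), i.e. loop interchange."""
    i, j = point
    return (j, i)

def skew(point):
    """Affine map (i, j) -> (i, i + j), i.e. loop skewing."""
    i, j = point
    return (i, i + j)

def is_legal(schedule, dependences, domain):
    """Return True if every dependence source is scheduled strictly
    before its target (lexicographic comparison of new time stamps)."""
    points = set(domain)
    for di, dj in dependences:
        for i, j in domain:
            target = (i + di, j + dj)
            if target in points and not schedule((i, j)) < schedule(target):
                return False
    return True

domain = iteration_domain(4, 4)
# Hypothetical dependence: iteration (i, j) must run before (i + 1, j - 1).
deps = [(1, -1)]

identity_ok = is_legal(lambda p: p, deps, domain)
interchange_ok = is_legal(interchange, deps, domain)
skew_ok = is_legal(skew, deps, domain)
skew_then_interchange_ok = is_legal(lambda p: interchange(skew(p)), deps, domain)

print(identity_ok, interchange_ok, skew_ok, skew_then_interchange_ok)
```

Under this assumed dependence, plain interchange reverses the order of some dependent iterations and is rejected, while skewing first makes the subsequent interchange legal. A real polyhedral framework would reason symbolically over the sets and maps rather than enumerating points as done here.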

MOCHA: Multinode Cost Optimization in Heterogeneous Clouds with Accelerators

A cost-effective approach to improving performance of big genomic data analyses in clouds

Seeing Shapes in Clouds: On the Performance-Cost trade-off for Heterogeneous Infrastructure-as-a-Service

PECCO: A Profit and Cost-oriented Computation Offloading Scheme in Edge-Cloud Environment with Improved Moth-flame Optimisation

Optimizing Offload Performance in Heterogeneous MPSoCs

Enabling FPGAs in the cloud.

HCOME: Research on Hybrid Computation Offloading Strategy for MEC Based on DDPG

Acceleration-as-a-μService: A Cloud-native Monte-Carlo Option Pricing Engine on CPUs, GPUs and Disaggregated FPGAs

CHEF: A Framework for Deploying Heterogeneous Models on Clusters With Heterogeneous FPGAs

MAMoC: Multisite Adaptive Offloading Framework for Mobile Cloud Applications

HEXA-MoE: Efficient and Heterogeneous-aware MoE Acceleration with ZERO Computation Redundancy

Online scheduling for FPGA computation in the Cloud

Multi-Objective Hardware-Mapping Co-Optimisation for Multi-DNN Workloads on Chiplet-based Accelerators

HMC-FHE: A Heterogeneous Near Data Processing Framework for Homomorphic Encryption

A Study of FPGA Virtualization and Accelerator Scheduling

Task Offloading for Scientific Workflow Application in Mobile Cloud

TAPA-CS: Enabling Scalable Accelerator Design on Distributed HBM-FPGAs

Heterogeneous Cloud Framework for Big Data Genome Sequencing.

Optimization of Computation-Intensive Applications in cc-NUMA Architecture

An Optimizing Framework on MLIR for Efficient FPGA-based Accelerator Generation

ENGINE: Cost Effective Offloading in Mobile Edge Computing with Fog-Cloud Cooperation