Abstract: This work addresses a topical problem at the intersection of communications theory, digital electronics, and numerical analysis: the execution time of image processing methods on the different computing-device architectures used for software and hardware acceleration. The subject of the article is reconfigurable FPGA processing systems in the image processing domain. The goal is to create a reconfigurable FPGA-based image processing system and compare it with existing processing architectures. Tasks: to prepare a practical experiment and a theoretical study of the proposed architecture; to investigate the process of creating a Zynq SoC-based image processing system; and to develop and benchmark the execution speed of a given set of algorithms over a specific range of image resolutions. Methods used: FPGA simulation, C++ parallel programming with OpenMP, NVIDIA CUDA, and performance analysis tools. The result is a resilient Zynq-7000 SoC-based computing system with programmable logic (PL) that loads images into FPGA RAM using the resources of the ARM core for further processing and output via an HDMI video interface, and that allows the PL configuration to be changed at any time during processing. Conclusions: the efficiency of the FPGA approach was compared with parallel implementations of the image processing methods using OpenMP and CUDA. An overview of the Zynq platform, with details specific to media processing, is presented. Analysis of the algorithm speed tests based on various outputs showed that hardware acceleration of image processing is more than 60 times faster than its software analogs. The results may be used in developing embedded SoC-based solutions that require acceleration of big data processing, and in selecting a suitable embedded platform for a given image processing task where high data throughput is one of the most important requirements.
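To make the software baseline concrete, the following is a minimal sketch of how an OpenMP-parallelized image filter might be timed at a given resolution. It is an illustration only, not the authors' implementation: the abstract does not name the algorithms tested, so the 3x3 box blur, the function names `box_blur3x3` and `time_blur_ms`, and the flat gray test image are all assumptions.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical example filter (assumed, not taken from the paper):
// 3x3 box blur over an 8-bit grayscale image stored row-major.
// Border pixels average only the neighbors that lie inside the image.
std::vector<std::uint8_t> box_blur3x3(const std::vector<std::uint8_t>& src,
                                      int width, int height) {
    assert(width > 0 && height > 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);
    std::vector<std::uint8_t> dst(src.size());

    // Rows are independent, so they can be split across threads.
    // Without -fopenmp the pragma is ignored and the loop runs serially.
    #pragma omp parallel for
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int sum = 0;
            int count = 0;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int yy = y + dy;
                    const int xx = x + dx;
                    if (yy >= 0 && yy < height && xx >= 0 && xx < width) {
                        sum += src[static_cast<std::size_t>(yy) * width + xx];
                        ++count;
                    }
                }
            }
            dst[static_cast<std::size_t>(y) * width + x] =
                static_cast<std::uint8_t>(sum / count);
        }
    }
    return dst;
}

// Hypothetical timing helper: blurs a flat gray image of the given
// resolution once and returns the elapsed wall-clock time in milliseconds.
double time_blur_ms(int width, int height) {
    const std::vector<std::uint8_t> img(
        static_cast<std::size_t>(width) * height, 128);
    const auto t0 = std::chrono::steady_clock::now();
    const std::vector<std::uint8_t> out = box_blur3x3(img, width, height);
    const auto t1 = std::chrono::steady_clock::now();
    volatile std::uint8_t sink = out[0];  // keep the result observable
    (void)sink;
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Calling `time_blur_ms` for each resolution of interest (for example 640x480, 1280x720, 1920x1080), once built with `-fopenmp` and once without, gives a rough serial versus multi-threaded CPU comparison. A meaningful benchmark would also repeat each measurement several times and discard warm-up runs.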