Abstract: For heterogeneous parallel embedded systems, we study time and power optimization from several angles. First, we address high-level architecture synthesis for real-time Digital Signal Processing (DSP) using heterogeneous functional units (FUs). As more types of FUs become available, the same type of operation can be processed by heterogeneous FUs at different costs, where the cost may relate to power, reliability, etc. Furthermore, some tasks do not have fixed execution times. Such tasks usually contain conditional instructions and/or operations whose execution time varies with the input. Therefore, for such special-purpose architecture synthesis, an important problem is how to assign a proper FU type to each operation of a DSP application and generate a schedule that minimizes the total cost while satisfying timing constraints with guaranteed confidence probabilities. We propose several efficient algorithms to solve this problem. Experiments show that our algorithms effectively reduce the total cost compared with previous work.

Low power is becoming a critical design issue and performance metric in embedded system design. A DSP processor has multiple FUs and can process several instructions simultaneously. While this multiple-FU architecture can be exploited to increase instruction-level parallelism and improve time performance, it also increases power consumption. Several techniques have been proposed to address this problem. We combine Dynamic Voltage Scaling (DVS) with soft real-time constraints to solve the Voltage Assignment with Probability (VAP) problem, which involves finding a voltage level for each node of a Probabilistic Data Flow Graph (PDFG) in uniprocessor and multiprocessor DSP systems. This work substantially improves on state-of-the-art techniques.

Another application is heterogeneous sensor networks. We apply our efficient algorithms to dynamically adjust the working mode of sensors and achieve significant energy savings. We also design new rotation scheduling algorithms for real-time applications that produce schedules consuming minimal energy. Furthermore, we combine data mining and prefetching to reduce energy consumption. All three techniques significantly reduce energy consumption.

Many high-performance DSP processors employ multi-bank on-chip memory to improve performance and energy consumption. This architectural feature supports higher memory bandwidth by allowing multiple data memory accesses to be executed in parallel. However, making effective use of multi-bank memory remains difficult when performance and energy requirements are considered together. In this project, we study the assignment and scheduling problem that minimizes total energy while satisfying performance requirements. Our approach makes several major contributions. First, we study the combined effects of energy saving and memory performance in a systematic way. Second, we exploit memory type assignment to save memory energy. Third, we improve data locality by using variable partitioning.
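To make the assignment problem from the abstract concrete, the following is a minimal sketch and not the algorithms proposed in this work. It assumes a simple chain of operations executed one after another, independent discrete execution-time distributions, and an exhaustive search over FU types. The names `ops`, `convolve`, and `best_assignment`, as well as all costs and probabilities, are hypothetical and chosen only for illustration.

```python
from itertools import product


def convolve(dist_a, dist_b):
    """Distribution of the sum of two independent discrete execution times."""
    out = {}
    for ta, pa in dist_a.items():
        for tb, pb in dist_b.items():
            out[ta + tb] = out.get(ta + tb, 0.0) + pa * pb
    return out


def best_assignment(ops, deadline, confidence):
    """Exhaustively pick one FU type per operation (illustrative only).

    ops: list of dicts mapping FU type -> (cost, {exec_time: probability}).
    Returns (total_cost, chosen_types, prob_of_meeting_deadline), or None
    if no assignment meets the deadline with the required confidence.
    """
    best = None
    for choice in product(*(op.keys() for op in ops)):
        total_time = {0: 1.0}  # distribution of the chain's completion time
        cost = 0
        for op, fu in zip(ops, choice):
            fu_cost, dist = op[fu]
            cost += fu_cost
            total_time = convolve(total_time, dist)
        prob = sum(p for t, p in total_time.items() if t <= deadline)
        # Small tolerance guards against floating-point rounding.
        if prob >= confidence - 1e-12 and (best is None or cost < best[0]):
            best = (cost, choice, prob)
    return best


# Hypothetical two-operation chain with a fast/expensive and a slow/cheap FU type.
ops = [
    {"fast": (10, {1: 0.9, 2: 0.1}), "slow": (4, {2: 0.7, 4: 0.3})},
    {"fast": (8, {1: 1.0}), "slow": (3, {3: 0.8, 5: 0.2})},
]

strict = best_assignment(ops, deadline=5, confidence=0.9)
relaxed = best_assignment(ops, deadline=5, confidence=0.5)
print(strict[0], strict[1])    # cost and FU types under a 0.9 confidence requirement
print(relaxed[0], relaxed[1])  # cost and FU types under a 0.5 confidence requirement
```

In this toy instance, lowering the required confidence lets the search choose cheaper FU types, which is the cost-versus-confidence trade-off the abstract describes. Exhaustive search grows exponentially with the number of operations, so it serves only to illustrate the problem statement on very small inputs.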

From Circuits to SoC Processors: Arithmetic Approximation Techniques & Embedded Computing Methodologies for DSP Acceleration

Approximate Computing Survey, Part II: Application-Specific & Architectural Approximation Techniques and Applications

Optimally Approximated and Unbiased Floating-Point Multiplier with Runtime Configurability

Approximate Computing Survey, Part I: Terminology and Software & Hardware Approximation Techniques

autoAx: An Automatic Design Space Exploration and Circuit Building Methodology utilizing Libraries of Approximate Components

A Survey of Approximate Computing: From Arithmetic Units Design to High-Level Applications

Approximate Arithmetic Circuits: A Survey, Characterization, and Recent Applications

AxOMaP: Designing FPGA-based Approximate Arithmetic Operators using Mathematical Programming

A Genetic-algorithm-based Approach to the Design of DCT Hardware Accelerators

AxOCS: Scaling FPGA-based Approximate Operators using Configuration Supersampling

Time and power optimization for heterogeneous parallel embedded systems

A Hardware/Software Co-Design Methodology for Adaptive Approximate Computing in Clustering and ANN Learning

A FPGA Friendly Approximate Computing Framework with Hybrid Neural Networks: (Abstract Only).

PAM: A Piecewise-Linearly-Approximated Floating-Point Multiplier with Unbiasedness and Configurability

A Survey on Approximate Multiplier Designs for Energy Efficiency: From Algorithms to Circuits

A Survey on Design Space Exploration Approaches for Approximate Computing Systems

Approximate Logic Synthesis and Its Application in Image Signal Processor

Approximate Computing Based Low-Power FPGA Design for Big Data Analytics in Cloud Environments

APIR-DSP: an Approximate PIR-DSP Architecture for Error-Tolerant Applications

Characterizing Approximate Adders and Multipliers Optimized under Different Design Constraints