Abstract: Traditional image signal processors (ISPs) are primarily designed and optimized to improve the image quality perceived by humans. However, optimal perceptual image quality does not always translate into optimal performance for computer vision applications. We propose a set of methods, which we collectively call VisionISP, to repurpose the ISP for machine consumption. VisionISP significantly reduces data transmission needs by reducing the bit-depth and resolution while preserving the relevant information. The blocks in VisionISP are simple, content-aware, and trainable. Experimental results show that VisionISP boosts the performance of a subsequent computer vision system trained to detect objects in an autonomous driving setting. The results demonstrate the potential and the practicality of VisionISP for computer vision applications.
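To make the data-transmission claim concrete, the arithmetic can be sketched as below. The frame sizes, channel count, and bit-depths are hypothetical values chosen for illustration and are not taken from the paper; `frame_bits` is a helper name introduced here.

```python
def frame_bits(width: int, height: int, channels: int, bit_depth: int) -> int:
    """Return the number of bits in one uncompressed frame."""
    return width * height * channels * bit_depth


# Hypothetical configuration (assumption, not from the paper):
# a 16-bit full-resolution frame versus an 8-bit frame at half the
# width and half the height.
full = frame_bits(1920, 1080, channels=3, bit_depth=16)
reduced = frame_bits(960, 540, channels=3, bit_depth=8)

# Halving each spatial dimension cuts the pixel count by 4x, and halving
# the bit-depth cuts the bits per sample by 2x.
reduction = full / reduced
print(f"full: {full} bits, reduced: {reduced} bits, ratio: {reduction:.1f}x")
```

Under these assumed numbers the resolution and bit-depth reductions multiply, which is why the two are worth applying together when bandwidth is the constraint.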

What problem does this paper attempt to address?

The paper addresses the shortcomings of traditional image signal processors (ISPs) when their output is consumed by computer vision systems rather than by people. Specifically:

1. **Limitations of the Optimization Goals of Traditional ISPs**
   - Traditional ISPs are optimized mainly for image quality as perceived by humans, producing images suited to human viewing.
   - This optimization does not always improve the performance of computer vision systems, and it sometimes degrades it.
2. **Challenges of Data Transmission Requirements and Computational Resources**
   - Computer vision applications often process images with high resolution and large bit-depth, which demands substantial transmission bandwidth and computational resources.
   - Under low-power and low-latency constraints, transmitting and processing this data efficiently is an important problem.
3. **Requirements for Content Awareness and Feature Preservation**
   - Traditional down-scaling methods (such as bilinear interpolation) are content-independent and prone to losing small details (such as pedestrians) that are crucial for computer vision tasks like object detection in autonomous driving.

To solve these problems, the paper proposes the VisionISP framework, which redesigns the ISP to meet the requirements of computer vision applications. VisionISP modifies the traditional ISP in the following ways:

- **Vision-Driven Denoising (Vision Denoiser)**: Tunes the ISP's denoising parameters to optimize computer vision task performance rather than perceived image quality.
- **Vision Local Tone Mapping (VLTM)**: Applies a non-linear transformation and detail enhancement to retain the key information in the image while reducing the bit-depth.
- **Trainable Vision Scaler (TVS)**: Uses a neural network framework to down-scale the image while preserving important low-level features and supporting flexible scaling ratios.

Through these changes, VisionISP reduces data transmission requirements and improves the performance of the subsequent computer vision system, as demonstrated on object detection in an autonomous driving setting.
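The idea of reducing bit-depth through a non-linear transformation can be illustrated with a minimal sketch. It uses a single global logarithmic curve, which is an assumption made here for simplicity. It is not the paper's VLTM, which is local and includes detail enhancement. The function names and sample values are hypothetical.

```python
import math


def linear_quantize(value: int, in_bits: int = 16, out_bits: int = 8) -> int:
    """Reduce bit-depth by dropping the least significant bits."""
    return value >> (in_bits - out_bits)


def log_quantize(value: int, in_bits: int = 16, out_bits: int = 8) -> int:
    """Reduce bit-depth with a global logarithmic curve (illustrative only)."""
    in_max = (1 << in_bits) - 1
    out_max = (1 << out_bits) - 1
    # log1p keeps value 0 mapped to 0 and spends more output codes on dark tones.
    return round(math.log1p(value) / math.log1p(in_max) * out_max)


# Two hypothetical dark 16-bit samples that differ in the original signal.
dark_a, dark_b = 40, 200

print("linear:", linear_quantize(dark_a), linear_quantize(dark_b))
print("log:   ", log_quantize(dark_a), log_quantize(dark_b))
```

With plain truncation both dark samples collapse to the same 8-bit code, so the difference between them is lost. The logarithmic curve keeps them distinct, at the cost of coarser steps in the highlights. A detector working on shadowed regions would see that distinction only in the second case.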

VisionISP: Repurposing the Image Signal Processor for Computer Vision Applications
