VisionISP: Repurposing the Image Signal Processor for Computer Vision Applications

Chyuan-Tyng Wu, Leo F. Isikdogan, Sushma Rao, Bhavin Nayak, Timo Gerasimow, Aleksandar Sutic, Liron Ain-kedem, Gilad Michael
DOI: https://doi.org/10.1109/ICIP.2019.8803607
2019-11-14
Abstract: Traditional image signal processors (ISPs) are primarily designed and optimized to improve the image quality perceived by humans. However, optimal perceptual image quality does not always translate into optimal performance for computer vision applications. We propose a set of methods, which we collectively call VisionISP, to repurpose the ISP for machine consumption. VisionISP significantly reduces data transmission needs by reducing the bit-depth and resolution while preserving the relevant information. The blocks in VisionISP are simple, content-aware, and trainable. Experimental results show that VisionISP boosts the performance of a subsequent computer vision system trained to detect objects in an autonomous driving setting. The results demonstrate the potential and the practicality of VisionISP for computer vision applications.
Image and Video Processing, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the shortcomings of traditional image signal processors (ISPs) in computer vision applications. Specifically:

1. **Optimization goals of traditional ISPs**
   - Traditional ISPs are optimized mainly for image quality as perceived by humans, producing images suited to human viewing.
   - This optimization does not always improve the performance of computer vision systems and can sometimes degrade it.
2. **Data transmission and computational cost**
   - Computer vision applications often process high-resolution, high-bit-depth image data, which drives up data transmission and compute requirements.
   - Under low-power and low-latency constraints, transmitting and processing this data efficiently becomes an important problem.
3. **Content awareness and feature preservation**
   - Traditional downscaling methods (such as bilinear interpolation) are content-independent and tend to lose small details (such as pedestrians) that are crucial for computer vision tasks such as object detection in autonomous driving.

To address these problems, the paper proposes VisionISP, a set of methods that repurpose the ISP for computer vision applications. VisionISP changes the traditional ISP in the following ways:

- **Vision-Driven Denoising (Vision Denoiser)**: Tunes the ISP's denoising parameters for the performance of the computer vision task rather than for image quality as perceived by humans.
- **Vision Local Tone Mapping (VLTM)**: Applies a non-linear transformation and detail enhancement so that key information in the image is retained while the bit-depth is reduced.
- **Trainable Vision Scaler (TVS)**: Uses a neural network framework to downscale the image while preserving important low-level features and supporting flexible scaling ratios.

With these changes, VisionISP reduces data transmission requirements and improves the performance of the subsequent computer vision system. The reported experiments cover object detection in an autonomous driving setting.
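To make the bit-depth and resolution argument concrete, the following is a minimal sketch and not the paper's method. It assumes a hypothetical 12-bit input reduced to 8 bits with a 2x downscale per axis (none of these numbers come from the paper), and it uses a plain global power-law curve as a stand-in for a non-linear transformation; it is not VLTM. All function names are illustrative. It shows how much raw data volume shrinks, and why naively dropping bits can merge nearby dark pixel values that a non-linear curve keeps apart.

```python
def data_reduction_factor(in_bits, out_bits, scale):
    """Ratio of raw data volume before vs. after reducing the bit depth
    and downscaling each spatial dimension by `scale`."""
    return (in_bits / out_bits) * scale * scale


def linear_requantize(value, in_bits, out_bits):
    """Naive bit-depth reduction: drop the least significant bits."""
    return value >> (in_bits - out_bits)


def gamma_requantize(value, in_bits, out_bits, gamma=1 / 2.2):
    """Bit-depth reduction after a global power-law curve.

    Illustrative assumption only: a stand-in for a non-linear
    transformation, not the paper's VLTM block.
    """
    in_max = (1 << in_bits) - 1
    out_max = (1 << out_bits) - 1
    return round(((value / in_max) ** gamma) * out_max)


# Hypothetical setup: 12-bit data reduced to 8 bits, 2x downscale per axis.
factor = data_reduction_factor(in_bits=12, out_bits=8, scale=2)

# Two nearby dark pixel values in the 12-bit image.
dark_a, dark_b = 16, 31
linear = (linear_requantize(dark_a, 12, 8), linear_requantize(dark_b, 12, 8))
curved = (gamma_requantize(dark_a, 12, 8), gamma_requantize(dark_b, 12, 8))

print(f"data volume reduced by a factor of {factor}")
print(f"linear truncation: {linear}")
print(f"power-law curve:   {curved}")
```

In this setup the raw data volume drops by a factor of 6 (1.5x from the bit depth, 4x from the resolution). Linear truncation maps both dark values to the same 8-bit code, while the power-law curve keeps them distinct, which illustrates why a non-linear step before bit-depth reduction can help retain shadow detail.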