Deep Reinforcement Learning in Computer Vision: A Comprehensive Survey

Ngan Le, Vidhiwar Singh Rathour, Kashu Yamazaki, Khoa Luu, Marios Savvides
DOI: https://doi.org/10.48550/arXiv.2108.11510
2021-08-26
Abstract: Deep reinforcement learning augments the reinforcement learning framework and utilizes the powerful representation of deep neural networks. Recent works have demonstrated the remarkable successes of deep reinforcement learning in various domains including finance, medicine, healthcare, video games, robotics, and computer vision. In this work, we provide a detailed review of recent and state-of-the-art research advances of deep reinforcement learning in computer vision. We start by reviewing the theories of deep learning, reinforcement learning, and deep reinforcement learning. We then propose a categorization of deep reinforcement learning methodologies and discuss their advantages and limitations. In particular, we divide deep reinforcement learning into seven main categories according to their applications in computer vision, i.e., (i) landmark localization; (ii) object detection; (iii) object tracking; (iv) registration on both 2D image and 3D image volumetric data; (v) image segmentation; (vi) video analysis; and (vii) other applications. Each of these categories is further analyzed with respect to reinforcement learning techniques, network design, and performance. Moreover, we provide a comprehensive analysis of the existing publicly available datasets and examine source code availability. Finally, we present some open issues and discuss future research directions on deep reinforcement learning in computer vision.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

The paper "Deep Reinforcement Learning in Computer Vision: A Comprehensive Survey" aims to provide a comprehensive review of the latest research advancements in the field of deep reinforcement learning (DRL) within computer vision. Specifically, the paper attempts to address the following issues:

1. **Theoretical Understanding**: Gain an in-depth understanding of the fundamental theories of deep learning, reinforcement learning, and deep reinforcement learning.
2. **Method Classification**: Propose a method for classifying deep reinforcement learning approaches and discuss the advantages and limitations of these methods.
3. **Application Analysis**: Categorize the applications of deep reinforcement learning into 7 main categories, including:
   - **Landmark Localization**
   - **Object Detection**
   - **Object Tracking**
   - **Registration on both 2D image and 3D image volumetric data**
   - **Image Segmentation**
   - **Video Analysis**
   - **Other Applications**
4. **Datasets and Code**: Provide a comprehensive analysis of existing public datasets and examine the availability of source code.
5. **Future Directions**: Propose some open questions and discuss future research directions.

By addressing these issues, the paper aims to provide readers with an in-depth understanding of the principles of deep reinforcement learning and a comprehensive coverage of the latest application examples, particularly in computer vision tasks.