EFRNet: Efficient Feature Reconstructing Network for Real-Time Scene Parsing
Xin Li, Fan Yang, Ao Luo, Zhicheng Jiao, Hong Cheng, Zicheng Liu
DOI: https://doi.org/10.1109/tmm.2021.3089422
IF: 7.3
2021-01-01
IEEE Transactions on Multimedia
Abstract: In this paper, we introduce a lightweight and powerful convolutional neural network, termed the efficient feature reconstructing network (EFRNet), for real-time scene parsing. Our key idea is to decompose the process of learning high-resolution representations into two stages: i) bottom-up codebook and coding-matrix learning and ii) top-down feature reconstruction. Specifically, the bottom-up process learns image-specific codewords (a codebook) from deep-layer features and generates a coding matrix from the shallow-layer feature map. In the top-down process, the learned codebook and coding matrix are used to rebuild high-resolution features via a lightweight feature reconstructing operator (FRO). In addition, EFRNet is built on a new building block, named the efficient adaptive abstraction (EAA) block, which further reduces the overall number of network parameters and yields a significant speedup. Extensive experiments are conducted on challenging benchmarks, including CamVid and Cityscapes. The results show that EFRNet achieves state-of-the-art performance with an optimal balance between accuracy and speed.
computer science, information systems, telecommunications, software engineering
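
The two-stage idea in the abstract can be illustrated with a small sketch. The abstract does not give the actual formulation of the FRO, so everything concrete here is an assumption made for illustration: the helper names (`learn_codebook`, `coding_matrix`, `reconstruct`), the random projection weights `w_assign` and `w_code` standing in for learned layers, the use of softmax normalization, and the toy sizes. It only shows the general pattern of building K codewords from a low-resolution deep feature map, computing a per-pixel coding matrix from a high-resolution shallow map, and multiplying the two to obtain high-resolution features.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply an (n x m) matrix by an (m x p) matrix, both lists of rows."""
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def learn_codebook(deep, w_assign):
    """Bottom-up step (assumed form): deep is (N_deep x C), w_assign is (C x K).

    Each codeword is a softmax-weighted average of deep-layer features,
    so the codebook depends on the input image. Returns (K x C).
    """
    logits = matmul(deep, w_assign)                        # N_deep x K
    weights = [softmax(col) for col in transpose(logits)]  # K x N_deep
    return matmul(weights, deep)                           # K x C


def coding_matrix(shallow, w_code):
    """Bottom-up step (assumed form): shallow is (N_shallow x C_s), w_code is (C_s x K).

    Returns (N_shallow x K); each row sums to 1 and says how strongly
    a high-resolution position uses each codeword.
    """
    return [softmax(row) for row in matmul(shallow, w_code)]


def reconstruct(coding, codebook):
    """Top-down step (assumed form): (N_shallow x K) times (K x C)."""
    return matmul(coding, codebook)


def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
deep = rand_matrix(4, 8)      # toy 2x2 deep map, flattened, 8 channels
shallow = rand_matrix(64, 3)  # toy 8x8 shallow map, flattened, 3 channels
w_assign = rand_matrix(8, 5)  # hypothetical projection to K = 5 codewords
w_code = rand_matrix(3, 5)    # hypothetical projection to K = 5 codes

codebook = learn_codebook(deep, w_assign)
coding = coding_matrix(shallow, w_code)
high_res = reconstruct(coding, codebook)
print(len(high_res), len(high_res[0]))  # 64 8
```

In this sketch the reconstruction costs on the order of N_shallow x K x C multiplications, so a small K keeps the high-resolution step cheap; whether this matches the cost profile of the actual FRO is not stated in the abstract.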