Light Field Image Compression Using Multi-branch Spatial Transformer Networks Based View Synthesis

Jin Wang, Qianwen Wang, Ruiqin Xiong, Qing Zhu, Baocai Yin
DOI: https://doi.org/10.1109/dcc47342.2020.00047
2020-03-01
Abstract: In this paper, we propose a novel light field image compression scheme using view synthesis based on multi-branch spatial transformer networks. First, a sparse subset of views is selected and rearranged into a pseudo sequence, which is encoded by a video codec at the encoder. The remaining unselected views are then synthesized at the decoder with our proposed method, based on the similarity between neighboring views. To better characterize the non-linear relationship between the sub-views, a multi-branch spatial transformer network (MSTN) is designed to adaptively learn the affine transformations between neighboring views, which are used to warp the input views into accurate high-order approximations of the target views. The advantages of MSTN are that it does not explicitly require depth information from the input views and that it has a better capacity for transforming data spatially than previous methods. Moreover, to better obtain the final view from the generated high-order approximation views, a Wasserstein generative adversarial network (WGAN) with improved training is applied after MSTN. The WGAN directly maps the high-order approximations output by MSTN to the generated high-quality view. Experimental results show that our scheme achieves superior compression performance compared with state-of-the-art methods.
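The encoder-side step described in the abstract (select a sparse subset of views, then arrange them into a pseudo sequence for a video codec) can be illustrated with a minimal sketch. The abstract does not state the sampling pattern or the scan order, so the regular sub-sampling with `step=2` and the serpentine ordering below are assumptions chosen for illustration, and the function names are hypothetical.

```python
def select_sparse_views(rows, cols, step=2):
    """Return (row, col) indices of the views kept for encoding.

    Assumption: every `step`-th view in each direction is kept.
    """
    return [(r, c) for r in range(0, rows, step) for c in range(0, cols, step)]


def serpentine_order(views):
    """Order views row by row, reversing every other row.

    Assumption: a serpentine scan, so that consecutive frames in the
    pseudo sequence come from spatially adjacent views.
    """
    by_row = {}
    for r, c in views:
        by_row.setdefault(r, []).append(c)
    ordered = []
    for i, r in enumerate(sorted(by_row)):
        cols_in_row = sorted(by_row[r], reverse=(i % 2 == 1))
        ordered.extend((r, c) for c in cols_in_row)
    return ordered


def split_views(rows, cols, step=2):
    """Split a rows x cols view grid into (pseudo sequence, views to synthesize)."""
    selected = serpentine_order(select_sparse_views(rows, cols, step))
    chosen = set(selected)
    synthesized = [
        (r, c) for r in range(rows) for c in range(cols) if (r, c) not in chosen
    ]
    return selected, synthesized


if __name__ == "__main__":
    pseudo_sequence, to_synthesize = split_views(5, 5)
    print("encode in this order:", pseudo_sequence)
    print("views left for decoder-side synthesis:", len(to_synthesize))
```

In this sketch the frames at the listed positions would be passed to a video codec in the given order, and the remaining positions are the views the decoder would have to synthesize from their decoded neighbors.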