Semantic Translation With Convolutional Encoder-Decoder Networks For Viewpoint Estimation

Liang-Jun Zhang, Changjian Gu, Chaochen Gu, Kaijie Wu, Xinping Guan
DOI: https://doi.org/10.1109/ASCC.2017.8287423
2017-01-01
Abstract: Viewpoint estimation is an essential procedure in vision-based robotic manipulation. To address the scarcity of feature points on textureless objects, which hinders the generalization of classical methods, we propose a new viewpoint estimation pipeline that uses semantic translation to highlight the structures of interest (SOIs) as foreground. In our method, a convolutional encoder-decoder network serves as the generator for semantic segmentation, and we explore an adversarial training strategy that uses a conditional adversarial network as the discriminator to obtain finer details. We also contribute a dataset for the experiments and perform viewpoint estimation based on the semantic outputs. Furthermore, we deploy our pipeline on a robotic eye-in-hand system to complete a viewpoint transfer task. The experimental results show that the proposed method (1) extracts features from textureless objects, (2) improves semantic translation through adversarial training, and (3) is applicable to real robotic manipulation tasks.