Deep-based Fisher Vector for Mobile Visual Search

Chen Huang, Shengchuan Zhang, Xianming Lin, Xiangrong Liu, Rongrong Ji
DOI: https://doi.org/10.1109/icip.2017.8296919
2017-01-01
Abstract: We tackle the problem of mobile visual search. The Moving Picture Experts Group (MPEG) has completed a standard named Compact Descriptors for Visual Search (CDVS), which provides a standardized syntax for image retrieval applications. CDVS applies principal component analysis (PCA) to reduce the dimensionality of local feature descriptors before they enter the global descriptor pipeline, and uses the traditional Fisher vector (FV) to aggregate the local feature descriptors. However, the descriptor components of SIFT and FV have highly non-Gaussian statistics, and applying a single PCA transform can in fact hurt compression performance at high rates. We develop a network-based architecture that combines neural networks with an FV layer to obtain the Fisher vector. Our architecture has two advantages over the CDVS global descriptor pipeline: first, we employ autoencoder networks to reduce the dimensionality of the data; second, we use a trainable system to learn the parameters after the FV codebook has been obtained. Experiments demonstrate a clear advantage of the proposed architecture on the CDVS retrieval task.
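For readers unfamiliar with the aggregation step mentioned in the abstract, the following is a minimal sketch of the textbook Fisher vector encoding over a diagonal-covariance Gaussian mixture model (GMM). It is not the trainable FV layer proposed in the paper, nor the exact CDVS pipeline. The function names (`gaussian_log_pdf`, `posteriors`, `fisher_vector`), the power and L2 normalization, and the assumption that the descriptors have already been reduced in dimension (by PCA or an autoencoder) are all illustrative choices, not details taken from the paper.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * math.log(2.0 * math.pi) - math.log(s) - 0.5 * ((xi - m) / s) ** 2
        for xi, m, s in zip(x, mean, std)
    )


def posteriors(x, weights, means, stds):
    """Soft assignment of descriptor x to each GMM component."""
    logs = [
        math.log(w) + gaussian_log_pdf(x, m, s)
        for w, m, s in zip(weights, means, stds)
    ]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(v - top) for v in logs]
    total = sum(exps)
    return [e / total for e in exps]


def fisher_vector(descriptors, weights, means, stds):
    """Aggregate local descriptors into one Fisher vector of length 2*K*D.

    Assumption: descriptors are already dimension-reduced, and the GMM
    (weights, means, per-dimension standard deviations) is the FV codebook.
    """
    if not descriptors:
        raise ValueError("at least one descriptor is required")
    K, D, T = len(weights), len(means[0]), len(descriptors)
    grad_mu = [[0.0] * D for _ in range(K)]
    grad_sigma = [[0.0] * D for _ in range(K)]

    for x in descriptors:
        gamma = posteriors(x, weights, means, stds)
        for k in range(K):
            for d in range(D):
                z = (x[d] - means[k][d]) / stds[k][d]
                grad_mu[k][d] += gamma[k] * z                 # gradient w.r.t. mean
                grad_sigma[k][d] += gamma[k] * (z * z - 1.0)  # gradient w.r.t. std

    fv = []
    for k in range(K):
        fv.extend(g / (T * math.sqrt(weights[k])) for g in grad_mu[k])
    for k in range(K):
        fv.extend(g / (T * math.sqrt(2.0 * weights[k])) for g in grad_sigma[k])

    # Power normalization followed by L2 normalization (an illustrative choice).
    fv = [math.copysign(math.sqrt(abs(v)), v) for v in fv]
    norm = math.sqrt(sum(v * v for v in fv))
    return [v / norm for v in fv] if norm > 0 else fv


# Toy usage: three 2-D descriptors and a hypothetical two-component codebook.
toy_fv = fisher_vector(
    descriptors=[[0.1, -0.2], [1.9, 2.1], [0.0, 0.3]],
    weights=[0.5, 0.5],
    means=[[0.0, 0.0], [2.0, 2.0]],
    stds=[[1.0, 1.0], [1.0, 1.0]],
)
print(len(toy_fv))
```

In this sketch the codebook is fixed once it has been estimated. The abstract's second claimed advantage is that the proposed system continues to learn parameters after the codebook is obtained, which this example does not attempt to reproduce.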