Information-Theoretic Structure For Visual Signal Understanding

Yue Deng
DOI: https://doi.org/10.1007/978-3-662-44526-6_6
2015-01-01
Abstract: In this chapter, we investigate an information-theoretic structure for visual information understanding. The bag-of-features method provides a flexible way to extract the contents of an image in a data-driven manner for visual recognition. One central task in this framework is codeword assignment, which allocates local image descriptors to the most similar codewords in the dictionary to generate a histogram for categorization. Nevertheless, existing assignment approaches suffer from two problems: (1) an overly strong Euclidean assumption and (2) neglect of the label information of the local descriptors. To address these two challenges, we propose a graph assignment method with maximal mutual information (GAMI) regularization. GAMI exploits manifold structure to better reveal the relationships among a massive number of local features through a non-linear graph metric. Meanwhile, the mutual information of descriptor-label pairs is optimized in the embedding space to enhance the discriminative power of the selected codewords. Following this objective, two optimization models, inexact-GAMI and exact-GAMI, are proposed. The inexact model can be solved efficiently with a closed-form solution. The stricter exact-GAMI estimates the entropy of descriptor-label pairs nonparametrically in the embedding space and thus leads to a more complicated but still tractable optimization. The effectiveness of the GAMI models is verified on both public datasets and our own.
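For orientation, the conventional codeword assignment step that the abstract criticizes can be sketched as follows. This is only an illustration of the Euclidean hard-assignment baseline, not the GAMI method itself; the function names `assign_codewords` and `bof_histogram` and the toy data are assumptions introduced here and do not come from the chapter.

```python
import numpy as np

def assign_codewords(descriptors, codebook):
    """Hard-assign each descriptor to its nearest codeword (Euclidean baseline)."""
    # Pairwise squared distances, shape (n_descriptors, n_codewords).
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    return np.argmin(dists, axis=1)

def bof_histogram(descriptors, codebook):
    """Build a normalized bag-of-features histogram for one image."""
    idx = assign_codewords(descriptors, codebook)
    counts = np.bincount(idx, minlength=codebook.shape[0]).astype(float)
    return counts / counts.sum()

# Toy example (hypothetical data): 3 codewords, 4 local descriptors in 2-D.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
descriptors = np.array([[0.1, -0.1], [0.9, 1.2], [4.8, 5.1], [1.1, 0.8]])
hist = bof_histogram(descriptors, codebook)
```

The sketch uses neither a graph metric nor descriptor labels, which are the two limitations the abstract says GAMI is designed to address.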