Semantic image understanding: from pixel to word

Hao Fu
2012-12-13
Abstract: The aim of semantic image understanding is to reveal the semantic meaning behind image pixels. This thesis investigates problems related to semantic image understanding and makes the following contributions. Our first contribution is to propose the use of histogram matching in Multiple Kernel Learning (MKL). We treat the two-dimensional kernel matrix as an image and transfer the histogram matching algorithm from image processing to the kernel matrix. Experiments on various computer vision and machine learning datasets show that our method can always boost the performance of state-of-the-art MKL methods. Our second contribution is to advocate the segment-then-recognize strategy in pixel-level semantic image understanding. We have developed a new framework that aims to integrate semantic segmentation with low-level segmentation to propose object-consistent regions. We have also developed a novel method that aims to integrate semantic segmentation with interactive segmentation. We found that this segment-then-recognize strategy also works well on medical image data, for which we designed a novel polar space random field model for proposing gland-like regions. In the realm of image-level semantic image understanding, our contribution is a novel way to utilize the random forest. Most previous works that use random forests store the posterior probabilities at each leaf node and treat the trees in the forest as independent of one another. In contrast, we store the training samples instead of the posterior probabilities at each leaf node. We consider the random forest as a whole and propose the concepts of semantic nearest neighbor and semantic similarity measure. Based on these two concepts, we devise novel methods for image annotation and image retrieval tasks.
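To make the first contribution concrete, the following is a minimal sketch of histogram matching applied to a kernel matrix. The abstract does not specify the exact procedure, so this sketch assumes a standard rank-based (quantile) mapping, as commonly used for grayscale images. The function names `match_histogram` and `match_kernel` are hypothetical and do not come from the thesis. Note that the remapped matrix is symmetric but is not guaranteed to stay positive semidefinite; how the thesis handles this is not stated in the abstract.

```python
import numpy as np


def match_histogram(source, reference):
    """Remap `source` values so their empirical distribution follows `reference`.

    Assumption: plain rank-based quantile mapping, not necessarily the
    variant used in the thesis.
    """
    src = np.asarray(source, dtype=float).ravel()
    ref = np.sort(np.asarray(reference, dtype=float).ravel())
    # Rank of each source value (ties broken by position), scaled to [0, 1].
    ranks = np.argsort(np.argsort(src, kind="stable"), kind="stable")
    quantiles = ranks / max(src.size - 1, 1)
    # Read off the reference value found at the same quantile.
    ref_quantiles = np.linspace(0.0, 1.0, ref.size)
    return np.interp(quantiles, ref_quantiles, ref)


def match_kernel(K_source, K_reference):
    """Histogram-match a symmetric kernel matrix to a reference kernel matrix.

    Only the upper triangle (including the diagonal) is remapped and then
    mirrored, so the result stays symmetric.
    """
    K_source = np.asarray(K_source, dtype=float)
    K_reference = np.asarray(K_reference, dtype=float)
    iu_src = np.triu_indices_from(K_source)
    iu_ref = np.triu_indices_from(K_reference)
    matched = np.zeros_like(K_source)
    matched[iu_src] = match_histogram(K_source[iu_src], K_reference[iu_ref])
    # Copy the strict upper triangle into the lower triangle.
    return matched + np.triu(matched, k=1).T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(6, 3))
    K_linear = X @ X.T
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K_rbf = np.exp(-0.5 * sq_dists)
    # Give the linear kernel the value distribution of the RBF kernel.
    print(match_kernel(K_linear, K_rbf).round(3))
```

In this sketch the ordering of kernel entries is preserved while their value distribution is replaced by that of the reference kernel, which puts kernels with very different scales on a common footing before they are combined.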