Multi-modal Feature Fusion for Geographic Image Annotation.

Ke Li, Changqing Zou, Shuhui Bu, Yun Liang, Jian Zhang, Minglun Gong
DOI: https://doi.org/10.1016/j.patcog.2017.06.036
IF: 8
2017-01-01
Pattern Recognition
Abstract:
• Multi-modal feature construction: for the shallow modality, we propose a mixed shallow feature model that combines Color, LBP, and SIFT features to represent the extrinsic visual properties of geographic images; for the deep modality, we design a specialized DCNN to extract the intrinsic semantic information of geographic images.
• Multi-modal feature fusion: we propose a multi-modal feature fusion model based on DBNs and RBM to build a powerful joint representation for geographic images. The model has been shown to be effective at capturing both the intrinsic and extrinsic semantic information.
• Open geographic image dataset: we have built a geographic image dataset that contains 300 images (600 × 600) in six typical areas such as urban, rural, and mountain.
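To illustrate the idea of a mixed shallow feature, here is a minimal sketch that concatenates a color histogram with an LBP histogram for one image. It is not the authors' implementation: the abstract does not give bin counts, LBP variants, or normalization, so those choices are assumptions, the function names are hypothetical, and the SIFT component is omitted to keep the example dependent on NumPy only.

```python
import numpy as np

def color_histogram(img, bins=8):
    """Per-channel intensity histogram, L1-normalized (bins=8 is an assumption)."""
    feats = []
    for c in range(img.shape[2]):
        hist, _ = np.histogram(img[..., c], bins=bins, range=(0, 256))
        feats.append(hist)
    h = np.concatenate(feats).astype(np.float64)
    return h / max(h.sum(), 1.0)

def lbp_histogram(gray):
    """Basic 8-neighbor LBP code histogram (256 bins), L1-normalized."""
    g = gray.astype(np.int32)
    height, width = g.shape
    center = g[1:-1, 1:-1]
    codes = np.zeros_like(center)
    # Neighbors visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = g[1 + dy:height - 1 + dy, 1 + dx:width - 1 + dx]
        # Set this bit wherever the neighbor is at least as bright as the center.
        codes |= (neighbor >= center).astype(np.int32) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / max(hist.sum(), 1.0)

def mixed_shallow_feature(img):
    """Concatenate color and texture descriptors into one feature vector."""
    gray = img.mean(axis=2)  # simple channel average as a grayscale proxy
    return np.concatenate([color_histogram(img), lbp_histogram(gray)])
```

For an H × W × 3 `uint8` image this returns a vector of length 3 × 8 + 256 = 280. In a fusion setting such as the one the abstract describes, a vector like this would be one input modality alongside the DCNN features.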