Multi-modal, multi-resource methods for placing Flickr videos on the map

Pascal Kelm,Sebastian Schmiedeke,Thomas Sikora
DOI: https://doi.org/10.1145/1991996.1992048
2011-01-01
Abstract:We present three approaches for placing videos in Flickr on the world map. The toponym extraction and geo lookup approach makes use of external resources to identify toponyms in the metadata and associate them with geo-coordinates. The metadata-based region model approach uses a k-nearest-neighbour classifier trained over geographical regions. Videos are represented using their metadata in a text space with reduced dimensionality. The visual region model approach uses a support vector machine also trained over geographical regions. Videos are represented using low-level feature vectors from multiple key frames. Voting methods are used to form a single decision for each video. We compare the approaches experimentally, highlighting the importance of using appropriate metadata features and suitable regions as the basis of the region model. The best performance is achieved by the geo-lookup approach used with fallback to the visual region model when the video metadata contains no toponym.
What problem does this paper attempt to address?