Online Multimodal Co-indexing and Retrieval of Social Media Data

Lei Meng, Ah-Hwee Tan, Donald C. Wunsch II
DOI: https://doi.org/10.1007/978-3-030-02985-2_7
2019-01-01
Abstract: Effective indexing of social media data is key to searching for information on the social Web. However, the characteristics of social media data make it a challenging task. The large-scale and streaming nature is the first challenge, which requires the indexing algorithm to be able to efficiently update the indexing structure when receiving data streams. The second challenge is utilizing the rich meta-information of social media data for a better evaluation of the similarity between data objects and for a more semantically meaningful indexing of the data, which may allow the users to search for them using the different types of queries they like. Existing approaches based on either matrix operations or hashing usually cannot perform an online update of the indexing base to encode upcoming data streams, and they have difficulty handling noisy data. This chapter presents a study on using the Online Multimodal Co-indexing Adaptive Resonance Theory (OMC-ART) for an effective and efficient indexing and retrieval of social media data. More specifically, two types of social media data are considered: (1) the weakly supervised image data, which is associated with captions, tags and descriptions given by the users; and (2) the e-commerce product data, which includes product images, titles, descriptions and user comments. These scenarios make this study related to multimodal web image indexing and retrieval. Compared with existing studies, OMC-ART has several distinct characteristics. First, OMC-ART is able to perform online learning of sequential data.
Second, instead of a plain indexing structure, OMC-ART builds a two-layer one, in which the first layer co-indexes the images by the key visual and textual features based on the generalized distributions of the clusters they belong to; while in the second layer, the data objects are co-indexed by their own feature distributions. Third, OMC-ART enables flexible multimodal searching by using either visual features, keywords, or a combination of both. Fourth, OMC-ART employs a ranking algorithm that does not need to go through the whole indexing system when only a limited number of images need to be retrieved. Experiments on two publicly accessible image datasets and a real-world e-commerce dataset demonstrate the efficiency and effectiveness of OMC-ART. The content of this chapter is summarized and extended from [13] (https://doi.org/10.1145/2671188.2749362), and the Python codes of OMC-ART with examples on building an e-commerce product search engine are available at https://github.com/Lei-Meng/OMC-ART-Build-a-toy-online-search-engine- .
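
The two-layer idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (their code is in the repository linked above); it is a toy stand-in built on assumptions. The class name `ToyCoIndex`, the `vigilance` and `rate` parameters, the fuzzy-AND `match` function, and the four-dimensional feature layout (two visual values followed by two keyword indicators) are all hypothetical choices made for illustration. The sketch shows three things the abstract mentions: items are added one at a time, a first layer holds a generalized profile per cluster while a second layer holds each item's own features, and a search stops visiting clusters once enough results have been collected.

```python
from dataclasses import dataclass, field


def match(x, w):
    """Fuzzy-AND overlap |min(x, w)| / |x| for vectors with entries in [0, 1].

    Normalizing by the query magnitude lets a query leave a whole channel
    (visual or textual) at zero without being penalized for it.
    """
    total = sum(x)
    return sum(min(a, b) for a, b in zip(x, w)) / total if total else 0.0


@dataclass
class Cluster:
    weights: list                                # layer 1: generalized cluster profile
    members: dict = field(default_factory=dict)  # layer 2: item id -> item's own features


class ToyCoIndex:
    """Hypothetical two-layer index; an assumption-laden sketch, not OMC-ART itself."""

    def __init__(self, vigilance=0.6, rate=0.5):
        self.vigilance = vigilance  # assumed threshold for joining an existing cluster
        self.rate = rate            # assumed learning rate for the profile update
        self.clusters = []

    def add(self, item_id, features):
        """Online update: place one incoming item without rebuilding the index."""
        best = max(self.clusters, key=lambda c: match(features, c.weights), default=None)
        if best is None or match(features, best.weights) < self.vigilance:
            best = Cluster(weights=list(features))
            self.clusters.append(best)
        else:
            # Move the profile toward the element-wise minimum of profile and item.
            best.weights = [
                (1 - self.rate) * w + self.rate * min(w, f)
                for w, f in zip(best.weights, features)
            ]
        best.members[item_id] = list(features)

    def search(self, query, k):
        """Rank clusters first, then items inside them; stop once k items are found."""
        results = []
        ranked = sorted(self.clusters, key=lambda c: match(query, c.weights), reverse=True)
        for cluster in ranked:
            hits = sorted(
                cluster.members,
                key=lambda i: match(query, cluster.members[i]),
                reverse=True,
            )
            results.extend(hits)
            if len(results) >= k:
                break  # lower-ranked clusters are never visited
        return results[:k]


# Assumed feature layout: [visual_1, visual_2, keyword_red, keyword_bag]
index = ToyCoIndex(vigilance=0.6)
index.add("red_shoe_1", [0.9, 0.1, 1.0, 0.0])
index.add("red_shoe_2", [0.8, 0.2, 1.0, 0.0])
index.add("blue_bag_1", [0.1, 0.9, 0.0, 1.0])

keyword_only = index.search([0.0, 0.0, 1.0, 0.0], k=2)
visual_only = index.search([0.8, 0.2, 0.0, 0.0], k=1)
combined = index.search([0.1, 0.9, 0.0, 1.0], k=1)
```

In this toy version the early stop is a heuristic: an item in a lower-ranked cluster could in principle match the query better than one already returned. How the chapter's ranking algorithm handles that case is not stated in the abstract.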