Large Scale Cross-Media Data Retrieval Based on Hadoop

Wenchen Cheng, Jiang Qian, Zhicheng Zhao, Fei Su
DOI: https://doi.org/10.4108/eai.19-8-2015.2260108
2015-01-01
EAI Endorsed Transactions on Cloud Systems
Abstract: With the rapid development of the Internet and the fast growth of data volumes, more and more data-intensive applications must process hundreds of megabytes of data. Retrieving results from cross-media data quickly and accurately is therefore essential. This paper proposes a large-scale cross-media data retrieval method based on Hadoop to speed up retrieval. We divide cross-media feature extraction and cross-media retrieval into parallel pipelines and implement them with a combination of HDFS, HBase, and the MapReduce framework. To verify the performance of the proposed method, we compare it with a stand-alone mode on image datasets of different sizes. The experimental results show that the proposed method sharply reduces retrieval time while keeping the same query precision.
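
The abstract does not give implementation details, so the following is only a minimal sketch of how a retrieval step could be split into a map phase and a reduce phase. It runs in plain Python and does not call Hadoop, HDFS, or HBase. The Euclidean distance, the top-k merge, and all names (`map_partition`, `reduce_topk`, `retrieve`) are assumptions made for illustration, not the authors' method.

```python
import heapq
import math
from typing import Dict, List, Tuple

Feature = List[float]
Scored = Tuple[float, str]  # (distance to query, item id)


def euclidean(a: Feature, b: Feature) -> float:
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def map_partition(query: Feature, partition: Dict[str, Feature], k: int) -> List[Scored]:
    """Map step: score every item in one data partition, keep the local top-k."""
    scored = ((euclidean(query, feat), item_id) for item_id, feat in partition.items())
    return heapq.nsmallest(k, scored)


def reduce_topk(partials: List[List[Scored]], k: int) -> List[Scored]:
    """Reduce step: merge the local top-k lists into one global top-k list."""
    return heapq.nsmallest(k, (pair for part in partials for pair in part))


def retrieve(query: Feature, partitions: List[Dict[str, Feature]], k: int) -> List[Scored]:
    """Run the map step on each partition, then reduce the partial results.

    In a real cluster each partition would be handled by a separate mapper;
    here the mappers are simulated sequentially.
    """
    partials = [map_partition(query, p, k) for p in partitions]
    return reduce_topk(partials, k)


# Hypothetical toy data: two partitions of 2-D image features.
partitions = [
    {"a": [0.0, 0.0], "b": [3.0, 4.0]},
    {"c": [1.0, 0.0], "d": [10.0, 10.0]},
]
results = retrieve([0.0, 0.0], partitions, k=2)
```

Because each partition keeps only its own k best candidates, the reduce step sees at most k entries per partition, and the merged result is the same as a single-machine scan over all items. This is consistent with the abstract's claim that the parallel version preserves query precision.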