Efficient Multimedia Similarity Measurement Using Similar Elements

Chengyuan Zhang, Yunwu Lin, Lei Zhu, Zuping Zhang, Xinpan Yuan, Fang Huang
DOI: https://doi.org/10.48550/arXiv.1809.03867
2018-09-08
Abstract: Online social networking techniques and large-scale multimedia systems are developing rapidly, which has not only brought great convenience to our daily life but has also caused large-scale multimedia data to be generated, collected, and stored. This trend has put forward higher requirements and greater challenges for massive multimedia data retrieval. In this paper, we investigate the problem of image similarity measurement, which is used in many applications. First, we propose the definition of similarity measurement of images and the related notions. Based on it, we present a novel basic method of similarity measurement named SMIN. To improve the performance of calculation, we propose a novel indexing structure called SMI Temp Index (SMII for short). In addition, we establish an index of potentially similar visual words offline to solve the problem that the index cannot be reused. Experimental evaluations on two real image datasets demonstrate that our solution outperforms the state-of-the-art method.
Multimedia
What problem does this paper attempt to address?
This paper attempts to solve the problem of image similarity measurement in large-scale multimedia data retrieval. With the rapid development of online social networking technology and large-scale multimedia systems, a vast amount of multimedia data is generated, collected, and stored in daily life, which poses higher requirements and greater challenges for large-scale multimedia data retrieval. In image retrieval and image matching in particular, measuring the similarity between images efficiently and accurately has become a key issue.

### Main contributions of the paper:

1. **Define image similarity measurement and related concepts**:
   - The authors first introduce the definition of image similarity measurement and related concepts, and design an image similarity calculation function.
2. **Propose a basic image similarity measurement method (SMIN)**:
   - To improve the performance of similarity measurement, the authors propose a basic method named SMIN and then optimize it.
3. **Design new index structures (SMI Temp Index and PSMI)**:
   - To further optimize computational performance, the authors design a new index structure, SMI Temp Index (abbreviated as SMII). They also build an index of potentially similar visual words, which solves the problem that the index cannot be reused and thereby improves computational efficiency.
4. **Experimental verification**:
   - The authors carry out extensive experiments on two real image datasets, and the results show that their method outperforms the existing state-of-the-art method.

### Core content of the paper:

- **Problem definition**: By defining concepts such as image objects, visual word similarity, and similar visual word pairs, the paper establishes the theoretical basis for image similarity measurement.
- **Algorithm design**: A basic measurement method, SMIN, based on visual word similarity is proposed. Optimized index structures (such as SMII and PSMI) reduce the computational complexity and improve efficiency.
- **Experimental evaluation**: Performance evaluations are carried out on two datasets, Flickr and ImageNet, to verify the effectiveness and superiority of the proposed method.

### Formula summary:

- **Image similarity measurement formula**:

\[ \text{Sim}_I(I_i(W_i), I_j(W_j))=\frac{\sum_{k = 1}^{l}\lambda_k\xi_i^k\xi_j^k}{\sqrt{\sum_{k = 1}^{m}\xi_i^k\sum_{k = 1}^{n}\xi_j^k}}\cdot\frac{\sum_{k = 1}^{l}\lambda_k^2\xi_i^k\xi_j^k+\sum_{k = l + 1}^{m}\xi_i^k\sum_{k = l + 1}^{n}\xi_j^k} \]

where \( m \) and \( n \) are the numbers of visual words in images \( I_i(W_i) \) and \( I_j(W_j) \) respectively, \( l \) is the number of similar visual word pairs, \( \Lambda=\{\lambda_1,\lambda_2,\dots,\lambda_l\} \) is the set of pair similarities, and \( \Xi_i = \{\xi_i^1,\xi_i^2,\dots,\xi_i^l\} \) and \( \Xi_j=\{\xi_j^1,\xi_j^2,\dots,\xi_j^l\} \) are the sets of visual word weights in the two images respectively.

Through these improvements, the paper provides an efficient image similarity measurement method that can better cope with the challenges of large-scale multimedia data retrieval.
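
To make the notation concrete, here is a minimal Python sketch that evaluates only the first factor of the formula as summarized above, that is, \( \sum_{k=1}^{l}\lambda_k\xi_i^k\xi_j^k \) divided by \( \sqrt{\sum_{k=1}^{m}\xi_i^k\sum_{k=1}^{n}\xi_j^k} \). The second factor is not fully specified in this summary, so it is deliberately left out. The function name `first_factor`, the dictionary representation of an image, and the example weights are all assumptions made for illustration and are not taken from the paper. How the similar visual word pairs are selected is also not covered here, so they are simply passed in.

```python
from math import sqrt


def first_factor(weights_i, weights_j, similar_pairs):
    """Evaluate the first factor of the summarized Sim_I formula.

    weights_i, weights_j: dicts mapping a visual-word id to its weight (xi)
        in image i and image j (hypothetical representation).
    similar_pairs: list of (word_in_i, word_in_j, lam) tuples, one per
        similar visual word pair, where lam is the pair's similarity (lambda).
    """
    # Numerator: sum over the l similar pairs of lambda_k * xi_i^k * xi_j^k.
    numerator = sum(
        lam * weights_i[a] * weights_j[b] for a, b, lam in similar_pairs
    )
    # Denominator: sqrt of (sum of all weights in i) * (sum of all weights in j),
    # exactly as written in the summary above.
    denominator = sqrt(sum(weights_i.values()) * sum(weights_j.values()))
    # Guard against empty images (assumption: treat them as having score 0).
    return numerator / denominator if denominator else 0.0


# Hypothetical example: two images with two visual words each.
image_i = {"w1": 0.5, "w2": 0.5}
image_j = {"v1": 0.5, "v2": 0.5}
pairs = [("w1", "v1", 1.0), ("w2", "v2", 0.5)]

print(first_factor(image_i, image_j, pairs))  # 0.375
```

In this example the weights in each image sum to 1, so the denominator is 1 and the result equals the numerator, 1.0 · 0.25 + 0.5 · 0.25 = 0.375. This is a partial quantity only and should not be read as the paper's final similarity score.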