Abstract: Online social networking techniques and large-scale multimedia systems are developing rapidly, which has not only brought great convenience to our daily life, but also generated, collected, and stored large-scale multimedia data. This trend has put forward higher requirements and greater challenges for massive multimedia data retrieval. In this paper, we investigate the problem of image similarity measurement, which is used in many applications. First, we propose the definition of similarity measurement of images and the related notions. Based on it, we present a novel basic method of similarity measurement named SMIN. To improve the performance of calculation, we propose a novel indexing structure called SMI Temp Index (SMII for short). In addition, we establish an index of potential similar visual words off-line to solve the problem that the index cannot be reused. Experimental evaluations on two real image datasets demonstrate that our solution outperforms state-of-the-art methods.

What problem does this paper attempt to address?

This paper attempts to solve the problem of image similarity measurement in large-scale multimedia data retrieval. With the rapid development of online social network technology and large-scale multimedia systems, a vast amount of multimedia data is generated, collected, and stored in people's daily lives, which poses higher requirements and greater challenges for large-scale multimedia data retrieval. In image retrieval and image matching in particular, measuring the similarity between images efficiently and accurately has become a key issue.

### Main contributions of the paper:

1. **Define image similarity measurement and related concepts**:
   - The authors first introduce the definition of image similarity measurement and related concepts, and design an image similarity calculation function.
2. **Propose a basic image similarity measurement method (SMIN)**:
   - To improve the performance of similarity measurement, the authors propose a basic method named SMIN and optimize it on this basis.
3. **Design new index structures (SMI Temp Index and PSMI)**:
   - To further optimize computational performance, the authors design a new index structure, SMI Temp Index (abbreviated as SMII), and solve the problem that the index cannot be reused by establishing an index of potentially similar visual words, thereby improving computational efficiency.
4. **Experimental verification**:
   - The authors carry out extensive experiments on two real image datasets, and the results show that their method is superior to existing state-of-the-art methods.

### Core content of the paper:

- **Problem definition**: By defining concepts such as image objects, visual word similarity, and similar visual word pairs, the paper establishes the theoretical basis for image similarity measurement.
- **Algorithm design**: A basic measurement method, SMIN, based on visual word similarity is proposed; optimized index structures (such as SMII and PSMI) reduce the computational complexity and improve efficiency.
- **Experimental evaluation**: Performance evaluations are carried out on two datasets, Flickr and ImageNet, to verify the effectiveness and superiority of the proposed method.

### Formula summary:

- **Image similarity measurement formula**:

\[ \text{Sim}_I(I_i(W_i), I_j(W_j))=\frac{\sum_{k = 1}^{l}\lambda_k\xi_i^k\xi_j^k}{\sqrt{\sum_{k = 1}^{m}\xi_i^k\sum_{k = 1}^{n}\xi_j^k}}\cdot\frac{\sum_{k = 1}^{l}\lambda_k^2\xi_i^k\xi_j^k+\sum_{k = l + 1}^{m}\xi_i^k\sum_{k = l + 1}^{n}\xi_j^k} \]

where \( m \) and \( n \) are the numbers of visual words in images \( I_i(W_i) \) and \( I_j(W_j) \) respectively, \( l \) is the number of similar visual word pairs, \( \Lambda=\{\lambda_1,\lambda_2,\dots,\lambda_l\} \) is the similarity set, and \( \Xi_i = \{\xi_i^1,\xi_i^2,\dots,\xi_i^l\} \) and \( \Xi_j=\{\xi_j^1,\xi_j^2,\dots,\xi_j^l\} \) are the sets of visual word weights in the two images respectively.

Through these improvements, the paper provides an efficient image similarity measurement method that can better cope with the challenges in large-scale multimedia data retrieval.
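
As a rough illustration of how the similarity formula combines pair similarities with visual word weights, the following Python sketch evaluates only its first factor, \( \sum_{k=1}^{l}\lambda_k\xi_i^k\xi_j^k / \sqrt{\sum_{k=1}^{m}\xi_i^k\sum_{k=1}^{n}\xi_j^k} \). The second factor is not implemented because its denominator is not given in the formula as printed. The function name `first_factor`, the list-based data layout, and the convention that the first \( l \) weights of each image are the ones belonging to similar visual word pairs are assumptions made for this example; they are not taken from the paper.

```python
import math
from typing import Sequence


def first_factor(
    lambdas: Sequence[float],
    xi_i: Sequence[float],
    xi_j: Sequence[float],
) -> float:
    """Evaluate only the first factor of the similarity formula.

    Assumed layout (hypothetical, for illustration):
      - xi_i holds the m visual word weights of image I_i,
      - xi_j holds the n visual word weights of image I_j,
      - the first l = len(lambdas) entries of xi_i and xi_j are aligned,
        i.e. entry k of each belongs to the k-th similar visual word pair,
      - all weights are non-negative.
    """
    l = len(lambdas)
    if l > min(len(xi_i), len(xi_j)):
        raise ValueError("more similar pairs than visual words in an image")

    # Numerator: sum over similar pairs of lambda_k * xi_i^k * xi_j^k.
    numerator = sum(
        lam * a * b for lam, a, b in zip(lambdas, xi_i[:l], xi_j[:l])
    )

    # Denominator: square root of the product of the two total weights.
    denominator = math.sqrt(sum(xi_i) * sum(xi_j))
    if denominator == 0:
        return 0.0  # no weight mass in at least one image
    return numerator / denominator


# Two similar pairs (l = 2); each image has three visual words (m = n = 3).
score = first_factor(
    lambdas=[1.0, 0.5],
    xi_i=[1.0, 1.0, 2.0],
    xi_j=[1.0, 2.0, 1.0],
)
print(score)  # 0.5
```

In this example the numerator is \( 1.0\cdot1\cdot1 + 0.5\cdot1\cdot2 = 2 \) and the denominator is \( \sqrt{4\cdot4} = 4 \), so the factor is 0.5. Visual words that take part in no similar pair contribute to the denominator only, which lowers the score.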

Efficient Multimedia Similarity Measurement Using Similar Elements

Research on Similarity Measurement in Multimedia Data Mining

A Similarity Metric in Image Searching.

Similarity measure research in multimedia information networks

Sparse Online Learning of Image Similarity

Combining similarity measures in content-based image retrieval guided by mutual information

Effective hashing for large-scale multimedia search.

An Efficient Similarity Search Algorithm for Web Video

On Image Similarity in the Context of Multimedia Social Computing

An Efficient Video Similarity Search Algorithm

Cross-media Retrieval by Intra-Media and Inter-Media Correlation Mining

Improving feature matching strategies for efficient image retrieval

Learning Based Neural Similarity Metrics for Multimedia Data Mining.

Semantic Discriminative Metric Learning for Image Similarity Measurement

Tri-space and Ranking Based Heterogeneous Similarity Measure for Cross-Media Retrieval.

Imagilar: A Real-Time Image Similarity Search System On Mobile Platform

Video Similarity Measurement

Measuring the Semantic Relatedness Between Images Using Social Tags.

SVS-JOIN: Efficient Spatial Visual Similarity Join over Multimedia Data

Efficient Similarity Search by Summarization in Large Video Database

Collaborative Similarity Metric Learning for Semantic Image Annotation and Retrieval.