A multiple instances approach to improving keyword spotting on historical Mongolian document images

Hongxi Wei, Guanglai Gao, Xiangdong Su
DOI: https://doi.org/10.1109/ICDAR.2015.7333738
2015-01-01
Abstract: In keyword spotting on historical Mongolian document images, performance varies considerably when the user provides different instance images for the same query keyword. This paper proposes an approach to this problem. The keyword spotting procedure is divided into two stages. The first stage generates multiple ranking lists for a query keyword, and the second stage merges these ranking lists into a final ranking. In the first stage, a ranking list for the query keyword is first returned by traditional image matching, and a number of instances of the query keyword are then obtained using pseudo-relevance feedback. Each instance of the query keyword then returns its own ranking list. In the second stage, the ranking lists from the multiple instances of the query keyword are combined using a data fusion technique. The final ranking is taken as the retrieval result for the query keyword. Experimental results show that the proposed approach can significantly improve keyword spotting performance on historical Mongolian document images.
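The two-stage procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the abstract does not name the image matching method or the data fusion technique, so `distance` is a hypothetical placeholder for any image matching score, and reciprocal rank fusion is used here only as one common fusion choice. The function names, the `num_instances` and `k` parameters, and the decision to include the original query's ranking list in the fusion are likewise illustrative.

```python
from collections import defaultdict


def rank_by_distance(query, candidates, distance):
    """Return candidate ids sorted by ascending distance to the query."""
    return sorted(candidates, key=lambda cid: distance(query, candidates[cid]))


def spot_keyword(query, candidates, distance, num_instances=3, k=60):
    """Two-stage keyword spotting sketch (assumed details, see lead-in).

    candidates maps a word-image id to its feature representation.
    """
    # Stage 1a: initial ranking list from plain image matching.
    initial = rank_by_distance(query, candidates, distance)

    # Stage 1b: pseudo-relevance feedback. The top-ranked results are
    # assumed relevant and reused as extra instances of the query keyword.
    instance_ids = initial[:num_instances]

    # Each instance returns its own ranking list. Keeping the original
    # query's list in the pool is an assumption of this sketch.
    ranking_lists = [initial]
    for cid in instance_ids:
        ranking_lists.append(rank_by_distance(candidates[cid], candidates, distance))

    # Stage 2: merge the lists. Reciprocal rank fusion is an assumed
    # stand-in for the unspecified data fusion technique.
    scores = defaultdict(float)
    for ranking in ranking_lists:
        for rank, cid in enumerate(ranking, start=1):
            scores[cid] += 1.0 / (k + rank)

    # Higher fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


if __name__ == "__main__":
    # Toy data: each "word image" is a single integer feature.
    toy_candidates = {"a": 10, "b": 13, "c": 50, "d": 11}
    toy_distance = lambda x, y: abs(x - y)
    print(spot_keyword(10, toy_candidates, toy_distance, num_instances=2))
```

In practice, `distance` would compare word-image features rather than integers, and the number of feedback instances trades off robustness against the risk of promoting a false match into the query set.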