Deep Features Representation of Word Image for Keyword Spotting in Historical Mongolian Document Images

Hongxi Wei, Jing Zhang, Hui Zhang
DOI: https://doi.org/10.1109/ICTAI50040.2020.00071
2020-01-01
Abstract: Because historical Mongolian documents are degraded, retrieving them is a challenging task. In the field of document image retrieval, keyword spotting is an alternative when optical character recognition is infeasible. The representation of word images plays a very important role in keyword spotting. In this paper, various convolutional neural networks are used to represent word images of historical Mongolian documents. Specifically, the activations of the fully-connected layer in a convolutional neural network are extracted and taken as the representation vectors of word images. Similarity can then be calculated between the representation vectors of word images. Several classic convolutional neural network structures are compared with each other, and the best one is determined. Furthermore, the convolutional neural network is compared with several baselines and the state-of-the-art method on a dataset of historical Mongolian documents. Experimental results indicate that the performance of the convolutional neural network is superior to these baselines and the state-of-the-art method.
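The retrieval step described in the abstract (compare representation vectors, then rank word images by similarity) can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: cosine similarity is used as the similarity measure, the function names `cosine_similarity` and `rank_word_images` are hypothetical, and the small 3-dimensional vectors are toy stand-ins for real fully-connected-layer activations, which would have far more dimensions.

```python
import numpy as np


def cosine_similarity(query, candidates):
    """Cosine similarity between one query vector and each row of candidates."""
    # Normalize to unit length so the dot product equals the cosine of the angle.
    q = query / np.linalg.norm(query)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return c @ q


def rank_word_images(query_vec, collection_vecs):
    """Return collection indices sorted from most to least similar, with scores."""
    scores = cosine_similarity(query_vec, collection_vecs)
    order = np.argsort(-scores)  # negate to sort in descending order
    return order, scores[order]


# Toy stand-ins for fully-connected-layer activations (hypothetical values).
collection = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
])
query = np.array([1.0, 0.0, 0.0])

order, scores = rank_word_images(query, collection)
for idx, score in zip(order, scores):
    print(f"word image {idx}: similarity {score:.4f}")
```

In a keyword spotting setting, the query vector would come from the word image of the keyword and the collection from all word images in the document set; the top-ranked entries are the retrieved candidates. Other measures, such as Euclidean distance, could be substituted in the same structure.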