Word Image Representation Based on Sequence to Sequence Model with Attention Mechanism for Out-of-Vocabulary Keyword Spotting.

Hongxi Wei, Yanke Kang, Hui Zhang
DOI: https://doi.org/10.1109/HPCC/SmartCity/DSS.2019.00309
2019-01-01
Abstract: For query-by-example keyword spotting, learning an efficient representation of word images is essential. However, in learning-based representation approaches, the vocabulary available at the training stage is often far smaller than the complete vocabulary of a language. Query keywords may therefore be words that never appear in the training set, so out-of-vocabulary (OOV) queries occur frequently in keyword spotting. This paper proposes a sequence-to-sequence model with an attention mechanism that generates representation vectors of word images to address the OOV problem. Similarities are then calculated between the representation vector of each word image and that of a given query keyword image, and the word images in a collection are ranked in descending order of similarity. Experimental results demonstrate that the proposed representation approach is suitable for OOV keyword spotting and outperforms various baseline and state-of-the-art methods.
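The retrieval step described in the abstract (score each word image against the query, then rank in descending order of similarity) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the similarity measure, so cosine similarity is an assumption here, and the function names `cosine_similarity` and `rank_word_images` as well as the toy vectors are hypothetical. In practice the vectors would be produced by the sequence-to-sequence model, which is not reproduced here.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros).

    Assumption: the abstract does not say which similarity measure is used.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_word_images(
    query_vec: Sequence[float],
    collection_vecs: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """Return (index, similarity) pairs sorted by descending similarity."""
    scored = [
        (i, cosine_similarity(query_vec, vec))
        for i, vec in enumerate(collection_vecs)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored


# Toy representation vectors (hypothetical values, for illustration only).
query = [1.0, 0.0, 1.0]
collection = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 2.0],  # same direction as the query
    [1.0, 1.0, 0.0],  # partially aligned with the query
]

ranking = rank_word_images(query, collection)
print([index for index, _ in ranking])  # prints [1, 2, 0]
```

Because ranking only compares representation vectors, the query word does not need to have appeared in the training vocabulary, which is what makes this setup applicable to OOV queries.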