Construction Algorithm of Sub-word Unit in Speech Retrieval

YANG Le, WU Ji, LV Ping
DOI: https://doi.org/10.3969/j.issn.1000-3428.2012.24.059
2012-01-01
Abstract: To address the Out-of-Vocabulary (OOV) problem in speech retrieval tasks, this paper presents an algorithm for constructing sub-word units based on Maximum Mutual Information and Minimum Description Length (MMI-MDL). The algorithm selects candidate pairs according to the mutual information of sub-word pairs, then uses the MDL criterion to decide whether each pair should be merged into a new sub-word. Once the sub-word set is obtained, words are mapped to sub-word sequences for retrieval. Experimental results show that the proposed method outperforms the MDL algorithm, achieving a 12.1% relative improvement in OOV recall rate.
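The two-stage procedure described in the abstract (rank pairs by mutual information, then accept a merge only if it shortens the description length) might look roughly like the sketch below. This is an illustration under stated assumptions, not the authors' implementation: the abstract gives no formulas, so the pointwise mutual information estimate, the two-part description length (data bits plus a lexicon cost of `bits_per_symbol` per base symbol), the use of characters as base units, and all function names and parameters (`build_subwords`, `top_k`, `max_merges`) are hypothetical.

```python
import math
from collections import Counter


def pair_pmi(corpus):
    """Pointwise mutual information of each adjacent unit pair (assumed estimator)."""
    unit_counts = Counter(u for seq in corpus for u in seq)
    pair_counts = Counter(p for seq in corpus for p in zip(seq, seq[1:]))
    n_units = sum(unit_counts.values())
    n_pairs = sum(pair_counts.values())
    pmi = {}
    for (a, b), c in pair_counts.items():
        p_ab = c / n_pairs
        p_a = unit_counts[a] / n_units
        p_b = unit_counts[b] / n_units
        pmi[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return pmi


def description_length(corpus, bits_per_symbol=5.0):
    """Two-part code length in bits: corpus under a unigram model plus lexicon.

    Assumption: the lexicon costs bits_per_symbol for every base symbol
    in every distinct unit. The paper's actual MDL formula may differ.
    """
    counts = Counter(u for seq in corpus for u in seq)
    total = sum(counts.values())
    data_bits = -sum(c * math.log2(c / total) for c in counts.values())
    lexicon_bits = sum(len(u) * bits_per_symbol for u in counts)
    return data_bits + lexicon_bits


def merge_pair(corpus, pair):
    """Replace every non-overlapping occurrence of the pair with one merged unit."""
    a, b = pair
    merged = []
    for seq in corpus:
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
                out.append(a + b)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        merged.append(out)
    return merged


def build_subwords(words, max_merges=50, top_k=5):
    """Greedy loop: take the top_k pairs by PMI, keep the merge that
    gives the shortest description length, and stop when no candidate
    reduces it."""
    corpus = [list(w) for w in words]  # assumption: characters as base units
    for _ in range(max_merges):
        pmi = pair_pmi(corpus)
        if not pmi:
            break
        candidates = sorted(pmi, key=pmi.get, reverse=True)[:top_k]
        current = description_length(corpus)
        best_dl, best_pair = min(
            ((description_length(merge_pair(corpus, p)), p) for p in candidates),
            key=lambda t: t[0],
        )
        if best_dl >= current:
            break
        corpus = merge_pair(corpus, best_pair)
    return corpus
```

Calling `build_subwords(["speech", "speaker", "speed"])` returns one sub-word sequence per input word; concatenating each sequence recovers the original word, which is the word-to-sub-word mapping that retrieval would then index. In a real speech retrieval system the base units would more plausibly be phonemes or syllables than characters.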