An improved training procedure for connected-digit recognition

L. Rabiner, A. Bergh, J. Wilpon
DOI: https://doi.org/10.1002/J.1538-7305.1982.TB04327.X
1982-07-08
Bell Labs Technical Journal
Abstract: The “conventional” way of obtaining word reference patterns for connected-word recognition systems is to use isolated-word patterns, and to rely on the dynamics of the matching algorithm to account for the differences in connected speech. Connected-word recognition, based on such an approach, tends to become unreliable (high error rates) when the talking rate becomes grossly incommensurate with the rate at which the isolated-word training patterns were spoken. To alleviate this problem, an improved training procedure for connected-word (digit) recognition is proposed in which word reference patterns from isolated occurrences of the vocabulary words are combined with word reference patterns extracted from within connected-word strings to give a robust, reliable word recognizer over all normal speaking rates. The manner in which the embedded-word patterns are extracted was carefully studied, and it is shown that the robust training procedure of Rabiner and Wilpon can be used to give reliable patterns for the embedded, as well as the isolated, patterns. In a test of the system (as a speaker-trained, connected-digit recognizer) with 18 talkers, each speaking 40 different strings (of variable length from 2 to 5 digits), median string error rates of 0 and 2.5 percent were obtained for deliberately spoken strings and naturally spoken strings, respectively, when the string length was known. Using only isolated-word training tokens, the comparable error rates were 10 and 11.3 percent, respectively.
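The central idea of the abstract, pooling isolated-word and embedded-word reference patterns into one reference set per word, can be illustrated with a toy example. This is only a minimal sketch under stated assumptions, not the authors' system: it uses a plain dynamic time warping (DTW) distance on invented one-dimensional sequences, and the names `dtw_distance`, `recognize`, `isolated`, `embedded`, and `combined` are hypothetical. The paper's actual features, matching algorithm, and embedded-pattern extraction are not described in the abstract and are not reproduced here.

```python
import math


def dtw_distance(a, b):
    """Plain DTW distance between two 1-D sequences (illustrative only)."""
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated distance aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def recognize(token, references):
    """Return the word whose closest reference pattern has the smallest DTW distance."""
    best_word, best_dist = None, math.inf
    for word, templates in references.items():
        for template in templates:
            dist = dtw_distance(token, template)
            if dist < best_dist:
                best_word, best_dist = word, dist
    return best_word


# Invented toy patterns: isolated words are "spoken" slowly (longer sequences),
# embedded words are shorter, as they might be inside a connected string.
isolated = {
    "one": [[1, 1, 2, 2, 3, 3, 2, 2, 1, 1]],
    "two": [[3, 3, 2, 2, 1, 1, 2, 2, 3, 3]],
}
embedded = {
    "one": [[1, 2, 3, 2, 1]],
    "two": [[3, 2, 1, 2, 3]],
}

# Combined reference set: every word keeps both kinds of pattern.
combined = {word: isolated[word] + embedded[word] for word in isolated}
```

Calling `recognize([3, 2, 1, 2, 3], combined)` matches the input against both the isolated and the embedded pattern of each word and returns the label of the closest one. In this toy setting the pooled set simply offers more candidate templates per word; whether and how much that helps on real speech is what the paper's experiments measure.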