Tone Recognition Using Lifters and CTC

Loren Lugosch, Vikrant Singh Tomar
DOI: https://doi.org/10.48550/arXiv.1807.02465
2018-07-07
Abstract: In this paper, we present a new method for recognizing tones in continuous speech for tonal languages. The method works by converting the speech signal to a cepstrogram, extracting a sequence of cepstral features using a convolutional neural network, and predicting the underlying sequence of tones using a connectionist temporal classification (CTC) network. The performance of the proposed method is evaluated on a freely available Mandarin Chinese speech corpus, AISHELL-1, and is shown to outperform the existing techniques in the literature in terms of tone error rate (TER).
Audio and Speech Processing, Sound
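
The last two stages named in the abstract, turning per-frame CTC outputs into a tone sequence and scoring it by TER, can be illustrated with a minimal sketch. This is not the authors' code. It assumes greedy (best-path) decoding with label 0 as the CTC blank, and it assumes TER is computed like word error rate, as edit distance divided by reference length; the abstract does not spell out either detail. The function names are hypothetical.

```python
def ctc_greedy_collapse(frame_labels, blank=0):
    """Collapse a per-frame label path: merge repeated labels, then drop blanks.

    Assumption: label 0 is the CTC blank; tones use labels 1 and up.
    """
    collapsed = []
    previous = None
    for label in frame_labels:
        # Emit a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != previous and label != blank:
            collapsed.append(label)
        previous = label
    return collapsed


def edit_distance(reference, hypothesis):
    """Levenshtein distance between two label sequences."""
    previous_row = list(range(len(hypothesis) + 1))
    for i, ref_label in enumerate(reference, start=1):
        row = [i]
        for j, hyp_label in enumerate(hypothesis, start=1):
            row.append(min(
                previous_row[j] + 1,                               # deletion
                row[j - 1] + 1,                                    # insertion
                previous_row[j - 1] + (ref_label != hyp_label),    # substitution
            ))
        previous_row = row
    return previous_row[-1]


def tone_error_rate(reference, hypothesis):
    """Assumed TER definition: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference tone sequence must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, the frame path `[0, 3, 3, 0, 3, 1, 1, 0]` collapses to the tone sequence `[3, 3, 1]`: the blank between the two runs of tone 3 is what keeps them as separate tones instead of merging them.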