Learning Model-Based F0 Production Through Goal-Directed Babbling

Hao Liu, Yi Xu
DOI: https://doi.org/10.1109/iscslp.2014.6936720
2014-01-01
Abstract: How surface acoustics can be mapped to underlying articulatory commands is a central yet unsolved issue in speech acquisition. Previously, stochastic optimization has been shown to be proficient in learning the underlying pitch targets of the quantitative Target Approximation (qTA) model. The present study tested whether it is possible to develop an acoustic-to-articulatory inverse model for qTA by taking advantage of a recent advance in inverse kinematics learning in the field of developmental robotics, known as goal babbling. By treating the traditionally separate babbling and imitation stages of speech acquisition as a unified, acoustic goal-directed babbling process, the inverse model, implemented as a multilayer perceptron (MLP), can be bootstrapped rapidly without having to explore the whole articulatory command space. The MLP was trained in online mode with self-generated examples obtained after every production by the host learner. The results show that with this novel learning paradigm the inverse model improves progressively, and that the underlying pitch targets can be obtained by querying the mature inverse model. Our findings also demonstrate that qTA is an intrinsically robust F0 production model that can be operated by various learning regimens.
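The learning loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The qTA contour uses the standard third-order critically damped form, with zero initial velocity and acceleration assumed. The parameter ranges, syllable duration, network size, learning rate, noise level, and the helper names (`qta_f0`, `denorm`, `InverseMLP`, `goal_babble`) are all assumptions chosen for illustration. The goal contours are synthesised from hidden commands as a stand-in for observed adult F0.

```python
import numpy as np

T = np.linspace(0.0, 0.2, 8)          # sample times in one 200 ms syllable (assumed)
LO = np.array([-50.0, -10.0, 20.0])   # assumed lower bounds for (m, b, lam)
HI = np.array([50.0, 10.0, 80.0])     # assumed upper bounds for (m, b, lam)
SCALE = 10.0                          # assumed input scaling for the network (semitones)

def qta_f0(cmd, f0_start=0.0):
    """qTA contour for one syllable: target m*t + b approached at rate lam."""
    m, b, lam = cmd
    c1 = f0_start - b                              # from f0(0) = f0_start
    c2 = c1 * lam - m                              # from zero initial velocity
    c3 = (2.0 * c2 * lam - c1 * lam ** 2) / 2.0    # from zero initial acceleration
    return m * T + b + (c1 + c2 * T + c3 * T ** 2) * np.exp(-lam * T)

def denorm(u):
    """Map a normalised command in [-1, 1]^3 to physical (m, b, lam)."""
    return LO + (np.clip(u, -1.0, 1.0) + 1.0) * 0.5 * (HI - LO)

class InverseMLP:
    """One-hidden-layer MLP mapping an F0 contour to a normalised command."""
    def __init__(self, rng, n_hid=16, lr=0.05):
        self.W1 = rng.normal(0.0, 0.3, (n_hid, len(T)))
        self.b1 = np.zeros(n_hid)
        self.W2 = rng.normal(0.0, 0.05, (3, n_hid))
        self.b2 = np.zeros(3)
        self.lr = lr

    def predict(self, contour):
        self.x = contour / SCALE
        self.h = np.tanh(self.W1 @ self.x + self.b1)
        return self.W2 @ self.h + self.b2

    def train_step(self, contour, cmd):
        """One online SGD step on squared error for a single (contour, command) pair."""
        err = self.predict(contour) - cmd
        dh = (self.W2.T @ err) * (1.0 - self.h ** 2)   # backprop through tanh
        self.W2 -= self.lr * np.outer(err, self.h)
        self.b2 -= self.lr * err
        self.W1 -= self.lr * np.outer(dh, self.x)
        self.b1 -= self.lr * dh

def goal_babble(n_steps=3000, noise=0.2, seed=0, eval_every=500):
    rng = np.random.default_rng(seed)
    model = InverseMLP(rng)
    # Stand-in acoustic goals (assumption): contours produced by hidden random commands.
    goals = np.array([qta_f0(denorm(rng.uniform(-1, 1, 3))) for _ in range(50)])

    def mean_error():
        # RMS distance between each goal and the contour the learner produces for it.
        return float(np.mean([np.sqrt(np.mean((qta_f0(denorm(model.predict(g))) - g) ** 2))
                              for g in goals]))

    errors = [mean_error()]
    for step in range(n_steps):
        goal = goals[rng.integers(len(goals))]
        # Try to reach the goal: query the inverse model, then add exploratory noise.
        u = np.clip(model.predict(goal) + rng.normal(0.0, noise, 3), -1.0, 1.0)
        y = qta_f0(denorm(u))      # what the learner actually produced
        model.train_step(y, u)     # self-generated example: (produced contour, command)
        if (step + 1) % eval_every == 0:
            errors.append(mean_error())
    return model, errors

if __name__ == "__main__":
    model, errors = goal_babble()
    print([round(e, 3) for e in errors])
```

In this sketch every training pair comes from the learner's own production, so the inverse model is only ever trained on commands it actually tried while aiming at acoustic goals, rather than on a uniform sweep of the command space. Querying `model.predict` on a goal contour after training yields the learner's estimate of the underlying target parameters.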