Active Appearance Model Based Contour Extraction for MRI Images of Human Tongue

Zhi-cheng Liu, Qi-long Sun, Jian-guo Wei
DOI: https://doi.org/10.2991/masta-19.2019.24
2019-01-01
Abstract: In this article, we present the results of automatically extracting speech articulator contours from Magnetic Resonance Imaging (MRI) movies using the Active Appearance Model. An Active Appearance Model based framework is proposed to deal with the highly nonlinear deformation of the articulators during articulation, and it shows an advantage in tracking articulator shapes in noisy MRI images. Vocal tract contours were extracted from MRI movies of Chinese subjects. The performance of the framework was evaluated by comparing manually labeled contours with automatically extracted ones. The average error is around 2.1 pixels.

Introduction

Speech is one of the most important functions of human communication. However, the mechanism of speech production is far from fully understood. The morphological and dynamic aspects of the speech organs are essential for understanding speech dynamics. Advanced imaging and image processing technologies are therefore important for this research field. Magnetic Resonance Imaging (MRI) can produce high-resolution images of the human articulators. This makes MRI one of the most promising tools for speech research, and it has been widely used to study speech production [1-3].

A number of MRI databases of human speech organs are available for various purposes. A necessary step in using such databases, however, is the successful extraction of the desired speech organs from the images. A large variety of algorithms have been developed over the last few decades to handle this issue [4-6]. They can mainly be categorized into data-driven approaches, such as snake-like methods, and model-driven approaches, which use prior knowledge to complete the task. Both categories have their own pros and cons. In a data-driven approach, each image has to be given an initial shape before the contour is extracted, so the process cannot be fully automatic.
A model-based approach has to be trained on a training set, which must be labeled manually beforehand. The Active Appearance Model (AAM) is one such model-based approach and has shown great promise for automatically tracking objects in images. Since an MRI speech database contains a large number of images recording articulatory movements, it is worthwhile to label a small training set in order to extract the shapes from the remaining images automatically.

The AAM was developed by Cootes et al. [7-10] and is a statistical point distribution model (PDM). It has demonstrated its capability for image segmentation [11]. It can automatically learn the parameters of the PDM from sets of corresponding landmarks, and it incorporates both shape and boundary gray-level information. An AAM describes the image appearance and shape of the object of interest by building a statistical shape-appearance model from a training set. When applied to image interpretation or segmentation, the AAM minimizes the difference between the image synthesized from the model and an unseen image by tuning the model parameters.

The AAM has proved highly robust for segmenting cardiac MRI images and for extracting facial features. Articulators such as the tongue, soft palate, and lips, however, are far more deformable than the face and the heart. In this research we adopt the AAM as a means of extracting tongue and palate contours from MRI image sequences, as well as the contours of the upper and lower lips in profile view.

International Conference on Modeling, Analysis, Simulation Technologies and Applications (MASTA 2019). Copyright © 2019, the Authors. Published by Atlantis Press. This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/). Advances in Intelligent Systems Research, volume 168.
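The statistical point distribution model mentioned in the introduction can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the shape part of an AAM (no appearance model and no fitting to an image), it assumes the landmark shapes have already been aligned, and the function names (`build_shape_model`, `project`, `synthesize`), the 95% variance threshold, and the synthetic half-ellipse contours are all hypothetical choices made for illustration.

```python
import numpy as np


def build_shape_model(shapes, variance_kept=0.95):
    """Fit a PCA point distribution model to pre-aligned landmark shapes.

    shapes: array of shape (n_samples, n_points, 2); alignment is assumed done.
    Returns the mean shape vector, the retained modes, and their variances.
    """
    X = np.asarray(shapes, dtype=float).reshape(len(shapes), -1)
    mean = X.mean(axis=0)
    # SVD of the centered data yields the principal modes of shape variation.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    variances = s ** 2 / (len(X) - 1)
    cumulative = np.cumsum(variances) / variances.sum()
    # Smallest number of modes whose cumulative variance reaches the target.
    k = min(int(np.searchsorted(cumulative, variance_kept)) + 1, len(variances))
    return mean, vt[:k].T, variances[:k]


def project(mean, modes, shape):
    """Shape parameters b of a landmark set under the model."""
    return modes.T @ (np.asarray(shape, dtype=float).ravel() - mean)


def synthesize(mean, modes, b):
    """Landmarks generated by the model: x = mean + modes @ b."""
    return (mean + modes @ b).reshape(-1, 2)


# Synthetic training set (an assumption, not MRI data): half-ellipse
# contours of 12 landmarks whose height varies from sample to sample.
rng = np.random.default_rng(0)
theta = np.linspace(0.0, np.pi, 12)
base = np.stack([np.cos(theta), np.sin(theta)], axis=1)
heights = rng.uniform(0.8, 1.2, size=20)
shapes = np.stack([base * np.array([1.0, h]) for h in heights])

mean, modes, variances = build_shape_model(shapes)
b = project(mean, modes, shapes[0])
reconstruction = synthesize(mean, modes, b)
print("modes kept:", modes.shape[1])
print("max reconstruction error:", np.abs(reconstruction - shapes[0]).max())
```

Because the toy contours differ only in height, a single mode is enough to describe them. Real articulator contours would need more modes and a prior alignment step.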