Abstract: In this article, we present results on the automatic extraction of speech articulator contours from Magnetic Resonance Imaging (MRI) movies using the Active Appearance Model (AAM). An AAM-based framework is proposed to deal with the highly nonlinear deformation of the articulators during articulation, and it shows an advantage in tracking articulator shapes in noisy MRI images. Vocal tract contours were extracted from MRI movies of Chinese subjects. The performance of the framework was evaluated by comparing manually labeled contours with automatically extracted ones; the average error is around 2.1 pixels.

Introduction

Speech is one of the most important means of human communication. However, the mechanism of speech production is far from fully understood. The morphological and dynamic aspects of the speech organs are essential for understanding speech dynamics, so advanced imaging and image processing technologies are important for this research field. Magnetic Resonance Imaging (MRI) can produce high-resolution images of the human articulators. This capability makes MRI currently one of the most promising tools for speech research, and it has been widely used in the study of speech production [1-3].

A number of MRI databases of human speech organs are available for various purposes. A necessary step in using such databases, however, is the successful extraction of the desired speech organs from the images. A large variety of algorithms have been developed over the last few decades to handle this issue [4-6]. They can mainly be categorized as data-driven approaches, such as snake-like methods, and model-driven approaches, which use prior knowledge to complete the task. Both categories have their own pros and cons. In a data-driven approach, each image has to be given an initial shape before the contour is extracted, so the procedure cannot be fully automatic.
A model-based approach has to be trained on a training set, which must be labeled manually beforehand. The Active Appearance Model (AAM) is one such model-based approach and has shown great promise for automatically tracking objects in images. Because an MRI speech database contains a large number of images recording articulatory movements, it is worthwhile to label a small training set in order to extract the shapes automatically from the remaining images.

The AAM was developed by Cootes et al. [7-10] and is a statistical point distribution model (PDM). It has demonstrated its capability for image segmentation [11]. It is able to learn the parameters of the PDM automatically from sets of corresponding landmarks, and it incorporates both shape and boundary gray-level information. An AAM describes the image appearance and shape of the object of interest through a statistical shape-appearance model obtained from a training set. When applied to image interpretation or segmentation, the AAM minimizes the difference between the image synthesized from the model and an unseen image by tuning the model parameters. The AAM has demonstrated high robustness in the segmentation of cardiac MRI images and in facial feature extraction. Articulators such as the tongue, soft palate and lips, however, are more highly deformable than the face and the heart. In this research we adopt the AAM as a means of extracting tongue and palate contours from MRI image sequences, as well as the contours of the upper and lower lips in profile view.

International Conference on Modeling, Analysis, Simulation Technologies and Applications (MASTA 2019). Copyright © 2019, the Authors. Published by Atlantis Press. This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/). Advances in Intelligent Systems Research, volume 168.
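The shape part of the model described in the introduction can be illustrated with a small sketch. It is not the authors' implementation: it covers only the point distribution model (no gray-level texture and no AAM search), it skips the shape alignment step that normally precedes training, and it keeps a single mode of variation found by power iteration rather than a full principal component analysis. All function names are hypothetical. The error measure at the end assumes that the reported average error is the mean Euclidean distance between corresponding landmarks, which the text does not state explicitly.

```python
import math


def flatten(shape):
    """Turn [(x1, y1), (x2, y2), ...] into [x1, y1, x2, y2, ...]."""
    return [coord for point in shape for coord in point]


def unflatten(vector):
    """Inverse of flatten: rebuild the list of (x, y) landmarks."""
    return [(vector[i], vector[i + 1]) for i in range(0, len(vector), 2)]


def fit_first_mode(shapes, iterations=100):
    """Return the mean shape vector and the leading mode of variation.

    The mode is the dominant eigenvector of the (unnormalized) covariance
    of the training shapes, estimated by power iteration. Assumption: the
    start vector is not orthogonal to that eigenvector.
    """
    vectors = [flatten(s) for s in shapes]
    count = len(vectors)
    mean = [sum(column) / count for column in zip(*vectors)]
    centered = [[v - m for v, m in zip(vec, mean)] for vec in vectors]

    dim = len(mean)
    mode = [float(k + 1) for k in range(dim)]
    norm = math.sqrt(sum(v * v for v in mode))
    mode = [v / norm for v in mode]

    for _ in range(iterations):
        # Covariance times mode, computed as sum_i (d_i . mode) * d_i.
        updated = [0.0] * dim
        for deviation in centered:
            dot = sum(a * b for a, b in zip(deviation, mode))
            for k in range(dim):
                updated[k] += dot * deviation[k]
        norm = math.sqrt(sum(v * v for v in updated))
        if norm == 0.0:  # no variation left along the current direction
            break
        mode = [v / norm for v in updated]
    return mean, mode


def project(shape, mean, mode):
    """Shape parameter b such that shape is approximately mean + b * mode."""
    return sum((x - m) * p for x, m, p in zip(flatten(shape), mean, mode))


def synthesize(mean, mode, b):
    """Generate a shape from the model for a given parameter value b."""
    return unflatten([m + b * p for m, p in zip(mean, mode)])


def mean_point_error(contour_a, contour_b):
    """Mean Euclidean distance (in pixels) between corresponding landmarks."""
    distances = [math.hypot(ax - bx, ay - by)
                 for (ax, ay), (bx, by) in zip(contour_a, contour_b)]
    return sum(distances) / len(distances)
```

In use, `fit_first_mode` would be trained on the manually labeled contours, `project` and `synthesize` would map a contour to and from the model parameter, and `mean_point_error` would compare an automatically extracted contour with its manual label. A full AAM additionally models the gray-level appearance and tunes the parameters to minimize the difference between the synthesized and the observed image.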

Tongue Shape Synthesis Based on Active Shape Model.

Assessment of velopharyngeal function with dual-planar high-resolution real-time spiral dynamic MRI.

Active Appearance Model Based Contour Extraction for MRI Images of Human Tongue

A Study of Mandarin Chinese Using X-Ray and MRI

A Multilinear Tongue Model Derived from Speech Related MRI Data of the Human Vocal Tract

A Speech-Driven 3-D Tongue Model with Realistic Movement in Mandarin Chinese.

A mass-spring tongue model with efficient collision detection and response during speech

An Improved 3D Geometric Tongue Model

Geometrical Analysis of the Tongue Muscles Based on MRI and Functional Modeling of the Tongue

Tongue Shape Variation Model for Simulating Mandarin Chinese Articulation.

A novel 3D geometric articulatory model

Extraction of Tongue Contour in Real-Time Magnetic Resonance Imaging Sequences

Improvement of a Physiological Articulatory Model for Synthesis of Vowel Sequences

MRI-based 3D Model of Spoken Lips

Speech production of vowel sequences using a physiological articulatory model

MRI Observation of Dynamic Articulatory Movements Using a Synchronized Sampling Method

Progress in animation of an EMA-controlled tongue model for acoustic-visual speech synthesis

The Modeling of Tongue Tip in Standard Chinese Using MRI

MRI Analyses of the Effects of Relative Tongue Size on Individual Articulatory Differences

Acoustic VR in the Mouth: A Real-Time Speech-Driven Visual Tongue System.

A Novel Method for Constructing 3d Geometric Articulatory Models