Utterance-based proposed spot diagnostic system of vocal tract malfunction
Z T Fayed
Abstract: It is not surprising that speech recognition by machine has received a great deal of attention through the techniques of artificial intelligence (AI), such as expert systems that support decisions in various fields. One proposal based on the expert system paradigm is to diagnose a malfunction of the vocal tract while the speaker produces utterances recommended for this purpose. These utterances are chosen according to the position and manner of articulation. Four important features in the acoustic analysis of speech are the fundamental frequency (F0), the formants (F1-F5), the amplitude, and the harmonic structure (tone vs. noise). The strongest candidate features for the proposed diagnostic system are the fundamental frequency and/or the formants, which are considered the core of the intended work. The throat, mouth, and nose, acting as resonating chambers, support this choice: when they depart from their normal anatomical and/or physical function, they negatively affect the range of these frequencies. Discrete speech (isolated words), being the most recognizable kind of utterance, is used in order to set aside the difficulties of both connected and continuous word-based recognition. In a diagnostic system, generality is an essential issue; the recognizer should therefore be speaker-independent, which requires more effort during the system training phase. The paper presents a rough (initial) spotting diagnostic system to serve as the base for a future detailed system that addresses specific defects of particular organs of the vocal tract. Arabic vowels as well as some consonants are the target, for adult speakers aged 20-25 years. Different recommended utterances, segmented from the Arabic alphabet, focus on various points along the vocal apparatus. Speech waveforms, together with the associated fundamental frequencies and formants, have been considered in the normal and the corresponding abnormal cases.
The deviations that appear in the frequency pattern indicate the defective articulator that is dominant in producing the intended utterance. The results illustrated would be the base for designing a dedicated hardware unit, which may be of use to physicians interested in this field.

Human beings communicate with one another primarily by speech, and speech brings human beings closer together. Speech sounds travel through the air at about 330 meters per second, whereas impulses travel along nerve pathways in the body at about 60 meters per second. The time it takes for a spoken word to be heard and understood by a listener may be shorter than the time it takes for a neural message to travel to the brain [1]. Speech serves not only human communication; it also has many applications in different fields. Some of these applications are machine control systems based on spoken commands, speech-to-text and text-to-speech systems, natural language-based systems, and medical diagnosis systems for vocal tract malfunction, the subject of this paper. One difficulty of speech-based systems is that not everyone speaks the same way, even those who supposedly speak the same language. The term dialect refers to this variability, which is even more pronounced when different languages are spoken. The above discussion concerns social and emotional variability, which can be modified with reasonable effort. The greater variabilities, which are difficult to manipulate, arise from inherited and anatomical aspects. It is therefore worthwhile to propose a methodology that can be used globally in spite of different social communities.
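The frequency-based spotting idea outlined in the abstract can be illustrated with a small sketch. The paper does not state how the fundamental frequency is extracted, so the autocorrelation method below is an assumption chosen for illustration, and the function names (`synth_tone`, `estimate_f0`, `is_outside_range`) are hypothetical. The "normal" F0 range is a placeholder value, not a figure taken from this work.

```python
import math


def synth_tone(freq_hz, sample_rate, n_samples):
    """Generate a pure sine tone as a stand-in for a voiced speech frame."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]


def estimate_f0(samples, sample_rate, f_min=75.0, f_max=400.0):
    """Estimate F0 (Hz) of one frame from the peak of its autocorrelation.

    Assumption: the frame is voiced and longer than one pitch period.
    Returns None if no positive autocorrelation peak is found.
    """
    n = len(samples)
    if n < 2:
        return None
    mean = sum(samples) / n
    x = [s - mean for s in samples]          # remove any DC offset
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(int(sample_rate / f_min), n - 1)
    best_lag, best_score = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        terms = n - lag
        # Average product of the frame with a copy of itself shifted by `lag`.
        score = sum(x[i] * x[i + lag] for i in range(terms)) / terms
        if score > best_score:
            best_lag, best_score = lag, score
    if best_lag is None:
        return None
    return sample_rate / best_lag


def is_outside_range(f0_hz, normal_range):
    """Flag an F0 value that falls outside a reference (low, high) range."""
    low, high = normal_range
    return f0_hz < low or f0_hz > high


if __name__ == "__main__":
    SAMPLE_RATE = 8000                       # Hz, illustrative choice
    NORMAL_F0_RANGE = (85.0, 180.0)          # placeholder, not from the paper
    frame = synth_tone(125.0, SAMPLE_RATE, 512)
    f0 = estimate_f0(frame, SAMPLE_RATE)
    print("estimated F0 (Hz):", f0)
    print("flagged as deviating:", is_outside_range(f0, NORMAL_F0_RANGE))
```

In a real system the reference range would have to come from recordings of normal speakers in the target group (here, adults aged 20-25 producing the recommended Arabic utterances), and the same comparison would be repeated for the formants.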