Abstract: Recent gains in computing capability and sensing functionality on mobile devices have made speech recognition ubiquitous. Traditional speech recognition records acoustic signals or visual images to interpret speech. However, the acoustic-based scheme has several drawbacks. It is easily affected by environmental noise when users are in a factory or market, and it cannot be used in places where people need to be quiet, such as a library. In particular, it is not suitable for people with speaking or hearing difficulties. The visual-based approach, in turn, is sensitive to light conditions and performs poorly in dark areas. As a result, it is necessary to provide a new human-computer interaction channel to assist speech recognition. This paper presents SilentTalk, a non-invasive lip reading system based on the ultrasonic Doppler effect. The main idea is to generate ultrasonic signals from a mobile phone, then capture the reflections and analyze the fine-grained frequency shift caused by mouth movements. A Frequency Shift Detection Model (FSDM) is proposed to quantify the correlation between frequency variations and the mouth movements that form different syllables. SilentTalk then applies a Continuous Lip Reading Model (CLRM) on top of FSDM to realize continuous lip reading. Based on the Markov assumption, CLRM combines pronunciation rules and context knowledge to connect isolated syllables into words and sentences. Experiments show that SilentTalk can identify 12 basic mouth motions in English with up to 95.4% accuracy. The system can also recognize short sentences of up to six words with an average accuracy of 74.8%.
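To make the Doppler idea concrete, the following minimal sketch estimates the frequency shift that a moving reflector (such as the lips) would impose on an ultrasonic carrier, and then recovers that shift from a synthesized echo with a narrowband DFT scan. It is not the paper's FSDM. The 20 kHz carrier, 44.1 kHz sample rate, 0.1 m/s mouth speed, 343 m/s speed of sound, and all function names are assumptions chosen for illustration; the abstract does not state them. The shift uses the standard small-velocity approximation for a co-located speaker and microphone, Δf ≈ 2·v·f0/c.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0   # m/s in air at about 20 °C (assumed)
CARRIER_HZ = 20000.0     # assumed ultrasonic carrier, not taken from the paper
SAMPLE_RATE = 44100      # assumed phone audio sample rate
N_SAMPLES = 4410         # 0.1 s analysis window (assumed)


def doppler_shift(carrier_hz, velocity_mps, c=SPEED_OF_SOUND):
    """Approximate echo shift for a reflector approaching a co-located
    speaker/microphone at velocity_mps, valid when velocity_mps << c."""
    return 2.0 * velocity_mps * carrier_hz / c


def synth_tone(freq_hz, sample_rate, n_samples):
    """Synthesize a pure tone standing in for the reflected signal."""
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def peak_frequency(samples, sample_rate, candidates_hz):
    """Return the candidate frequency with the largest DFT magnitude."""
    best_freq, best_mag = None, -1.0
    for freq in candidates_hz:
        acc = sum(x * cmath.exp(-2j * math.pi * freq * n / sample_rate)
                  for n, x in enumerate(samples))
        if abs(acc) > best_mag:
            best_freq, best_mag = freq, abs(acc)
    return best_freq


# Hypothetical mouth movement toward the phone at 0.1 m/s.
true_shift = doppler_shift(CARRIER_HZ, 0.1)
echo = synth_tone(CARRIER_HZ + true_shift, SAMPLE_RATE, N_SAMPLES)

# Scan +/- 50 Hz around the carrier in 1 Hz steps.
candidates = [CARRIER_HZ + d for d in range(-50, 51)]
estimated_shift = peak_frequency(echo, SAMPLE_RATE, candidates) - CARRIER_HZ

print(f"expected shift:  {true_shift:.2f} Hz")
print(f"estimated shift: {estimated_shift:.2f} Hz")
```

Under these assumed values the shift is only on the order of ten hertz, which is why the abstract stresses fine-grained frequency analysis. A real echo would also contain the much stronger unshifted carrier and noise, which this sketch omits.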

A study on improved hidden Markov models and applications to speech recognition

Dynamic hand gesture recognition using hidden Markov models

Chinese Vowels Recognition Method Based on Non-Homogeneous Hidden Markov Model

Exploring More Representative States of Hidden Markov Model in Optical Character Recognition: A Clustering-Based Model Pre-Training Approach

Improvement of hidden Markov model (HMM) for speech recognition

A tutorial on hidden Markov models and selected applications in speech recognition

SilentTalk: Lip Reading Through Ultrasonic Sensing on Mobile Phones

Real-time Lip Synchronization Based on Hidden Markov Models

The Hidden Markov Model of co-articulation and its application to the continuous speech recognition

Audio-Visual System for Robust Speaker Recognition

Self-adaptive Design of Hidden Markov Models

Speech Recognition Algorithm Based on Neural Network and Hidden Markov Model

Distinguishing Homophenes Using Multi-Head Visual-Audio Memory for Lip Reading

Recognition of Sequence Lip Images and Its Application

Application of deep learning in Mandarin Chinese lip-reading recognition

HMM-based Lip Reading with Stingy Residual 3D Convolution

Fast Likelihood Computation Method Using Block-Diagonal Covariance Matrices in Hidden Markov Model

A hidden Markov optimization model for processing and recognition of English speech feature signals

Landmark-Guided Cross-Speaker Lip Reading with Mutual Information Regularization

Early Facial Expression Recognition Using Hidden Markov Models

Lip Assistant: Visualize Speech For Hearing Impaired People In Multimedia Services