Abstract: This paper explains the theoretical foundations of multimodal discourse analysis as applied to speaking instructional design. It explores the application of multimodal theory to elementary English speaking classes through a teaching design, implements that design, and evaluates the resulting teaching practice through classroom observation, a questionnaire survey, and interviews, taking into account both the teaching evaluation and the implementation effect. The paper also summarizes the findings and shortcomings of the study. The teaching design and its implementation indicate clear advantages of multimodal teaching: combined with modern teaching technology, it can create more realistic communicative situations in the classroom, gather and present resources and information in various modes, and ensure rich and diverse language input. Students receive a variety of sensory stimuli in the classroom, which deepens their memory and experience of the language, makes classroom teaching more interesting, and improves their participation and motivation. Based on multimodal theory, the author designed a multimodal teaching framework for a semester-long speaking course, offered for reference. Fuzzy measures were constructed from subsets of language segments containing 10 phonemes belonging to the same HDP set. Finally, linguistic scores are given by the Sugeno integral model based on the plausibility of the system and the fuzzy measures. Experimental results based on Sphinx-4 show that the evaluation model yields plausible and stable evaluation results for the 3 test sets at an average correct phoneme recognition rate of 84.7%.

Design of the Oral English Teaching Method Based on Multimodal Feature Fusion

Sentiment Analysis Using Deep Robust Complementary Fusion of Multi-Features and Multi-Modalities

Audio-Visual Speech Enhancement with Deep Multi-modality Fusion

An Automatic Assessment Method for Spoken English Based on Multimodal Feature Fusion

Multi-Feature Intelligent Oral English Error Correction Based on Few-Shot Learning Technology

Multi-Modal Fusion Emotion Recognition Method of Speech Expression Based on Deep Learning

Multimodal emotion recognition from facial expression and speech based on feature fusion

Complementary Fusion of Multi-Features and Multi-Modalities in Sentiment Analysis

A Deep Learning-Based Assisted Teaching System for Oral English

Feature Extraction Network with Attention Mechanism for Data Enhancement and Recombination Fusion for Multimodal Sentiment Analysis

Diversified Teaching of English-Chinese Bilingual Courses Based on Integrating Multimodal Discourse Analysis

Integrating both Visual and Audio Cues for Enhanced Video Caption

A cross modal hierarchical fusion multimodal sentiment analysis method based on multi-task learning

Rethinking the constraints of multimodal fusion: case study in Weakly-Supervised Audio-Visual Video Parsing

Research on Multifeature Intelligent Correction of Spoken English

MFHCA: Enhancing Speech Emotion Recognition Via Multi-Spatial Fusion and Hierarchical Cooperative Attention

Multimodal Sentiment Analysis in Realistic Environments Based on Cross-Modal Hierarchical Fusion Network

Speech recognition using an English multimodal corpus with integrated image and depth information

Multimodal transformer augmented fusion for speech emotion recognition

A Multimodal Sentiment Analysis Approach Based on a Joint Chained Interactive Attention Mechanism

Exploring multimodal data analysis for emotion recognition in teachers’ teaching behavior based on LSTM and MSCNN