Multimodal Transfer Learning for Oral Presentation Assessment

Su Shwe Yi Tun, Shogo Okada, Hung-Hsuan Huang, Chee Wee Leong
DOI: https://doi.org/10.1109/access.2023.3295832
IF: 3.9
2023-01-01
IEEE Access
Abstract: Oral communication is consistently ranked as a key skill: according to a recent survey, 90 percent of hiring managers and 80 percent of business executives say it is very important for college graduates to possess. Consequently, training and evaluating oral presentation skills remains a priority for educators worldwide, and a growing number of automated tools have been developed to provide feedback on and assessment of such skills. However, modeling approaches typically require collecting large amounts of data and labels, which can be both expensive and laborious. In this paper, we explore transfer learning (TL) between two different but related multimodal datasets to benefit the evaluation of oral presentation performance. We use a job interview dataset as pretraining material and adapt the knowledge learned by the pre-trained model to a small amount of presentation data to improve learning on the presentation assessment task. We demonstrate the efficacy of our approach, especially in improving inference performance on small datasets (fewer than 100 data points). Moreover, we compare the proposed TL approach with a standard TL method based on a large-scale pre-trained model. Despite its simplicity, the results show that the proposed approach holds promise for application to smaller datasets such as ours.
computer science, information systems; telecommunications; engineering, electrical & electronic