Inferring Human Interactions in Meetings: A Multimodal Approach
Zhiwen Yu, Zhiyong Yu, Yusa Ko, Xingshe Zhou, Yuichi Nakamura
DOI: https://doi.org/10.1007/978-3-642-02830-4_3
2009-01-01
Abstract: Social dynamics, such as human interaction, are important for understanding how a conclusion was reached in a meeting and for determining whether the meeting was well organized. In this paper, a multimodal approach is proposed to infer human semantic interactions in meeting discussions. A human interaction, such as proposing an idea, giving comments, or expressing a positive opinion, implies a user's role, attitude, or intention toward a topic. Our approach infers human interactions based on a variety of audiovisual and high-level features, e.g., gestures, attention, speech tone, speaking time, interaction occasion, and information about the previous interaction. Four inference models, Support Vector Machine (SVM), Bayesian Net, Naïve Bayes, and Decision Tree, are selected and compared in human interaction recognition. Our experimental results show that SVM outperforms the other inference models, that we can successfully infer human interactions with a recognition rate of around 80%, and that our multimodal approach achieves robust and reliable results by leveraging the characteristics of each single modality.
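To make the idea of inferring an interaction type from multimodal features concrete, here is a minimal sketch of one of the four model families named in the abstract, a Naïve Bayes classifier over categorical features. It is not the authors' implementation. The feature names (`gesture`, `tone`, `previous`), the feature values, the interaction labels, and the training samples are all hypothetical and chosen only for illustration; the paper's actual feature encoding and data are not given in the abstract.

```python
import math
from collections import Counter, defaultdict

# Toy, made-up samples (NOT data from the paper). Each sample pairs a few
# categorical features with a hypothetical interaction label.
TRAIN = [
    ({"gesture": "pointing", "tone": "statement", "previous": "none"}, "propose"),
    ({"gesture": "pointing", "tone": "statement", "previous": "comment"}, "propose"),
    ({"gesture": "nodding", "tone": "statement", "previous": "propose"}, "positive_opinion"),
    ({"gesture": "nodding", "tone": "statement", "previous": "comment"}, "positive_opinion"),
    ({"gesture": "none", "tone": "question", "previous": "propose"}, "ask_opinion"),
    ({"gesture": "none", "tone": "question", "previous": "comment"}, "ask_opinion"),
    ({"gesture": "none", "tone": "statement", "previous": "propose"}, "comment"),
    ({"gesture": "none", "tone": "statement", "previous": "ask_opinion"}, "comment"),
]


def train_naive_bayes(samples):
    """Count labels, per-label feature values, and the value set of each feature."""
    label_counts = Counter(label for _, label in samples)
    value_counts = defaultdict(Counter)  # (label, feature name) -> value counts
    vocab = defaultdict(set)             # feature name -> set of observed values
    for features, label in samples:
        for name, value in features.items():
            value_counts[(label, name)][value] += 1
            vocab[name].add(value)
    return label_counts, value_counts, vocab


def predict(model, features):
    """Return the label with the highest log-posterior under Naive Bayes."""
    label_counts, value_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        for name, value in features.items():
            seen = value_counts[(label, name)][value]
            # Laplace (add-one) smoothing so unseen values do not zero the product.
            score += math.log((seen + 1) / (count + len(vocab[name])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_naive_bayes(TRAIN)
query = {"gesture": "nodding", "tone": "statement", "previous": "propose"}
print(predict(model, query))
```

The sketch treats each feature as conditionally independent given the interaction type, which is the defining assumption of Naïve Bayes. The abstract reports that SVM performed best among the four models, so this example only illustrates the feature-to-label setup, not the best-performing configuration.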