Implementation of a Virtual Assistant System Based on Deep Multi-modal Data Integration

Sungdae Baek, Jonghong Kim, Junwon Lee, Minho Lee
DOI: https://doi.org/10.1007/s11265-022-01829-5
2023-01-15
Journal of Signal Processing Systems
Abstract: In this study, we propose a virtual assistant system that applies signal processing and deep learning to real-life use. First, we introduce the overall structure of the proposed system, which integrates and controls various modules, and then present a multi-modal fusion module that provides services to users. This module integrates a natural language processing module for a Korean chatbot and a behavior recognition module that understands user behavior using an RGB camera. In addition, a hand gesture recognition module is used to understand the user's intentions from depth and RGB images. We explain the implementation of a customized service system consisting of several parts: i) a user interface module that interacts with the user, ii) a face recognition module that distinguishes different users, and iii) a voice processing module that can replace keyboard input and monitor output. To check the performance of each module, a testbed was configured in an office environment. The test results demonstrate that the proposed system can be realized in a real-life setting. Finally, we list the challenges discovered during the operation of this system and suggest directions for further research.
computer science, information systems, engineering, electrical & electronic