Integrating Large Language Models with Multimodal Virtual Reality Interfaces to Support Collaborative Human-Robot Construction Work

Somin Park, Carol C. Menassa, Vineet R. Kamat
2024-04-04
Abstract: In the construction industry, where work environments are complex, unstructured and often dangerous, the implementation of Human-Robot Collaboration (HRC) is emerging as a promising advancement. This underlines the critical need for intuitive communication interfaces that enable construction workers to collaborate seamlessly with robotic assistants. This study introduces a conversational Virtual Reality (VR) interface integrating multimodal interaction to enhance intuitive communication between construction workers and robots. By integrating voice and controller inputs with the Robot Operating System (ROS), Building Information Modeling (BIM), and a game engine featuring a chat interface powered by a Large Language Model (LLM), the proposed system enables intuitive and precise interaction within a VR setting. Evaluated by twelve construction workers through a drywall installation case study, the proposed system demonstrated its low workload and high usability with succinct command inputs. The proposed multimodal interaction system suggests that such technological integration can substantially advance the integration of robotic assistants in the construction industry.
Robotics, Human-Computer Interaction
What problem does this paper attempt to address?
The paper addresses the need for intuitive communication interfaces for Human-Robot Collaboration (HRC) in the construction industry. Its research objectives are:

1. **Propose a multimodal interaction method**: Combine voice commands and hand controller inputs so that human workers can communicate intuitively with construction robots.
2. **Design an integration strategy for software solutions**: Integrate the Robot Operating System (ROS), Building Information Modeling (BIM), and a game engine featuring a chat interface powered by a Large Language Model (LLM) to implement the multimodal interaction method.
3. **Validate the proposed solution through a user study**: Evaluate the experience of 12 construction workers in a drywall installation case study to verify the system's low workload and high usability.

To achieve these goals, the research team developed a multimodal interaction system in a Virtual Reality (VR) environment. The system lets users specify work tasks through voice and controller inputs and uses an LLM as a virtual assistant to support bidirectional communication. By integrating BIM data, the system can also retrieve information about the work objects, which further improves the efficiency and accuracy of the interaction. In this way, the researchers hope to advance the effective integration of robotic assistants in the construction industry and to improve the efficiency and safety of human-robot collaboration.
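The multimodal idea summarized above (a spoken command names the action, a controller selection names the object, and BIM data supplies the object's properties) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `BIM_ELEMENTS` table, the `RobotTask` type, and the `parse_action` and `fuse_inputs` functions are hypothetical names introduced here, the element values are made up for illustration, and a simple keyword match stands in for the LLM. The sketch also assumes that the assistant asks a follow-up question when one input is missing, as one possible form of bidirectional communication.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical BIM excerpt: element id -> properties (illustrative values only).
BIM_ELEMENTS = {
    "panel_01": {"type": "drywall_panel", "width_m": 1.2, "height_m": 2.4},
    "panel_02": {"type": "drywall_panel", "width_m": 1.2, "height_m": 2.4},
}


@dataclass
class RobotTask:
    """A fully specified task that could be handed to a robot-side planner."""
    action: str
    target_id: str
    target_type: str


def parse_action(utterance: str) -> Optional[str]:
    """Keyword stand-in for the LLM step: map an utterance to an action name."""
    text = utterance.lower()
    for keyword, action in (("install", "install"), ("pick", "pick_up")):
        if keyword in text:
            return action
    return None


def fuse_inputs(utterance: str, selected_id: Optional[str]) -> Tuple[Optional[RobotTask], str]:
    """Combine the voice command and the controller selection into one task.

    Returns (task, reply). When information is missing, task is None and
    reply is the assistant's follow-up question to the worker.
    """
    action = parse_action(utterance)
    if action is None:
        return None, "Which action should the robot perform?"
    if selected_id is None or selected_id not in BIM_ELEMENTS:
        return None, "Please point at the target element with the controller."
    element = BIM_ELEMENTS[selected_id]
    task = RobotTask(action=action, target_id=selected_id, target_type=element["type"])
    return task, f"Understood: {action} {selected_id}."


if __name__ == "__main__":
    # Voice says what to do, the controller says which object.
    print(fuse_inputs("Install this panel", "panel_01"))
    # Without a controller selection, the assistant asks for one.
    print(fuse_inputs("Install this panel", None))
```

Splitting the command this way keeps the spoken part short ("install this panel") because the controller resolves what "this" refers to, which is consistent with the succinct command inputs reported in the abstract. In a full system the resulting task would presumably be forwarded to the robot through ROS, a step this sketch leaves out.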