ELLMA-T: an Embodied LLM-agent for Supporting English Language Learning in Social VR

Mengxu Pan, Alexandra Kitson, Hongyu Wan, Mirjana Prpa
2024-10-03
Abstract: Many people struggle with learning a new language, with traditional tools falling short in providing contextualized learning tailored to each learner's needs. The recent development of large language models (LLMs) and embodied conversational agents (ECAs) in social virtual reality (VR) provides new opportunities to practice language learning in a contextualized and naturalistic way that takes into account the learner's language level and needs. To explore this opportunity, we developed ELLMA-T, an ECA that leverages an LLM (GPT-4) and a situated learning framework to support English language learning in social VR (VRChat). Drawing on qualitative interviews (N=12), we reveal the potential of ELLMA-T to generate realistic, believable and context-specific role plays for agent-learner interaction in VR, and the LLM's capability to provide initial language assessment and continuous feedback to learners. We provide five design implications for the future development of LLM-based language agents in social VR.
Human-Computer Interaction
What problem does this paper attempt to address?
### The Problem the Paper Attempts to Solve

The paper attempts to address the challenges encountered in learning English, particularly the issue that traditional learning tools cannot provide personalized and contextualized learning experiences. Specifically, the paper explores how to utilize large language models (LLMs) and embodied conversational agents (ECAs) to support English learning in social virtual reality (VR) environments.

### Background and Motivation

1. **Limitations of Traditional Learning Methods**:
   - Traditional classroom learning often lacks contextualized and personalized learning materials, making it difficult to provide engaging learning experiences.
   - Common language learning tools (such as mobile applications like Duolingo) are useful but still insufficient to provide real contextualized practice.
2. **Potential of Social Virtual Reality (VR)**:
   - Social VR platforms (such as VRChat) offer users an immersive environment for real language communication.
   - However, it is not easy for users to find suitable native speakers to interact with on these platforms, and native speakers may be unwilling or unsuitable to act as language tutors.
3. **Application of Embodied Conversational Agents (ECAs)**:
   - ECAs can simulate human behavior in VR, providing a natural conversational experience.
   - Recent studies have shown that ECAs can generate realistic, contextualized role-playing scenarios, offering initial assessments and continuous feedback for language learners.

### Research Objectives

- **Design and Implementation**: Develop an embodied LLM agent named ELLMA-T to support English learning in social VR environments.
- **User Experience Study**: Conduct a user study (N=12) to understand participants' perceptions of ELLMA-T's capabilities and its performance in four English learning tasks: language proficiency assessment, role-playing dialogue generation, feedback generation, and scaffolding ability.
- **Design Insights**: Propose design recommendations for future LLM-based ECA designs for language learning in social VR.

### Main Contributions

1. **System Design**: Designed ELLMA-T, an embodied LLM agent to help adult language learners from A1 to C1 levels practice speaking in social VR.
2. **User Study**: Collected participants' perceptions of ELLMA-T's capabilities and its performance in four English learning tasks through semi-structured interviews.
3. **Design Insights**: Proposed design recommendations for embodied LLM agents for language learning in social VR, including the integration of personalized, culturally relevant, and adaptive learning systems.

### Conclusion

Preliminary user studies indicate that ELLMA-T, as a "human-like" language tutor, has potential in contextualized language learning. However, challenges such as interruptions in the dialogue flow and insufficient emotional support were also identified, which need further improvement in future research. The design insights extend to the integration of personalized, culturally relevant, and adaptive learning systems, emphasizing the potential for further exploration of LLM-based ECAs in long-term language learning in social VR.