Performance Comparison of Large Language Models on VNHSGE English Dataset: OpenAI ChatGPT, Microsoft Bing Chat, and Google Bard

Xuan-Quy Dao
2023-07-20
Abstract: This paper presents a performance comparison of three large language models (LLMs), namely OpenAI ChatGPT, Microsoft Bing Chat (BingChat), and Google Bard, on the VNHSGE English dataset. BingChat, Bard, and ChatGPT (GPT-3.5) achieve 92.4%, 86%, and 79.2%, respectively, so BingChat outperforms both Bard and ChatGPT. Because both BingChat and Bard score higher than ChatGPT, they can serve as replacements for it while ChatGPT is not yet officially available in Vietnam. The results also indicate that all three models outperform Vietnamese students in English language proficiency. The findings of this study contribute to the understanding of the potential of LLMs in English language education. The strong performance of ChatGPT, BingChat, and Bard demonstrates their potential as effective tools for teaching and learning English at the high school level.
Computation and Language, Human-Computer Interaction
What problem does this paper attempt to address?
The paper evaluates the performance of three large language models (OpenAI's ChatGPT, Microsoft's Bing Chat, and Google's Bard) on the Vietnamese High School Graduation Examination in English (the VNHSGE English dataset) and compares their results with the English proficiency of Vietnamese students. Specifically, the author poses the following research questions:

1. **Research Question 1 (RS1)**: How do ChatGPT, BingChat, and Bard perform on the Vietnamese high school English examination?
2. **Research Question 2 (RS2)**: How do these large language models compare with Vietnamese high school students in terms of English proficiency?
3. **Research Question 3 (RS3)**: What are the potential applications of large language models in high school English teaching in Vietnam?

By answering these questions, the paper aims to assess the potential of these large language models in education, and in particular whether they can serve as effective teaching tools to help improve the English proficiency of Vietnamese students.