Can We Trust AI-Generated Educational Content? Comparative Analysis of Human and AI-Generated Learning Resources

Paul Denny, Hassan Khosravi, Arto Hellas, Juho Leinonen, Sami Sarsa
2023-07-03
Abstract: As an increasing number of students move to online learning platforms that deliver personalized learning experiences, there is a great need for the production of high-quality educational content. Large language models (LLMs) appear to offer a promising solution to the rapid creation of learning materials at scale, reducing the burden on instructors. In this study, we investigated the potential for LLMs to produce learning resources in an introductory programming context, by comparing the quality of the resources generated by an LLM with those created by students as part of a learnersourcing activity. Using a blind evaluation, students rated the correctness and helpfulness of resources generated by AI and their peers, after both were initially provided with identical exemplars. Our results show that the quality of AI-generated resources, as perceived by students, is equivalent to the quality of resources generated by their peers. This suggests that AI-generated resources may serve as viable supplementary material in certain contexts. Resources generated by LLMs tend to closely mirror the given exemplars, whereas student-generated resources exhibit greater variety in terms of content length and specific syntax features used. The study highlights the need for further research exploring different types of learning resources and a broader range of subject areas, and understanding the long-term impact of AI-generated resources on learning outcomes.
Human-Computer Interaction, Artificial Intelligence
What problem does this paper attempt to address?
The problem this paper addresses is the growing demand on online learning platforms for high-quality educational resources produced at scale. Large language models (LLMs) are seen as a promising way to create learning materials quickly and to reduce the burden on instructors. The authors explored the potential of LLMs to generate learning resources in an introductory programming course by comparing the quality of LLM-generated resources with that of resources created by students as part of a learnersourcing activity. Specifically, the study addressed two research questions:
1. **RQ1: Given the same guiding exemplars, how do student-generated learning resources differ from AI-generated learning resources in terms of overall length and the presence of syntax features?**
   - The study gauged the comprehensiveness of these resources by analyzing the use of C keywords in the code examples written by students and by the AI, as well as the length (number of characters) of the "code" and "explanation" parts.
2. **RQ2: How do students perceive the correctness and helpfulness of student-generated content compared to AI-generated content?**
   - The study examined students' ratings on each criterion (code correctness, explanation correctness, and example helpfulness), as well as their overall judgment of resource quality, in order to understand how students view the two types of resources.
The evaluation was blind: students rated the resources without knowing their source. The results indicate that, from the students' perspective, the quality of AI-generated resources is equivalent to that of student-generated resources. This suggests that, in certain contexts, AI-generated resources can serve as viable supplementary material.
However, AI-generated resources tend to closely mirror the given exemplars, while student-generated resources show greater variety in content length and in the specific syntax features used. The study emphasizes the need for further research on different types of learning resources and a broader range of subject areas, as well as on the long-term impact of AI-generated resources on students' learning outcomes.
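The two RQ1 measurements mentioned above (character length of each part and C keyword usage in the code) can be illustrated with a minimal Python sketch. This is not the authors' analysis script: the function names, the choice of the 32 C89/C90 keywords, and the regex-based stripping of comments and literals are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: the 32 keywords of C89/C90. The keyword list used in the
# paper is not stated in this summary.
C_KEYWORDS = frozenset(
    """auto break case char const continue default do double else enum
    extern float for goto if int long register return short signed sizeof
    static struct switch typedef union unsigned void volatile while""".split()
)

# Matches block comments, line comments, string literals, and char literals.
_NON_CODE = re.compile(
    r'/\*.*?\*/|//[^\n]*|"(?:\\.|[^"\\])*"|\'(?:\\.|[^\'\\])*\'',
    flags=re.DOTALL,
)


def keyword_counts(code: str) -> Counter:
    """Count C keywords in `code`, ignoring comments and literals."""
    stripped = _NON_CODE.sub(" ", code)
    identifiers = re.findall(r"[A-Za-z_]\w*", stripped)
    return Counter(tok for tok in identifiers if tok in C_KEYWORDS)


def resource_metrics(code: str, explanation: str) -> dict:
    """Hypothetical per-resource summary: part lengths plus keyword usage."""
    return {
        "code_chars": len(code),
        "explanation_chars": len(explanation),
        "keywords": keyword_counts(code),
    }


if __name__ == "__main__":
    # Illustrative resource, not taken from the study's data.
    sample_code = r'''
    int main(void) {
        for (int i = 0; i < 3; i++) { /* for loop */
            printf("if %d\n", i);
        }
        return 0;
    }
    '''
    print(resource_metrics(sample_code, "Prints the numbers 0 to 2."))
```

Stripping comments and literals first keeps a word such as `if` inside a string from being counted as a keyword. A regex pass like this is only an approximation; a real C tokenizer would be more robust.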