Large Language Models in Computer Science Education: A Systematic Literature Review

Nishat Raihan, Mohammed Latif Siddiq, Joanna C.S. Santos, Marcos Zampieri
2024-10-22
Abstract: Large language models (LLMs) are becoming increasingly better at a wide range of Natural Language Processing (NLP) tasks, such as text generation and understanding. Recently, these models have extended their capabilities to coding tasks, bridging the gap between natural languages (NL) and programming languages (PL). Foundational models such as the Generative Pre-trained Transformer (GPT) and LLaMA series have set strong baseline performances in various NL and PL tasks. Additionally, several models have been fine-tuned specifically for code generation, showing significant improvements in code-related applications. Both foundational and fine-tuned models are increasingly used in education, helping students write, debug, and understand code. We present a comprehensive systematic literature review to examine the impact of LLMs in computer science and computer engineering education. We analyze their effectiveness in enhancing the learning experience, supporting personalized education, and aiding educators in curriculum development. We address five research questions to uncover insights into how LLMs contribute to educational outcomes, identify challenges, and suggest directions for future research.
Machine Learning, Human-Computer Interaction
What problem does this paper attempt to address?
This paper examines how large language models (LLMs) are applied in computer science education and what impact they have. Through a systematic literature review (SLR), the authors explore five aspects:

1. **Educational Levels**: At which educational levels (such as undergraduate and postgraduate) LLMs are used, and how effective and applicable they are at each stage.
2. **Computer Science Sub-disciplines**: In which sub-disciplines (such as introductory programming and software testing) LLMs are studied, along with the current state of research and future directions in each.
3. **Research Methods**: Which methodologies are used to study LLMs in computer science education, including experimental design and data analysis techniques.
4. **Programming Languages**: Which programming languages are most commonly used in research involving LLMs, and how the choice of language affects the results.
5. **Large Language Models**: Which specific LLMs are used in these studies, how they perform, and what characterizes them.

By answering these questions, the authors aim to reveal the practical effects of LLMs in computer science education, identify the challenges involved, and point to directions for future research.