Abstract: This study investigates the effectiveness of the large language model (LLM) ChatGLM in the automated generation of high school information technology exam questions. Through carefully designed prompt engineering strategies, the model is guided to generate diverse questions, which are then evaluated by domain experts. The evaluation dimensions are Hitting (the degree of alignment with teaching content), Fitting (the degree to which core competencies are embodied), Clarity (the explicitness of question descriptions), and Willing to use (the teacher's willingness to use the question in teaching). The results indicate that ChatGLM-generated questions outperform human-written questions in Clarity and Willing to use, while there is no significant difference in Hitting and Fitting. This finding suggests that ChatGLM has the potential to make question generation more efficient and to reduce the burden on teachers, providing a new perspective for the future development of educational assessment systems. Future research could explore further optimizations of ChatGLM that maintain Hitting and Fitting while improving the clarity of questions and teachers' willingness to use them.

What problem does this paper attempt to address?

The problem that this paper attempts to solve is: **to explore how well large language models (LLMs), especially ChatGLM, perform in automatic question generation for high school information technology courses, and to evaluate whether they are comparable to manual question writing**. Specifically, the research uses carefully designed prompt engineering techniques to guide ChatGLM to generate diverse examination questions, and domain experts evaluate them on the following dimensions:

1. **Hitting**: The degree of alignment between the questions and the teaching content.
2. **Fitting**: The extent to which the questions reflect the core competencies.
3. **Clarity**: Whether the description of the questions is clear and unambiguous.
4. **Willing to use**: Whether teachers are willing to use these questions in teaching.

By comparing questions generated by ChatGLM with manually written questions on each of these dimensions, the research aims to verify the practical potential of LLMs in simulated question generation. This research not only helps make automatic question generation systems more capable, but also provides a new perspective and a practical basis for the future development of educational technology.

### Research Background

High school information technology courses cover a wide range of knowledge points, which places a considerable teaching burden on information technology teachers. In recent years, LLMs, especially ChatGLM, have opened up new possibilities for addressing this problem. With carefully designed prompts, LLMs can generate questions that meet the assessment requirements of high school information technology, giving teachers a more efficient and convenient teaching tool.

### Research Objectives

This research explores in depth the performance of LLMs, especially ChatGLM, in automatic question generation for high school information technology, and evaluates whether it can match the ability of human question writers. Using prompt engineering techniques, the research guides LLMs to generate examination questions, and domain experts evaluate them in detail on four dimensions: Hitting, Fitting, Clarity, and Willing to use. By comparing the results on each indicator, the research aims to verify the potential of LLMs in simulated question generation and to provide a theoretical basis and practical reference for promoting educational informatization.

### Main Findings

The results show that ChatGLM does not differ significantly from manual question writing in Hitting and Fitting, but it scores higher in Clarity and in teachers' willingness to use the questions. This indicates that ChatGLM has the potential to make question generation more efficient, reduce teachers' workload, and provide new ideas for the development of future educational assessment systems.
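The per-dimension comparison described above can be illustrated with a small sketch. The summary does not say which rating scale or statistical test the authors used, so everything here is an assumption made for illustration: numeric expert ratings (for example on a 1 to 5 scale), a two-sided permutation test on the difference in mean rating, a 0.05 significance threshold, and the hypothetical helper names `permutation_p_value` and `compare_sources`.

```python
import random

# The four evaluation dimensions named in the summary.
DIMENSIONS = ("hitting", "fitting", "clarity", "willing_to_use")


def _mean(values):
    return sum(values) / len(values)


def permutation_p_value(a, b, n_resamples=5000, seed=0):
    """Two-sided permutation test on the difference in mean rating.

    Assumption: this test stands in for whatever significance test
    the paper actually used, which the summary does not specify.
    """
    rng = random.Random(seed)
    observed = abs(_mean(a) - _mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_resamples):
        # Randomly reassign ratings to the two groups.
        rng.shuffle(pooled)
        diff = abs(_mean(pooled[:len(a)]) - _mean(pooled[len(a):]))
        if diff >= observed - 1e-12:  # small tolerance for float noise
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_resamples + 1)


def compare_sources(llm_ratings, human_ratings, alpha=0.05):
    """Compare LLM-generated and human-written questions per dimension.

    Both arguments map a dimension name to a list of expert ratings.
    """
    results = {}
    for dim in DIMENSIONS:
        p = permutation_p_value(llm_ratings[dim], human_ratings[dim])
        results[dim] = {
            "mean_llm": _mean(llm_ratings[dim]),
            "mean_human": _mean(human_ratings[dim]),
            "p_value": p,
            "significant": p < alpha,
        }
    return results


if __name__ == "__main__":
    # Made-up placeholder ratings, NOT data from the paper.
    llm = {d: [4, 5, 4, 4, 5, 3] for d in DIMENSIONS}
    human = {d: [4, 4, 3, 4, 5, 3] for d in DIMENSIONS}
    for dim, row in compare_sources(llm, human).items():
        print(dim, row)
```

A permutation test is used here only because it needs no distributional assumptions and no third-party libraries; with real expert ratings, a rank-based test or a mixed-effects model that accounts for repeated raters may be more appropriate.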

Research on the Application of Large Language Models in Automatic Question Generation: A Case Study of ChatGLM in the Context of High School Information Technology Curriculum

Application of Large Language Models in Automated Question Generation: A Case Study on ChatGLM's Structured Questions for National Teacher Certification Exams

Comparison of Large Language Models for Generating Contextually Relevant Questions

Dr.Academy: A Benchmark for Evaluating Questioning Capability in Education for Large Language Models

Embracing AI in Education: Understanding the Surge in Large Language Model Use by Secondary Students

How Teachers Can Use Large Language Models and Bloom's Taxonomy to Create Educational Quizzes

Automated Educational Question Generation at Different Bloom's Skill Levels using Large Language Models: Strategies and Evaluation

The Future of Learning in the Age of Generative AI: Automated Question Generation and Assessment with Large Language Models

The Future of Learning: Large Language Models through the Lens of Students

How Useful are Educational Questions Generated by Large Language Models?

Large Language Model as an Assignment Evaluator: Insights, Feedback, and Challenges in a 1000+ Student Course

Large Language Model-Driven Classroom Flipping: Empowering Student-Centric Peer Questioning with Flipped Interaction

Can Large Language Models Make the Grade? An Empirical Study Evaluating LLMs Ability to Mark Short Answer Questions in K-12 Education

Beyond Traditional Teaching: The Potential of Large Language Models and Chatbots in Graduate Engineering Education

Few-shot is enough: exploring ChatGPT prompt engineering method for automatic question generation in english education

Adapting Large Language Models for Education: Foundational Capabilities, Potentials, and Challenges

Leveraging Large Language Models to Generate Course-specific Semantically Annotated Learning Objects

Assessing Large Language Models in Mechanical Engineering Education: A Study on Mechanics-Focused Conceptual Understanding

CJEval: A Benchmark for Assessing Large Language Models Using Chinese Junior High School Exam Data

Evaluating Large Language Models in Analysing Classroom Dialogue

Practical and Ethical Challenges of Large Language Models in Education: A Systematic Scoping Review