How Beginning Programmers and Code LLMs (Mis)read Each Other

Sydney Nguyen, Hannah McLean Babe, Yangtian Zi, Arjun Guha, Carolyn Jane Anderson, Molly Q Feldman

DOI: https://doi.org/10.1145/3613904.3642706

2024-07-08

Abstract: Generative AI models, specifically large language models (LLMs), have made strides towards the long-standing goal of text-to-code generation. This progress has invited numerous studies of user interaction. However, less is known about the struggles and strategies of non-experts, for whom each step of the text-to-code problem presents challenges: describing their intent in natural language, evaluating the correctness of generated code, and editing prompts when the generated code is incorrect. This paper presents a large-scale controlled study of how 120 beginning coders across three academic institutions approach writing and editing prompts. A novel experimental design allows us to target specific steps in the text-to-code process and reveals that beginners struggle with writing and editing prompts, even for problems at their skill level and when correctness is automatically determined. Our mixed-methods evaluation provides insight into student processes and perceptions with key implications for non-expert Code LLM use within and outside of education.

Human-Computer Interaction

What problem does this paper attempt to address?

The paper addresses how novice programmers and large language models for code generation (Code LLMs) interact during natural-language-to-code generation. Specifically, it focuses on the challenges that non-expert users, especially novice programmers, face when using Code LLMs, and on the strategies they adopt. These challenges are:

1. **Describing Intent**: Clearly describing in natural language what the program should do.
2. **Evaluating Correctness**: Judging whether the code generated by a Code LLM is correct.
3. **Editing Prompts**: Revising the natural-language prompt effectively when the generated code is incorrect.

In a large-scale controlled study, the paper examines how 120 beginning programmers across three academic institutions write and edit prompts. The experimental design targets specific steps in the text-to-code process and reveals that beginners struggle with writing and editing prompts, even for problems at their skill level and when correctness is automatically determined. The mixed-methods evaluation provides insight into student processes and perceptions, with implications for non-expert Code LLM use within and outside of education.

How Novices Use LLM-Based Code Generators to Solve CS1 Coding Tasks in a Self-Paced Learning Environment

Evaluating the Effectiveness of LLMs in Introductory Computer Science Education: A Semester-Long Field Study

Interactions with Prompt Problems: A New Way to Teach Programming with Large Language Models

Navigating the Pitfalls: Analyzing the Behavior of LLMs as a Coding Assistant for Computer Science Students—A Systematic Review of the Literature

Substance Beats Style: Why Beginning Students Fail to Code with LLMs

StudentEval: A Benchmark of Student-Written Prompts for Large Language Models of Code

Promptly: Using Prompt Problems to Teach Learners How to Effectively Utilize AI Code Generators

Prompt Problems: A New Programming Exercise for the Generative AI Era

"Give me the code" -- Log Analysis of First-Year CS Students' Interactions With GPT

How to Teach Programming in the AI Era? Using LLMs as a Teachable Agent for Debugging

Enhancing Computer Programming Education with LLMs: A Study on Effective Prompt Engineering for Python Code Generation

Explaining Code with a Purpose: An Integrated Approach for Developing Code Comprehension and Prompting Skills

Exploring the Responses of Large Language Models to Beginner Programmers' Help Requests

Instruct or Interact? Exploring and Eliciting LLMs' Capability in Code Snippet Adaptation Through Prompt Engineering

Using an LLM to Help With Code Understanding

Exploring the Potential of Large Language Models to Generate Formative Programming Feedback

Teach AI How to Code: Using Large Language Models as Teachable Agents for Programming Education

Forgetful Large Language Models: Lessons Learned from Using LLMs in Robot Programming

The Robots are Here: Navigating the Generative AI Revolution in Computing Education

AI-assisted Code Authoring at Scale: Fine-tuning, deploying, and mixed methods evaluation