A Tutorial on Teaching Data Analytics with Generative AI

Robert L. Bray
2024-10-25
Abstract: This tutorial addresses the challenge of incorporating large language models (LLMs), such as ChatGPT, in a data analytics class. It details several new in-class and out-of-class teaching techniques enabled by AI. For example, instructors can parallelize instruction by having students interact with different custom-made GPTs to learn different parts of an analysis and then teach each other what they learned from their AIs. For another example, instructors can turn problem sets into AI tutoring sessions, whereby a custom-made GPT guides a student through the problems, and the student uploads the chatlog for their homework submission. For a third example, you can assign different labs to each section of your class and have each section create AI assistants to help the other sections work through their labs. This tutorial advocates the programming in English (PIE) paradigm, in which students express the desired data transformations in prose and then use AI to generate the corresponding code. Students can wrangle data more effectively by programming in English than by manipulating it in Excel. However, some students will program in English better than others, so you will still derive a robust grade distribution (at least with current LLMs).
Computers and Society, Artificial Intelligence
What problem does this paper attempt to address?
This paper asks how to integrate large language models (LLMs), such as ChatGPT, into data analytics courses so as to improve teaching methods and students' learning experiences. Specifically, the author faces three main challenges:

1. **Obsolescence of traditional teaching methods**: With the emergence of AI tools such as ChatGPT, the traditional content and teaching methods of data analytics courses no longer apply. The author found that ChatGPT could easily solve every problem he had designed, which made the original course content seem obsolete.
2. **Improving students' productivity and learning outcomes**: The author hopes that introducing LLMs will let students analyze data more efficiently, and that the programming in English (PIE) paradigm will simplify complex programming tasks. This approach is meant both to raise students' productivity and to help them master advanced data analysis skills more quickly.
3. **Coping with the educational changes brought about by AI**: The author sees the introduction of AI not only as technological progress but also as a fundamental shift in the educational model. He needs to work out how to redesign the course so that students can take full advantage of AI while the course keeps its academic rigor and fair assessment.

To address these challenges, the author proposes a series of teaching methods and techniques, including but not limited to:

- Transforming assignments into AI tutoring sessions to increase students' participation and satisfaction.
- Using the PIE method, in which students describe the required data transformation in natural language and the AI generates the corresponding code.
- Letting students give ChatGPT hand-drawn charts so that it generates the corresponding ggplot code.
- Having students build GPTs that teach specific content, and evaluating the students' learning by testing those GPTs.
- Creating customized GPTs that teach different knowledge points, and encouraging students to teach each other.

These methods are intended to improve teaching efficiency and to help students adapt to data analysis work in the AI era.
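As a rough illustration of the PIE method described above, the sketch below pairs an English instruction with the kind of code an AI assistant might return for it. Everything in it is an assumption made for illustration: the prompt wording, the sample data, the field names `region` and `revenue`, and the helper `total_revenue_by_region` are all hypothetical and do not come from the paper. The paper's mention of ggplot suggests the course itself works in R; Python is used here only to keep the example self-contained.

```python
from collections import defaultdict

# Hypothetical English instruction a student might give the AI
# (illustrative wording, not taken from the paper).
PROMPT = (
    "For each region, add up the revenue, "
    "then list the regions from highest to lowest total."
)

# Hypothetical sample data, invented for this sketch.
SALES = [
    {"region": "East", "revenue": 120.0},
    {"region": "West", "revenue": 80.0},
    {"region": "East", "revenue": 30.0},
    {"region": "North", "revenue": 95.0},
]


def total_revenue_by_region(rows):
    """Sum revenue per region and return (region, total) pairs, largest first."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["revenue"]
    # Sort by the summed revenue in descending order.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


result = total_revenue_by_region(SALES)
for region, total in result:
    print(f"{region}: {total:.2f}")
```

The point of the pairing is that the student's work lies in stating the transformation precisely (group, sum, sort) and in checking that the returned code does what the prose asked, rather than in recalling syntax.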