Extracting and Clustering Main Ideas from Student Feedback Using Language Models

Mihai Masala, Stefan Ruseti, Mihai Dascalu, Ciprian Dobre
DOI: https://doi.org/10.1007/978-3-030-78292-4_23
2021-01-01
Abstract: Feedback mechanisms for academic courses have been widely used to measure students' opinions of and satisfaction with different components of a course; concurrently, detailed open-text impressions enable professors to continually improve their courses. However, reading through hundreds of student feedback responses across multiple subjects and then extracting the important ideas is very time-consuming. In this work, we propose an automated feedback summarizer that extracts the main ideas expressed by all students on the various components of each course, based on a pipeline integrating state-of-the-art Natural Language Processing techniques. Our method uses BERT language models to extract keywords for each course, identify relevant contexts for recurring keywords, and cluster similar contexts. We validate our tool on 8,201 feedback responses for 168 distinct courses from the Computer Science Department of University Politehnica of Bucharest for the 2019–2020 academic year. Our approach achieves a size reduction of 59% on the overall volume of text, while only increasing the mean average error when predicting course ratings from student open-text feedback by an absolute value of 0.06.
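The three stages named in the abstract (keyword extraction, context identification, context clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract states that BERT language models are used, whereas the sketch below swaps in a toy bag-of-words embedding, document-frequency keyword selection, and greedy threshold clustering so that it runs with the standard library alone. All function names, the stopword list, and the thresholds are hypothetical.

```python
import math
import re
from collections import Counter

# Assumption: a tiny hand-picked stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "is", "are", "was", "were", "and",
             "but", "of", "to", "in", "too", "very", "i", "it"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def embed(text):
    """Toy stand-in for a BERT embedding: a bag-of-words count vector."""
    return Counter(tokenize(text))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def recurring_keywords(responses, min_count=2):
    """Keywords that appear in at least `min_count` distinct responses."""
    doc_freq = Counter()
    for response in responses:
        doc_freq.update(set(tokenize(response)))
    return sorted(w for w, c in doc_freq.items() if c >= min_count)


def contexts_for(keyword, responses):
    """Sentences (split on ., !, ?) that mention the keyword."""
    found = []
    for response in responses:
        for sentence in re.split(r"[.!?]+", response):
            if keyword in tokenize(sentence):
                found.append(sentence.strip())
    return found


def cluster(contexts, threshold=0.5):
    """Greedy clustering: join the first cluster whose representative
    (its first member) is similar enough, otherwise start a new one."""
    clusters = []
    for context in contexts:
        for members in clusters:
            if cosine(embed(context), embed(members[0])) >= threshold:
                members.append(context)
                break
        else:
            clusters.append([context])
    return clusters


def summarize(responses, min_count=2, threshold=0.5):
    """Map each recurring keyword to one representative context per cluster."""
    return {
        kw: [members[0] for members in cluster(contexts_for(kw, responses), threshold)]
        for kw in recurring_keywords(responses, min_count)
    }
```

Calling `summarize` on the feedback responses for one course returns a dictionary from each recurring keyword to a short list of representative sentences, which is the kind of condensed view the summarizer aims for. Replacing `embed` with sentence vectors from a BERT model would bring the sketch closer to the approach the abstract describes.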