Utilizing Web Scraping and Natural Language Processing to Better Inform Pedagogical Practice

Stephanie Lunn, Jia Zhu, Monique Ross
DOI: https://doi.org/10.1109/fie44824.2020.9274270
2020-10-21
Abstract: This research full paper describes how web scraping and natural language processing can be utilized to answer complex questions in computer science education. In this work, we apply connectivism as the theoretical framework, and demonstrate how web scraping can be useful for extrapolating large amounts of data from publicly available web pages to pool data from a wider array of sources and to further knowledge in the field. In addition, we discuss how natural language processing can be used to reliably obtain salient information from textual data, and how it can complement qualitative analysis. To illustrate these techniques in practice, we provide a specific application in which we examine the current trends in the job market for computer science students. The information gathered in this example provides additional areas for educational consideration, such as offering students Python programming language and machine learning. Also, the job postings delineate a clear need for applicants to exhibit programming and testing skills. Although programming may be taught already, testing is widely considered a knowledge deficiency, which suggests that educators should consider placing an increased emphasis on this area to ensure their students are adequately prepared for their career endeavors, and able to transfer the knowledge taught to critically assess and debug their own programs.
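
The general idea in the abstract, pulling text out of job-posting pages and counting which skills they mention, can be illustrated with a minimal sketch. This is not the authors' pipeline, which the abstract does not describe. The skill list `SKILLS`, the helper names `html_to_text` and `count_skills`, and the two sample postings are all hypothetical, and the sketch works on inline HTML strings rather than fetching live pages.

```python
from collections import Counter
from html.parser import HTMLParser
import re

# Hypothetical skill vocabulary; the abstract does not list the terms the authors tracked.
SKILLS = {"python", "java", "testing", "machine learning", "programming"}


class TextExtractor(HTMLParser):
    """Collect visible text from an HTML fragment, skipping script and style content."""

    def __init__(self):
        super().__init__()
        self._skip = 0      # nesting depth inside <script>/<style>
        self.parts = []     # visible text chunks in document order

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string with whitespace normalized."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def count_skills(postings_html, skills=SKILLS):
    """Count how many postings mention each skill at least once (whole-word match)."""
    counts = Counter()
    for html in postings_html:
        text = html_to_text(html).lower()
        for skill in skills:
            if re.search(r"\b" + re.escape(skill) + r"\b", text):
                counts[skill] += 1
    return counts


# Made-up postings used only to exercise the sketch; they are not data from the paper.
SAMPLE_POSTINGS = [
    "<div><h1>Software Engineer</h1><p>Python programming and unit testing required.</p></div>",
    "<div><h1>Data Scientist</h1><p>Experience with machine learning and Python.</p>"
    "<script>var x = 'java';</script></div>",
]

if __name__ == "__main__":
    for skill, n in count_skills(SAMPLE_POSTINGS).most_common():
        print(skill, n)
```

Counting postings per skill, rather than raw word occurrences, keeps one long posting from dominating the tally. A real study would also need to respect each site's terms of use and handle synonyms and phrasing variants, which this sketch ignores.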