Novice Use of the Java Programming Language
Neil C. C. Brown, Pierre Weill-Tessier, Maksymilian Sekula, Alexandra-Lucia Costache, Michael Kölling
DOI: https://doi.org/10.1145/3551393
2022-12-29
ACM Transactions on Computing Education
Abstract: Objectives. Java is a popular programming language in computing education, but it is difficult to get a broad picture of the issues it presents for novices; most studies look only at the types or frequency of errors. In this observational study, we aim to learn how novices use different features of the Java language. Participants. For the past 8 years, users of the BlueJ development environment have been invited to opt in to anonymous recording of their activity data. The resulting dataset, called Blackbox, was used as the basis for this study. BlueJ users are mostly novice programmers, predominantly male, with a median age of 16 years. Our data subset featured approximately 225,000 participants from around the world. Study Methods. We performed a secondary data analysis of the Blackbox dataset. We examined over 320,000 Java projects collected over the course of 8 years and used source code analysis to investigate the prevalence of specifically selected Java usage patterns. As this was an observational study without specific hypotheses, we did not use significance tests. Instead, we present the results themselves with commentary, having applied seasonal trend decomposition to the data. Findings. We found many long-term trends in the data over the course of the 8 years, most of which were monotonic. There was a notable reduction in the use of the main method (common in Java but unnecessary in BlueJ) and a general reduction in the complexity of the projects. We find that there is only a small number of frequently used types (int, String, double, and boolean), alongside a wide range of infrequently used types. Conclusions. We find that programming usage patterns change gradually over a long period of time (a period in which the Java language was not seeing major changes) once seasonal patterns are accounted for. Any changes are likely driven by instructors and the changing demographics of programming novices. Novices use a relatively restricted subset of Java, which implies that designers of languages specifically targeted at novices can satisfy their needs with a smaller set of language constructs and features. We provide detailed recommendations for the designers of educational programming languages and supporting development tools.
education, scientific disciplines