Large‐scale characterization of Java streams
Eduardo Rosales, Matteo Basso, Andrea Rosà, Walter Binder
DOI: https://doi.org/10.1002/spe.3213
2023-06-06
Software - Practice and Experience
Abstract: Java streams are receiving the attention of developers targeting the Java virtual machine (JVM) as they ease the development of data‐processing logic, while also favoring code extensibility and maintainability through a concise and declarative style based on functional programming. Recent studies aim to shed light on how Java developers use streams. However, they consider only small sets of applications and mainly apply manual code inspection and static analysis techniques. As a result, the large‐scale dynamic analysis of stream processing remains an open research question. In this article, we present the first large‐scale empirical study on the use of streams in Java code exercised via unit tests. We present stream‐analyzer, a novel dynamic program analysis (DPA) that collects runtime information and key metrics, which enable a fine‐grained characterization of sequential and parallel stream processing. We use a fully automatic approach to apply our DPA at scale to open‐source software projects hosted on GitHub. Our findings advance the understanding of the use of Java streams. Both the scale of our analysis and the profiling of dynamic information enable us to confirm with greater confidence the outcomes reported at a smaller scale by related work. Moreover, our study reports the popularity of many features of the Stream API and highlights multiple findings about runtime characteristics unique to streams, while also revealing inefficient stream processing and stream misuses. Finally, we present implications of our findings for developers of the Stream API, tool builders and researchers, and educators.
computer science, software engineering