Automatic Model Generation from Documentation for Java API Functions

Juan Zhai, Jianjun Huang, Shiqing Ma, Xiangyu Zhang, Lin Tan, Jianhua Zhao, Feng Qin
DOI: https://doi.org/10.1145/2884781.2884881
2016-01-01
Abstract: Modern software systems are becoming increasingly complex, relying on a lot of third-party library support. Library behaviors are hence an integral part of software behaviors. Analyzing them is as important as analyzing the software itself. However, analyzing libraries is highly challenging due to the lack of source code, implementation in different languages, and complex optimizations. We observe that many Java library functions provide excellent documentation, which concisely describes the functionalities of the functions. We develop a novel technique that can construct models for Java API functions by analyzing the documentation. These models are simpler implementations in Java compared to the original ones and hence easier to analyze. More importantly, they provide the same functionalities as the original functions. Our technique successfully models 326 functions from 14 widely used Java classes. We also use these models in static taint analysis on Android apps and dynamic slicing for Java programs, demonstrating the effectiveness and efficiency of our models.
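To make the idea of a "model" concrete, the following is a minimal hand-written sketch of what a simplified Java stand-in for a documented API function might look like. It is not taken from the paper and is not output of the authors' technique. The class name `ArrayListModel` and its layout are assumptions made for illustration. The behavior follows the Javadoc of `java.util.ArrayList.add(int, E)`: insert the element at the given position, shift subsequent elements to the right, and throw `IndexOutOfBoundsException` when `index < 0 || index > size()`.

```java
/**
 * Hypothetical model of part of java.util.ArrayList, written by hand for
 * illustration only. It trades the real class's capacity management and
 * optimizations for a plain array that is simple to analyze.
 */
public class ArrayListModel {
    // Backing store; its length always equals the number of elements.
    private Object[] elems = new Object[0];

    public int size() {
        return elems.length;
    }

    public Object get(int index) {
        if (index < 0 || index >= elems.length) {
            throw new IndexOutOfBoundsException();
        }
        return elems[index];
    }

    // Documented behavior: insert at 'index' and shift later elements right.
    public void add(int index, Object element) {
        if (index < 0 || index > elems.length) {
            throw new IndexOutOfBoundsException();
        }
        Object[] next = new Object[elems.length + 1];
        for (int i = 0; i < index; i++) {
            next[i] = elems[i];          // elements before the insertion point
        }
        next[index] = element;           // the inserted element
        for (int i = index; i < elems.length; i++) {
            next[i + 1] = elems[i];      // shifted elements
        }
        elems = next;
    }

    public static void main(String[] args) {
        ArrayListModel m = new ArrayListModel();
        m.add(0, "b");
        m.add(0, "a");
        m.add(2, "c");
        System.out.println(m.get(0) + " " + m.get(1) + " " + m.get(2));
    }
}
```

A model of this shape keeps the data flow explicit (each stored element is copied through ordinary array assignments), which is the kind of property that analyses such as taint tracking or slicing can follow without access to the original library implementation.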