Inferring the Data Access from the Clients of Generic APIs

Yanchun Sun, Gang Huang, Hui Song, Yingfei Xiong
DOI: https://doi.org/10.1109/icsm.2012.6405297
2012-01-01
Abstract: Many programs access external data sources through generic APIs. The class hierarchy of such a generic API does not reflect the schema of any particular data source, so it is hard to tell what data an API client accesses and how it obtains them. This makes the API clients difficult to maintain. In this paper, we show that the data access of an API client can be recovered through static analysis of the client's source code. We provide a formal and intuitive way to represent the data access, as a graph of so-called summoning snippets. Each snippet stands for a type of data accessed by the client, and carries the code slice from the client that shows how to obtain the data via the API. We provide an automated approach to inferring a complete and well-simplified set of summoning snippets from the client source code, based on points-to analysis and code slicing. We implement this approach as a development assistant tool, and evaluate it on eight open source data processing programs, with average precision and recall of 89% and 95%, respectively. Further inspection of these clients, together with a user study on writing data access code for their data sources, shows that the inference results are useful both for inspecting existing clients and for developing new data access logic.
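
To make the problem statement concrete, the following is a minimal sketch that is not taken from the paper. It assumes Python's standard `xml.etree.ElementTree` as a stand-in for a generic API; the catalog schema, the `CATALOG` string, and the `recent_titles` function are hypothetical names chosen for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical data source. The schema (library > book > title, plus a
# "year" attribute) is an assumption made for this illustration.
CATALOG = """
<library>
  <book year="2008"><title>Static Analysis</title></book>
  <book year="2012"><title>Program Slicing</title></book>
</library>
"""


def recent_titles(xml_text, since):
    """Client of a generic API.

    Every intermediate value is an Element or a str, so the types say
    nothing about the schema: the fact that this client reads books,
    their years, and their titles is only visible in the string
    arguments and the order of the API calls.
    """
    root = ET.fromstring(xml_text)
    titles = []
    for book in root.findall("book"):           # data type 1: book elements
        if int(book.get("year")) >= since:      # data type 2: a book's year
            titles.append(book.find("title").text)  # data type 3: its title
    return titles


print(recent_titles(CATALOG, 2010))
```

Informally, the data access of this client consists of three kinds of data (the books under the root, each book's year, and each book's title), each reachable by a short chain of API calls. This is only a plain-language reading of the code, not the paper's summoning-snippet notation.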