Locating Need-to-Externalize Constant Strings for Software Internationalization with Generalized String-Taint Analysis

Wang, Xiaoyin,Zhang, Lu,Xie, Tao,Mei, Hong
DOI: https://doi.org/10.1109/TSE.2012.40
IF: 7.4
2013-01-01
IEEE Transactions on Software Engineering
Abstract:Nowadays, a software product usually faces a global market. To meet the requirements of different local users, the software product must be internationalized. In an internationalized software product, user-visible hard-coded constant strings are externalized to resource files so that local versions can be generated by translating the resource files. In many cases, a software product is not internationalized at the beginning of the software development process. To internationalize an existing product, the developers must locate the user-visible constant strings that should be externalized. This locating process is tedious and error-prone due to 1) the large number of both user-visible and non-user-visible constant strings and 2) the complex data flows from constant strings to the Graphical User Interface (GUI). In this paper, we propose an automatic approach to locating need-to-externalize constant strings in the source code of a software product. Given a list of precollected API methods that output values of their string argument variables to the GUI and the source code of the software product under analysis, our approach traces from the invocation sites (within the source code) of these methods back to the need-to-externalize constant strings using generalized string-taint analysis. In our empirical evaluation, we used our approach to locate need-to-externalize constant strings in the uninternationalized versions of seven real-world open source software products. The results of our evaluation demonstrate that our approach is able to effectively locate need-to-externalize constant strings in uninternationalized software products. Furthermore, to help developers understand why a constant string requires translation and properly translate the need-to-externalize strings, we provide visual representation of the string dependencies related to the need-to-externalize strings.
What problem does this paper attempt to address?