High-Frequency Keywords to Predict Defects for Android Applications.

Yaqing Fan, Xinya Cao, Jing Xu, Sihan Xu, Hongji Yang
DOI: https://doi.org/10.1109/compsac.2018.10273
2018-01-01
Abstract: Android defect prediction has proved useful for reducing the manual testing effort needed to find bugs. In recent years, researchers have designed defect-related metrics and analyzed historical information to predict, using machine learning, whether files contain defects. However, those models learn to predict defects from the external characteristics of programs while ignoring internal information, e.g., the functional and semantic information within the source code. This paper proposes a model, HIRER, that learns this functional and semantic information to automatically predict whether files in Android applications contain defects. Specifically, HIRER learns the internal information within the source code from high-frequency keywords extracted from the programs' Abstract Syntax Trees (ASTs). It derives rule-based programming patterns from the high-frequency keywords and uses a Deep Belief Network (DBN), a deep neural network, to learn functional and semantic features from those programming patterns. We implement a defect testing system with five machine learning techniques based on HIRER to predict defective files in source code automatically, and apply it to four open-source Android applications. The results show that the learned functional and semantic features can predict more defects than traditional metrics. Across different versions of the MMS, Gallery2, Bluetooth, and Calendar open-source applications, HIRER improves the AUC of the predicted results on average.
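The keyword-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how keywords are ranked, how the rule-based patterns are built, or how the DBN is configured, so none of that is shown. The sketch assumes each file has already been reduced to a sequence of AST node-type names, and that a token counts as "high-frequency" if it occurs at least `min_count` times across the project. The function names, the threshold, and the sample data are all hypothetical.

```python
from collections import Counter


def high_frequency_keywords(files, min_count=2):
    """Return the tokens occurring at least `min_count` times across all files.

    `files` maps a file name to its token sequence (assumed here to be AST
    node-type names). The threshold rule is an assumption for illustration.
    """
    counts = Counter(tok for toks in files.values() for tok in toks)
    return sorted(tok for tok, n in counts.items() if n >= min_count)


def encode(tokens, vocabulary):
    """Map one file's token sequence to a count vector over the vocabulary."""
    counts = Counter(tokens)
    return [counts[tok] for tok in vocabulary]


# Hypothetical token sequences; real input would come from an AST parser.
files = {
    "A.java": ["MethodDeclaration", "IfStatement", "MethodInvocation", "IfStatement"],
    "B.java": ["MethodDeclaration", "ForStatement", "MethodInvocation"],
    "C.java": ["ClassDeclaration", "TryStatement"],
}

vocab = high_frequency_keywords(files, min_count=2)
vectors = {name: encode(toks, vocab) for name, toks in files.items()}
```

Vectors of this kind could then be passed to a feature learner and a classifier; which learner and classifier to use, and how to evaluate them, is left open here.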