Feature Envy Detection Based on Bi-LSTM with Self-Attention Mechanism

Hongze Wang, Jing Liu, JieXiang Kang, Wei Yin, Haiying Sun, Hui Wang
DOI: https://doi.org/10.1109/ispa-bdcloud-socialcom-sustaincom51426.2020.00082
2020-01-01
Abstract: A code smell is a suboptimal or harmful structure in source code that may impede software maintainability. Detecting code smells is an effective way to identify refactoring opportunities. As the most prevalent smell, Feature Envy has been studied for many years, and many automated detection methods have been proposed. Nevertheless, heuristic-based approaches have not reached a satisfactory level of performance, and machine learning approaches still need further optimization. Recent advances in deep learning have motivated deep-learning-based approaches. In this paper, we define a simpler distance metric as a numerical feature and collect class names and method names as text features. We then use a Bidirectional Long Short-Term Memory (Bi-LSTM) network with a self-attention mechanism to extract semantic distance information from the text part, and we adopt embedding technology to enhance the structural distance information in the numerical part. Combining these two modules with a final classification module yields a more reliable and accurate model. Experimental results on seven open-source Java projects show that our model significantly outperforms existing methods.
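To make the self-attention step concrete, the sketch below shows one common way to pool a sequence of per-token hidden states (such as Bi-LSTM outputs for the tokens of a method or class name) into a single vector. It is an illustration only, not the authors' formulation: the dot-product scoring, the `query` vector standing in for learned parameters, and the function names `softmax` and `attention_pool` are all assumptions made here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Pool T hidden vectors into one vector using attention weights.

    hidden_states: list of T equal-length vectors, e.g. one Bi-LSTM
                   output per name token (assumed input shape).
    query:         hypothetical learned vector of the same length; the
                   abstract does not specify the scoring function.
    Returns (pooled_vector, attention_weights).
    """
    # Score each time step by its dot product with the query vector.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    # Normalize the scores so the weights are positive and sum to 1.
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # The pooled vector is the attention-weighted sum of the hidden states.
    pooled = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return pooled, weights


# Toy usage: three 2-dimensional "hidden states" and a made-up query.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = attention_pool(states, query=[0.5, -0.5])
```

In a trained model the query (or a small scoring network) would be learned jointly with the Bi-LSTM, so that tokens carrying more of the semantic distance signal receive larger weights.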