Supporting maintenance and testing for AI functions of mobile apps based on user reviews: An empirical study on plant identification apps

Chuanqi Tao, Hongjing Guo, Jingxuan Zhang, Zhiqiu Huang
DOI: https://doi.org/10.1002/smr.2444
2022-02-27
Abstract: Despite the tremendous development of artificial intelligence (AI)‐based mobile apps, they suffer from quality issues. Data‐driven AI software poses challenges for maintenance and quality assurance. Metamorphic testing has been successfully applied to AI software. However, most previous studies require testers to identify metamorphic relations manually, in an ad hoc and arbitrary manner, which makes it difficult to reflect real‐world usage scenarios. Previous work showed that the information available in user reviews is effective for maintenance and testing tasks, yet few studies leverage reviews to facilitate maintenance and testing of AI functions. This paper proposes METUR, a novel approach to supporting maintenance and testing for AI functions based on reviews. First, METUR automatically classifies reviews that can be exploited to support AI function maintenance and evolution activities. Then, it identifies test contexts from reviews in the usage scenario category. METUR instantiates the metamorphic relation pattern with these test contexts to derive concrete metamorphic relations, and a follow‐up test dataset is constructed for conducting metamorphic testing. Empirical studies on plant identification apps indicate that METUR effectively categorizes reviews that are related to AI functions, and that it is feasible and effective in detecting inconsistent behaviors by using the metamorphic relations constructed from reviews.
computer science, software engineering
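
The metamorphic testing step described in the abstract can be illustrated with a minimal sketch. It is not METUR's implementation: the test context ("low-light photo"), the relation ("the predicted label should not change when the photo is moderately darkened"), and every name below (`darken`, `toy_classifier`, `find_violations`) are hypothetical and chosen only for illustration. The classifier is a deliberately brittle stand-in for an app's AI function so that a violation shows up.

```python
from typing import Callable, List

Image = List[List[int]]  # grayscale pixels in the range 0..255


def darken(image: Image, factor: float) -> Image:
    """Build a follow-up input by scaling pixel values (assumed low-light context)."""
    return [[max(0, min(255, round(p * factor))) for p in row] for row in image]


def toy_classifier(image: Image) -> str:
    """Hypothetical stand-in for an AI function: thresholds the mean brightness."""
    pixels = [p for row in image for p in row]
    return "sunflower" if sum(pixels) / len(pixels) > 100 else "fern"


def find_violations(
    classify: Callable[[Image], str],
    transform: Callable[[Image], Image],
    sources: List[Image],
) -> List[int]:
    """Return indices of source images whose label changes after the transform."""
    violations = []
    for idx, source in enumerate(sources):
        follow_up = transform(source)
        # Assumed relation: the label must be identical for source and follow-up.
        if classify(source) != classify(follow_up):
            violations.append(idx)
    return violations


sources = [
    [[200, 220], [210, 230]],  # bright source image
    [[40, 50], [60, 30]],      # already dark source image
]
violations = find_violations(toy_classifier, lambda im: darken(im, 0.4), sources)
print(violations)
```

In this sketch no ground-truth label is needed: an inconsistency between the source and follow-up predictions is itself the reported failure, which is why metamorphic testing suits AI functions that lack a test oracle.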