Application of missing data approaches in software testing research

Qin Liu, Wen Qian, Atanasov Atanas
DOI: https://doi.org/10.1109/ICECC.2011.6066775
2011-01-01
Abstract: This research originated in a school-enterprise cooperation program intended to improve the quality of the software testing process in the SAP IDDC team. During data collection, one of six measurements, the number of test case executions, was found to have 77% of its data missing in a monotone pattern because of a newly adopted test case tracking tool. Three common imputation approaches, Multiple Imputation (MI), Expectation Maximization (EM), and Regression Imputation, were therefore applied to 12 selected imputation models to generate a complete data set for further analysis. A comparative analysis was conducted to examine the effects of imputation. The results show that MI has certain advantages over EM and Regression Imputation. The imputation model that contains all measurements that are related to the imputed variable and converted based on domain knowledge is superior to the other models.
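As a rough illustration of the simplest of the three approaches, regression imputation, the sketch below fits a line on the rows where the target is observed and uses it to fill the missing rows. It is not the paper's procedure or any of its 12 models: the data are synthetic, and the function name `regression_impute` and the single-predictor setup are assumptions made for the example.

```python
def regression_impute(x, y):
    """Fill missing entries of y (marked None) from a simple linear fit on x.

    Hypothetical helper for illustration: one predictor, ordinary least squares.
    """
    # Use only the rows where the target measurement was observed.
    pairs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    n = len(pairs)
    mean_x = sum(xi for xi, _ in pairs) / n
    mean_y = sum(yi for _, yi in pairs) / n
    sxx = sum((xi - mean_x) ** 2 for xi, _ in pairs)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Keep observed values; replace each missing value with its prediction.
    return [yi if yi is not None else intercept + slope * xi
            for xi, yi in zip(x, y)]


# Synthetic data (not from the paper): 100 records, 77% of the target missing.
x = [float(i) for i in range(100)]
y_true = [2.0 * xi + 1.0 for xi in x]
y = y_true[:23] + [None] * 77  # only the first 23 values are observed

y_filled = regression_impute(x, y)
```

Because every missing value is replaced by a single predicted value, this kind of imputation tends to understate the variability of the completed data. Multiple Imputation addresses this by generating several completed data sets with random draws and combining the analyses.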