A Critical Analysis of PSP Data Quality: Results from a Case Study

Philip M. Johnson,Anne M. Disney
DOI: https://doi.org/10.1023/a:1009801218527
IF: 3.762
1999-01-01
Empirical Software Engineering
Abstract:The Personal Software Process (PSP) is used by software engineers to gather and analyze data about their work. Published studies typically use data collected using the PSP to draw quantitative conclusions about its impact upon programmer behavior and product quality. However, our experience using PSP led us to question the quality of data both during collection and its later analysis. We hypothesized that data quality problems can make a significant impact upon the value of PSP measures—significant enough to lead to incorrect conclusions regarding process improvement. To test this hypothesis, we built a tool to automate the PSP and then examined 89 projects completed by ten subjects using the PSP manually in an educational setting. We discovered 1539 primary errors and categorized them by type, subtype, severity, and age. To examine the collection problem we looked at the 90 errors that represented impossible combinations of data and at other less concrete anomalies in Time Recording Logs and Defect Recording Logs. To examine the analysis problem we developed a rule set, corrected the errors as far as possible, and compared the original and corrected data. We found significant differences for measures such as yield and the cost-performance ratio, confirming our hypothesis. Our results raise questions about the accuracy of manually collected and analyzed PSP data, indicate that integrated tool support may be required for high quality PSP data analysis, and suggest that external measures should be used when attempting to evaluate the impact of the PSP upon programmer behavior and product quality.
computer science, software engineering
What problem does this paper attempt to address?