Quality Assurance of Database Centric Applications
Tao Xie, Kunal Taneja
2013-01-01
Abstract: Software faults cost the U.S. economy over 59 billion dollars. This cost can be reduced by making software testing more effective at finding faults so that they can be fixed quickly. Software developers test an application by writing tests with inputs that achieve high structural code coverage. However, manual testing is labor-intensive and time-consuming. To reduce the effort of manual testing, tools for automated test generation exist. These tools can generate tests that achieve high code coverage. The generated tests can be used to find faults in the application under test. They can also be used as regression tests: when changes are made to the application under test, the tests can be executed on the modified application to find regression faults.

Database Centric Applications (DCAs) are becoming increasingly popular in enterprise computing. A DCA consists of a front-end application (FA) that interacts with a back-end database (DB). Testing a DCA requires not only testing the FA with inputs that achieve high code coverage but also preparing the necessary states in its DB to cover the branches in the FA that depend on the DB (FA branches, for short). When existing test generation tools are applied to DCAs, they can generate tests for the FA but cannot generate inputs for the DB. As a result, FA branches that depend on the DB cannot be covered. In addition, the tools are neither efficient nor effective for regression testing of the FA of a DCA.

This dissertation addresses several problems in test generation for DCAs. First, testing of DCAs is often outsourced to testing centers and conducted by test engineers there. When proprietary DCAs are released to these centers, their DBs should also be made available to the test engineers. However, data privacy laws prevent organizations from sharing the records in the DB with testing centers because the DB can contain sensitive information.
As a result, FA branches that depend on the DB cannot be covered, leaving more regression faults undetected. Second, even if the DB can be released to the test engineers, it may contain insufficient (or no) records for effective regression testing of the DCA. For effective regression testing of a DCA, a test generation tool therefore needs to generate inputs not only for the FA but also for the DB, so that FA branches that depend on the DB are covered. In other words, a test generation tool needs to bridge the gap between the FA and the DB. Third, existing test generation tools generate tests to achieve high code coverage of the FA, not specifically to find behavioral differences between two versions of the FA. As a result, these tools are ineffective and inefficient for regression test generation.

In this dissertation, we propose a framework that addresses the preceding problems in test generation for DCAs. First, we present an approach, called PRIEST, for anonymizing the records in the DB of a DCA so that the anonymized DB can be released to the test engineers conducting the regression testing without leaking sensitive information in the DB. With PRIEST, organizations can balance the level of privacy against the needs of regression testing. Second, we present an approach, called MODA, that facilitates the generation of inputs for the DB of a DCA to cover FA branches that depend on the DB. To generate tests efficiently, MODA can use the existing records in the DB or the anonymized DB generated by PRIEST. Third, we present approaches, called DiffGen and eXpress, for effective and efficient regression test generation for the FA of a DCA.
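
As an illustration of the problem described above (and not of PRIEST, MODA, DiffGen, or eXpress themselves), the following minimal Python sketch shows an FA function whose branches depend on the DB state. The `accounts` table, the `classify_account` function, and the sample records are hypothetical and chosen only for this example. With an empty DB, no choice of FA input can reach two of the three branches; once suitable records exist in the DB, the same inputs cover all of them.

```python
import sqlite3


def make_db(rows):
    """Create an in-memory DB with a hypothetical `accounts` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)"
    )
    conn.executemany(
        "INSERT INTO accounts (id, balance) VALUES (?, ?)", rows
    )
    conn.commit()
    return conn


def classify_account(conn, account_id):
    """Hypothetical FA logic: which branch runs depends on the DB state."""
    row = conn.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    if row is None:
        return "unknown"    # reachable even with an empty DB
    if row[0] < 0:
        return "overdrawn"  # needs a record with a negative balance
    return "ok"             # needs a record with a non-negative balance


def covered_outcomes(conn, account_ids):
    """Return the set of branch outcomes reached for the given FA inputs."""
    return {classify_account(conn, i) for i in account_ids}


if __name__ == "__main__":
    fa_inputs = [1, 2, 3]
    empty_db = make_db([])
    populated_db = make_db([(1, 250), (2, -40)])
    print(sorted(covered_outcomes(empty_db, fa_inputs)))
    # ['unknown']
    print(sorted(covered_outcomes(populated_db, fa_inputs)))
    # ['ok', 'overdrawn', 'unknown']
```

The FA inputs are identical in both runs; only the DB state differs. This is the gap between the FA and the DB that a test generation tool for DCAs has to bridge by generating DB records as well as FA inputs.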