Source Code Preparing Data Transaction Database Mining Inferring Rules Detecting Violations Function Entropy Bugs Negative Rules Frequent and Infrequent Itemsets

Pan Bian,Bin Liang,Wenchang Shi,Jianjun Huang
2018-01-01
Abstract:Inferring programming rules from source code based on datamining techniques has been proven to be effective to detect software bugs. Existing studies focus on discovering positive rules in the form of A ⇒ B, indicating that when operation A appears, operation B should also be here. Unfortunately, the negative rules (A ⇒ ¬B), indicating the mutual suppression or conflict relationships among program elements, have not gotten the attention they deserve. In fact, violating such negative rules can also result in serious bugs. In this paper, we propose a novel method called NAR-Miner to automatically extract negative association programming rules from large-scale systems, and detect their violations to find bugs. However, mining negative rules faces a more serious rule explosion problem than mining positive ones. Most of the obtained negative rules are uninteresting and can lead to unacceptable false alarms. To address the issue, we design a semantics-constrained mining algorithm to focus rule mining on the elements with strong semantic relationships. Furthermore, we introduce information entropy to rank candidate negative rules and highlight the interesting ones. Consequently, we effectively mitigate the rule explosion problem. We implement NAR-Miner and apply it to a Linux kernel (v4.12-rc6). The experiments show that the uninteresting rules are dramatically reduced and 17 detected violations have been confirmed as real bugs and patched by kernel community. We also apply NAR-Miner to PostgreSQL, OpenSSL and FFmpeg and discover six real bugs.
What problem does this paper attempt to address?