Data Constraint Mining for Automatic Reconciliation Scripts Generation.

Tianxiao Wang,Chen Zhi,Xiaoqun Zhou,Jinjie Wu,Jianwei Yin,Shuiguang Deng
DOI: https://doi.org/10.1145/3597926.3598122
2023-01-01
Abstract:Fund loss is an increasingly critical problem caused by the misbehavior of software, especially in fintech and e-commerce platforms. Data reconciliation is one of the most commonly used approaches in detecting and preventing fund loss by executing reconciliation scripts on data storage systems (e.g., database and cache systems). The core of reconciliation scripts is the data constraints, which can be expressed as implications with two parts: preconditions and assertions. However, due to the complexity and diversity of business, the construction of data constraints and reconciliation scripts usually heavily relies on business experts. To this end, we propose AutoReconciler to mine data constraints from business data and generate reconciliation scripts automatically. It can mine assertions via enhanced symbolic regression, discover preconditions via association rule mining, and generate reconciliation scripts in SQL form. We have performed extensive experiments on the synthesized data. The result shows that our approach outperforms the baseline by a large margin (an average improvement in precision and recall of 22.1% and 51.6%, respectively), especially for complex data constraints. Our solution has been implemented, deployed, and adopted in production, and we conducted several case studies further to confirm the benefits of our solution in industrial scenarios.
What problem does this paper attempt to address?