Detailed Technique Report on XMLSnippet

Yiqi Lu, Sheng Huang, Yanghua Xiao
2011-01-01
Abstract: XMLSnippet is a tool that automatically recommends XML snippets by mining structural patterns from applications, helping programmers generate proper XML configuration during the product phase. It does so by mining closed frequent tree patterns from existing projects, without any awareness of the syntax of specific frameworks. XMLSnippet is built on two major techniques borrowed from the database community: closed frequent XML tree pattern mining and sub-tree query. In this paper, we describe in detail the major techniques used in XMLSnippet's offline mining.

1 The framework of XMLSnippet

In the design of XMLSnippet, we want to achieve the following objectives.

1. The tool should be syntax neutral so that it supports frameworks with different syntax. It should be able to extract generalized patterns across all kinds of frameworks.
2. The tool should be able to provide intelligent online coding suggestions with reusable XML snippets while the programmer is editing XML configuration files.

The architecture of XMLSnippet is shown in Figure 1. It consists of three major phases: pre-processing, offline mining and online query. In phase 1, application repositories are preprocessed, and the result (the XML Repository) is fed to closed frequent sub-tree mining. Closed sub-tree patterns in the XML configuration files are of special interest to us because they contain enough information about the editing rules of framework-based programming. All resulting patterns are stored in an XML tree pattern database, which is indexed by a prefix tree. Sample source code is indexed by an inverted index, which is used to speed up CQ. The cores of XMLSnippet are closed frequent tree mining and prefix tree/source code indexing. Frequent tree pattern mining distinguishes our tool from other code recommendation tools that utilize frequent item set patterns [3] or sequential patterns [4].
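To make the notion of a closed pattern concrete, the following is a minimal sketch and not the mining algorithm used by XMLSnippet. It assumes a simplified representation in which a tree pattern is the set of its root-to-node label paths, so that sub-tree containment reduces to set inclusion. The function name `closed_patterns`, the element names and the support counts are all hypothetical. A frequent pattern is kept only if no proper super-pattern has the same support.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumption: a tree pattern is modelled as the set of its root-to-node
# label paths. This is a simplification chosen for illustration only.
Path = Tuple[str, ...]
Pattern = FrozenSet[Path]


def closed_patterns(support: Dict[Pattern, int]) -> List[Pattern]:
    """Keep patterns that have no proper super-pattern with equal support."""
    closed = []
    for pattern, count in support.items():
        # `pattern < other` is the proper-subset test on frozensets.
        absorbed = any(
            pattern < other and count == other_count
            for other, other_count in support.items()
        )
        if not absorbed:
            closed.append(pattern)
    return closed


# Hypothetical patterns and support counts, for illustration only.
bean = frozenset({("beans",), ("beans", "bean")})
bean_prop = bean | {("beans", "bean", "property")}
bean_ref = bean_prop | {("beans", "bean", "property", "ref")}

support = {bean: 5, bean_prop: 5, bean_ref: 2}
result = closed_patterns(support)
```

Here `bean` is dropped because `bean_prop` contains it and occurs equally often, so the larger pattern carries the same information. `bean_prop` and `bean_ref` are both kept.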
The online query phase is responsible for providing effective online assistance to programmers.
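As an illustration of how a prefix tree over stored patterns could serve online suggestions, here is a small sketch under stated assumptions. It assumes each pattern is serialized as a sequence of element labels (for example, in pre-order) and that the editor supplies the labels typed so far. The class `PatternIndex`, its methods and the sample element names are hypothetical and are not taken from the XMLSnippet implementation.

```python
from typing import Dict, List, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.pattern_ids: List[int] = []  # patterns whose sequence ends here


class PatternIndex:
    """Prefix tree over label sequences of stored tree patterns."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def add(self, pattern_id: int, labels: Tuple[str, ...]) -> None:
        node = self.root
        for label in labels:
            node = node.children.setdefault(label, TrieNode())
        node.pattern_ids.append(pattern_id)

    def complete(self, prefix: Tuple[str, ...]) -> List[int]:
        """Return ids of all patterns whose label sequence starts with prefix."""
        node = self.root
        for label in prefix:
            if label not in node.children:
                return []
            node = node.children[label]
        # Collect every pattern id stored at or below the matched node.
        found: List[int] = []
        stack = [node]
        while stack:
            current = stack.pop()
            found.extend(current.pattern_ids)
            stack.extend(current.children.values())
        return sorted(found)


# Hypothetical patterns, for illustration only.
index = PatternIndex()
index.add(0, ("beans", "bean", "property"))
index.add(1, ("beans", "bean", "property", "ref"))
index.add(2, ("web-app", "servlet", "servlet-name"))

suggestions = index.complete(("beans", "bean"))
```

With this index, a lookup costs time proportional to the prefix length plus the size of the matching sub-trie, independent of how many unrelated patterns are stored.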