Abstract: XMLSnippet is a tool that automatically recommends XML snippets by mining structural patterns from applications, helping programmers write proper XML configuration during the product phase. It does so by mining closed frequent tree patterns from existing projects, without any awareness of the syntax of specific frameworks. XMLSnippet is built on two major techniques borrowed from the database community: closed frequent XML tree pattern mining and sub-tree query. In this paper, we describe in detail the major techniques used in XMLSnippet offline mining.

1 The framework of XMLSnippet

In the design of XMLSnippet, we want to achieve the following objectives.

1. The tool should be syntax neutral, so that it supports frameworks with different syntax. It should be able to extract generalized patterns across all kinds of frameworks.
2. The tool should provide intelligent online coding suggestions, in the form of reusable XML snippets, while the programmer is editing XML configuration files.

The architecture of XMLSnippet is shown in Figure 1. It consists of three major phases: pre-processing, offline mining and online query. In phase 1, application repositories are first preprocessed, and the result (the XML Repository) is then fed to closed frequent sub-tree mining. Closed sub-tree patterns in the XML configuration files are of special interest to us, since they contain enough information about the editing rules of framework-based programming. All resulting patterns are stored in an XML tree pattern database, which is indexed by a prefix tree. Sample source code is indexed by an inverted index, which is used to speed up CQ.

The core of XMLSnippet is closed frequent tree mining and prefix tree/source code indexing. Frequent tree pattern mining distinguishes our tool from other code recommendation tools that use frequent item set patterns [3] or sequential patterns [4].
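To make the notion of a closed frequent pattern concrete, here is a minimal Python sketch. It is not the XMLSnippet mining algorithm: as a simplifying assumption it only considers chain-shaped patterns (root-to-node tag paths) rather than general sub-trees, and the function names and the `min_support` parameter are hypothetical. A pattern is frequent if it occurs in at least `min_support` documents, and closed if no larger pattern containing it has the same support.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def root_paths(xml_text):
    """Return the set of root-to-node tag paths found in one XML document."""
    paths = set()

    def walk(node, prefix):
        path = prefix + (node.tag,)
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), ())
    return paths


def closed_frequent_paths(docs, min_support):
    """Map each closed frequent path to its support (number of documents)."""
    support = Counter()
    for doc in docs:
        # A set per document, so each document counts at most once per path.
        support.update(root_paths(doc))

    frequent = {p: s for p, s in support.items() if s >= min_support}

    closed = {}
    for path, count in frequent.items():
        # Support can only shrink as a path grows, so checking one-step
        # extensions is enough to detect a super-pattern with equal support.
        has_equal_extension = any(
            len(other) == len(path) + 1
            and other[:len(path)] == path
            and other_count == count
            for other, other_count in frequent.items()
        )
        if not has_equal_extension:
            closed[path] = count
    return closed


# Hypothetical configuration files, loosely shaped like dependency-injection XML.
docs = [
    "<beans><bean><property/></bean></beans>",
    "<beans><bean><property/><constructor-arg/></bean></beans>",
    "<beans><bean/></beans>",
]
patterns = closed_frequent_paths(docs, min_support=2)
```

In this example the path `beans` is frequent but not closed, because `beans/bean` occurs in exactly the same documents; keeping only closed patterns removes such redundant entries without losing support information.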
The online query phase is responsible for providing effective online assistance to programmers.
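As an illustration of how a prefix tree over mined patterns could serve such online lookups, the following Python sketch stores each pattern as a sequence of tags and returns the stored patterns that extend what the programmer has typed so far. The `PatternTrie` class, the encoding of a pattern as a flat tag sequence, and the ranking by support are all assumptions made for this example; the source does not specify how XMLSnippet encodes trees in its prefix tree.

```python
class PatternTrie:
    """Prefix tree over tag sequences; a hypothetical pattern index."""

    def __init__(self):
        self.children = {}
        self.support = None  # set only on nodes that end a stored pattern

    def insert(self, tags, support):
        """Store one pattern (a sequence of tags) with its support count."""
        node = self
        for tag in tags:
            node = node.children.setdefault(tag, PatternTrie())
        node.support = support

    def complete(self, prefix):
        """Return (pattern, support) pairs for stored patterns that start
        with `prefix`, highest support first."""
        node = self
        for tag in prefix:
            node = node.children.get(tag)
            if node is None:
                return []  # nothing stored under this prefix

        found = []
        stack = [(node, tuple(prefix))]
        while stack:
            current, tags = stack.pop()
            if current.support is not None:
                found.append((tags, current.support))
            for tag, child in current.children.items():
                stack.append((child, tags + (tag,)))
        # Sort by descending support, then by pattern for a stable order.
        return sorted(found, key=lambda item: (-item[1], item[0]))


# Hypothetical patterns and supports, as offline mining might produce them.
index = PatternTrie()
index.insert(("beans", "bean"), 3)
index.insert(("beans", "bean", "property"), 2)
index.insert(("web-app", "servlet"), 4)

suggestions = index.complete(("beans",))
```

A lookup walks only the nodes under the typed prefix, so its cost depends on the number of matching patterns rather than on the size of the whole pattern database.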

Detailed Technique Report on XMLSnippet Yiqi
