Extraction and Integration of Intensive Web Information Based on XML

Yonggang Yan
2008-01-01
Abstract: To address the problem of extracting data from data-intensive web pages, a general tree-structured extraction rule suited to XML is proposed. The rule specifies the pattern by which data extracted from data-intensive web pages is integrated into XML documents. Using an example-based learning method for semi-structured web information extraction, a prototype XML-based web query system has been developed that extracts web pages with good results. It can be applied directly to information extraction from specific websites, and can also serve as the data preparation stage in other related applications.
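The abstract does not give the format of the extraction rule, so the following is only a minimal sketch of what a tree-structured rule applied to a page might look like. It assumes the page is already well-formed XML (for example, XHTML). The rule layout (`RULE`), the sample page (`PAGE`), and the function name `extract` are all hypothetical illustrations, not the paper's actual design, and the example-based learning step is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule: one path locating each record subtree, plus
# per-field paths relative to that subtree. Not the paper's format.
RULE = {
    "record": ".//div[@class='item']",
    "fields": {"title": "h2", "price": "span[@class='price']"},
}

# Hypothetical well-formed page with two data records and one
# unrelated block that the rule should skip.
PAGE = """<html><body>
<div class="item"><h2>Widget</h2><span class="price">9.50</span></div>
<div class="item"><h2>Gadget</h2><span class="price">12.00</span></div>
<div class="ad"><h2>Sponsored</h2></div>
</body></html>"""


def extract(page_xml, rule):
    """Apply a tree-structured rule and return the records as an XML element."""
    root = ET.fromstring(page_xml)
    out = ET.Element("records")
    for node in root.findall(rule["record"]):
        rec = ET.SubElement(out, "record")
        for name, path in rule["fields"].items():
            hit = node.find(path)
            # A missing field becomes an empty element rather than an error.
            text = (hit.text or "").strip() if hit is not None else ""
            ET.SubElement(rec, name).text = text
    return out


result = extract(PAGE, RULE)
print(ET.tostring(result, encoding="unicode"))
```

Keeping the rule as data, separate from the code that applies it, is one way such a rule could be produced by a learning step and reused across pages of the same site.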