A comparative study on classifying the functions of web page blocks.

Xiangye Xiao, Qiong Luo, Xing Xie, Wei-Ying Ma
DOI: https://doi.org/10.1145/1183614.1183725
2006-01-01
Abstract: In this paper, we study the problem of learning block classification models to estimate block functions. We distinguish between general models, which are learned across multiple sites, and site-specific models, which are learned within individual sites. We further consider several factors that affect the learning process and model effectiveness: the layout features, the content features, the classifiers, and the term selection methods. We have empirically evaluated the performance of the models as these factors are varied. Our main result is that layout features outperform content features for learning both general and site-specific models.