Ontology based on focused crawler

ZHENG Jian-zhen, LIN Kun-hui, ZHOU Chang-le, KANG Kai
DOI: https://doi.org/10.3969/j.issn.1671-9352.2006.03.022
2006-01-01
Abstract: A focused crawler can fetch large quantities of domain resources from the Web in a short time, which is very helpful for both focused search engines and data mining companies. To overcome the deficiencies of the topic filtering strategies widely used today, the paper proposes a concept-based topic filtering strategy derived from the idea of concept congregation. The paper also proposes an authority-modified weight calculation formula that reflects the differing importance of Web page information. Together, these make real-time, concept-based Web page filtering achievable. To further improve the focused crawler's efficiency, the paper also proposes a link forecast algorithm. Finally, a comparative experiment shows that the proposed strategies are practical.
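
The abstract does not give the weight formula itself, so the following is only a minimal Python sketch of what concept-based filtering with importance-weighted page information could look like. The concept table, the region weights, the threshold, and the function names (`concept_score`, `is_on_topic`) are all hypothetical assumptions for illustration, not the authors' method or values.

```python
import re

# Hypothetical concept table: each concept groups several surface terms.
# (Assumption for illustration; the paper's ontology is not given here.)
CONCEPTS = {
    "crawler": {"crawler", "spider", "robot"},
    "search": {"search", "retrieval", "query"},
}

# Hypothetical importance weights per page region (assumed values).
REGION_WEIGHTS = {"title": 3.0, "anchor": 2.0, "body": 1.0}


def tokenize(text):
    """Split text into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def concept_score(page, concepts=CONCEPTS, weights=REGION_WEIGHTS):
    """Return the weighted share of tokens that map to any known concept.

    `page` maps a region name (e.g. "title") to its text. Tokens in more
    important regions count more, both as hits and in the normaliser.
    """
    known_terms = {term for terms in concepts.values() for term in terms}
    hits = 0.0
    total = 0.0
    for region, text in page.items():
        w = weights.get(region, 1.0)  # unknown regions get a neutral weight
        for token in tokenize(text):
            total += w
            if token in known_terms:
                hits += w
    return hits / total if total else 0.0


def is_on_topic(page, threshold=0.2):
    """Keep a page when its concept score reaches the assumed threshold."""
    return concept_score(page) >= threshold
```

For instance, `is_on_topic({"title": "Web spider design", "body": "A robot fetches pages"})` would keep the page under these assumed weights, because matches on "spider" and "robot" are counted as hits for the same concept rather than as unrelated keywords.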