Research of Vertical Search Engine in News Industry

M. Li, X. J. Gu, Z. X. Yang
DOI: https://doi.org/10.1109/ismot.2012.6679470
2012-01-01
Abstract: Building on a review of existing web crawlers and the theory of full-text retrieval, this paper optimizes a web crawler algorithm so that it meets the needs of vertical search engines, and then combines the Pango word segmentation component with Lucene.Net to build an efficient full-text search function. The paper's innovation is to analyze the characteristics of news sites and integrate those characteristics into a traditional vertical search engine. Given the requirements that news sites place on information, the authors study the relevant full-text search frameworks and build a vertical search engine with multi-threaded data collection and retrieval, aiming for better performance and user experience. The complete system was deployed and tested on small and medium-sized news sites. The tests show that the crawler meets the news industry's requirements for efficient and timely collection of new content, and that the full-text retrieval system built by integrating Lucene.Net with Pango segmentation meets the demands for accurate query results and fast response times, thereby increasing the amount of information available and improving the user experience.
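The multi-threaded, site-restricted collection described in the abstract might look roughly like the following. This is a minimal sketch, not the authors' algorithm: the names `crawl`, `LinkParser`, `fetch`, `max_pages`, and `workers` are hypothetical, the crawl is a plain breadth-first traversal limited to the seed's host, and the page fetcher is passed in as a callable so the sketch makes no network assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=100, workers=4):
    """Breadth-first crawl that stays on the seed's host (the "vertical" part).

    `fetch` is a hypothetical callable mapping a URL to its HTML text,
    or to None when the page cannot be retrieved.
    Returns a dict of url -> html for every page collected.
    """
    host = urlparse(seed).netloc
    seen = {seed}        # URLs already queued, to avoid duplicates
    frontier = [seed]    # URLs waiting to be fetched
    pages = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier and len(pages) < max_pages:
            room = max_pages - len(pages)
            batch, frontier = frontier[:room], frontier[room:]
            # Fetch the whole batch concurrently; results keep batch order.
            for url, html in zip(batch, pool.map(fetch, batch)):
                if html is None:
                    continue
                pages[url] = html
                parser = LinkParser()
                parser.feed(html)
                for href in parser.links:
                    # Resolve relative links and drop any #fragment.
                    link = urljoin(url, href).split("#")[0]
                    if urlparse(link).netloc == host and link not in seen:
                        seen.add(link)
                        frontier.append(link)
    return pages
```

In practice `fetch` would wrap an HTTP client with timeouts and politeness delays, and the collected pages would then be segmented and indexed by the full-text component; both steps are outside this sketch.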