Towards Next Generation Web Information Retrieval

Wei-Ying Ma, Hongjiang Zhang, Hsiao-Wuen Hon
DOI: https://doi.org/10.1007/978-3-540-30480-7_3
2004-01-01
Abstract: Today, search engines have become one of the most critical applications on the Web, driving many important online businesses that connect people to information. As the Web continues to grow in size with a variety of new data and penetrates every aspect of people’s lives, the need for a more intelligent search engine is increasing. In this talk, we will briefly review the current status of search engines and then present some of our recent work on building next-generation web search technologies. Specifically, we will talk about how to extract data records from web pages using a vision-based approach, and introduce new research opportunities in exploring the complementary properties of the surface Web and the deep Web to mutually facilitate web information extraction and deep web crawling. We will also present a search prototype that data-mines deep web structure to enable one-stop search of multiple online web databases.