A distributed search engine based on a re-ranking algorithm model

Jingyong Wan, Beizhan Wang, Wei Guo, Kang Chen, Jiajun Wang
DOI: https://doi.org/10.1109/iccse.2015.7250325
2015-01-01
Abstract: With the rapid increase in the number of websites and the explosive growth of Internet information, centralized search engines face great challenges in processing and storing massive data. Distributed search engines can address these problems effectively. In this paper, we describe the design and implementation of a distributed search engine based on Apache Nutch, Solr, and Hadoop. Taking users' click logs into account, we propose a re-ranking algorithm based on Lucene scoring. Our experimental results show that our approach effectively meets users' demand for searching massive data while maintaining high reliability and scalability.
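
The abstract does not state the re-ranking formula. As a purely illustrative sketch, one plausible way to combine a Lucene-style relevance score with click-log evidence is to blend the normalized base score with a log-scaled click count. The function name `rerank`, the weight `alpha`, and the blending formula are all assumptions made here for illustration; they are not the authors' method.

```python
import math
from typing import Dict, List, Tuple


def rerank(
    hits: List[Tuple[str, float]],
    clicks: Dict[str, int],
    alpha: float = 0.7,
) -> List[Tuple[str, float]]:
    """Blend base relevance scores with click counts (hypothetical formula).

    hits   -- (doc_id, base_score) pairs, e.g. scores returned by the index
    clicks -- doc_id -> number of recorded clicks (assumed log format)
    alpha  -- weight on the base score; 1 - alpha goes to click popularity
    """
    if not hits:
        return []

    # Normalize both signals to [0, 1] so the blend weight is meaningful.
    max_score = max(score for _, score in hits) or 1.0
    max_clicks = max(clicks.get(doc, 0) for doc, _ in hits)
    click_norm = math.log1p(max_clicks) or 1.0

    blended = []
    for doc, score in hits:
        relevance = score / max_score
        # log1p dampens the effect of very heavily clicked documents.
        popularity = math.log1p(clicks.get(doc, 0)) / click_norm
        blended.append((doc, alpha * relevance + (1 - alpha) * popularity))

    # Highest blended score first; sorted() is stable for ties.
    return sorted(blended, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    hits = [("a", 2.0), ("b", 1.8), ("c", 0.5)]
    clicks = {"b": 100, "c": 5}
    for doc, score in rerank(hits, clicks):
        print(doc, round(score, 3))
```

In this sketch a frequently clicked document can overtake one with a slightly higher base score, while `alpha` limits how far click popularity alone can move a result.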