Finding Facet Content on Web by Position Inverted Index

Canghong Jin, Honglun Hou, Minghui Wu, Jing Ying
DOI: https://doi.org/10.1109/HPCC.2012.253
2012-01-01
Abstract: Entity facets can enhance search results because they present web elements along multiple dimensions. Moreover, if web content is sorted by a single fixed dimension such as term frequency, users cannot easily reach peculiar data. Extracting and managing data facets is therefore a significant task in web search. Most existing approaches find facets on the web through manually defined annotations or through clustering algorithms run over a large corpus. These methods are complex and resource-intensive. Because the inverted index is already widely used in web search engines, this paper proposes a novel structure built on it, called the position index structure. With this structure, we aim to find a better solution to the problems of facet extraction and peculiar-data discovery.
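The abstract does not describe the layout of the proposed position index structure, so the following is only a minimal sketch of a generic positional inverted index, the standard structure such a design would presumably extend. The function names, the `max_docs` parameter, and the sample documents are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each term to {doc_id: [token positions]} (generic positional index)."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        # Whitespace tokenization is an assumption made for illustration.
        for position, term in enumerate(text.lower().split()):
            index[term][doc_id].append(position)
    return index


def rare_terms(index, max_docs=1):
    """Return, in sorted order, terms that occur in at most max_docs documents."""
    return sorted(t for t, postings in index.items() if len(postings) <= max_docs)


# Invented sample data, not from the paper.
docs = {
    "d1": "camera price 300 zoom 10x",
    "d2": "camera price 250 zoom 8x",
}
index = build_positional_index(docs)
```

In this toy data, `index["price"]` records position 1 in both documents, while `rare_terms(index)` lists the values that appear in only one document. Whether the paper uses positions or document frequency in this way is an assumption, not something the abstract states.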