Cross-Reading by Leveraging a Hybrid Index of Heterogeneous Information.

Shansong Yang, Weiming Lu, Baogang Wei
DOI: https://doi.org/10.1007/978-981-4585-18-7_42
2014-01-01
Abstract: In this paper, we present a novel application named Cross-reading, which is derived from the user's reading process. Cross-reading is essentially a search-by-document task over a large-scale text corpus. State-of-the-art approaches use similarity hashing to address this task by modeling it as a high-dimensional similarity search problem. However, most approaches consider only a document's lexical information while ignoring its semantic information and metadata. Moreover, quickly finding similar hash codes among a massive set of hash codes remains a major bottleneck. To address these problems, we propose a Fast Searching By Document approach, which treats Cross-reading from the perspective of both semantic similarity and time efficiency. © Springer Science+Business Media Singapore 2014.
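For readers unfamiliar with similarity hashing, the following is a minimal sketch of the lexical-only baseline the abstract refers to, not the authors' Fast Searching By Document approach. It assumes a generic SimHash-style fingerprint over whitespace tokens and a brute-force Hamming-distance scan; the function names (simhash, hamming, search) and the choice of MD5 as the token hash are illustrative assumptions and do not come from the paper.

```python
import hashlib


def simhash(text: str, bits: int = 64) -> int:
    """Return a SimHash-style fingerprint of `text` (illustrative, lexical only)."""
    counts = [0] * bits
    for token in text.lower().split():
        # Hash each token to a 64-bit integer (MD5 is an arbitrary choice here).
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        # Each token votes +1 or -1 on every bit position.
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, c in enumerate(counts):
        if c > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def search(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """Brute-force search-by-document: rank the corpus by Hamming distance."""
    q = simhash(query)
    return sorted(corpus, key=lambda doc: hamming(q, simhash(doc)))[:k]
```

The linear scan in search compares the query against every fingerprint, which is the kind of cost the abstract identifies as a bottleneck at scale, and the fingerprint depends only on surface tokens, so it carries no semantic information or metadata.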