Online community thread similarity measurement algorithm based on author analysis
Lei Wu, Chang Su, Yang Gao
2007-01-01
Journal of Computational Information Systems
Abstract: Document similarity measurement is the foundation of information retrieval, since researchers often treat the document as the basic unit of information. When retrieving documents from an online community, however, traditional similarity measurement algorithms based on keywords often have high complexity. In this paper, we propose a new algorithm based on author analysis, which reduces the order of complexity from the number of keywords to the number of users while maintaining good performance. We test our algorithm on data fetched from a real web site and compare it with the traditional method. The experimental results show that our algorithm reduces both time and space complexity, at the cost of a slight decrease in accuracy.
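The abstract does not give the similarity measure itself, so the following is only an illustrative sketch of the general idea: describing each thread by who posted in it rather than by which keywords it contains. The choice of cosine similarity over per-author post counts, the (author, text) input format, and the names `author_vector`, `cosine_similarity`, and `thread_similarity` are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter
from math import sqrt


def author_vector(posts):
    """Count posts per author in one thread.

    `posts` is a list of (author, text) pairs (hypothetical input format).
    The post text is ignored, so the vector has one dimension per
    participating user, not one per keyword.
    """
    return Counter(author for author, _ in posts)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    shared = a.keys() & b.keys()
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty thread is treated as similar to nothing
    return dot / (norm_a * norm_b)


def thread_similarity(thread_1, thread_2):
    """Assumed author-based similarity: cosine over author post counts."""
    return cosine_similarity(author_vector(thread_1), author_vector(thread_2))


# Toy data, invented for illustration only.
thread_a = [("alice", "..."), ("bob", "..."), ("alice", "...")]
thread_b = [("alice", "..."), ("carol", "...")]
thread_c = [("dave", "...")]

sim_ab = thread_similarity(thread_a, thread_b)  # one shared author
sim_ac = thread_similarity(thread_a, thread_c)  # no shared authors
```

Under these assumptions the vector dimension is bounded by the number of distinct users rather than the size of the vocabulary, which is the kind of reduction the abstract describes. Two threads on the same topic with disjoint participants would score zero here, which is one way accuracy could suffer.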