Effects of Distance Information Between the Term and the Central Term on the Similar Question Matching

YAN Hongfei,CHEN Chong
DOI: https://doi.org/10.3321/j.issn:1000-0054.2005.09.032
2005-01-01
Abstract: Incorporating a frequently asked questions (FAQ) finder module is an effective way to improve a QA system. The key question for a FAQ finder is how to match user queries against the questions in the FAQ corpus. This paper describes the design and implementation of a FAQ finder system based on a FAQ corpus, and reports an evaluation of the system's performance on two distinct test collections. Experiments show that computing term weights from the distance between each term and the central term performs worse than computing them with term frequency-inverse document frequency (TF-IDF). With a similarity threshold of 0.5, neither method produces erroneous matches for similar questions.
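To make the TF-IDF baseline and the 0.5 threshold concrete, the following is a minimal sketch of threshold-based question matching. It is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the use of cosine similarity, and the function names (`tokenize`, `build_idf`, `tfidf_vector`, `cosine`, `best_match`) are all assumptions made for illustration. The distance-based weighting is not sketched because the abstract does not define it.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the paper's segmentation is not specified.
    return text.lower().split()


def build_idf(corpus_tokens):
    # Document frequency: number of FAQ questions containing each term.
    n = len(corpus_tokens)
    df = Counter()
    for tokens in corpus_tokens:
        df.update(set(tokens))
    # Assumption: smoothed IDF variant; the paper's exact formula is not given in the abstract.
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf_vector(tokens, idf):
    # Terms absent from the FAQ corpus are dropped (they have no IDF).
    tf = Counter(tokens)
    return {term: count * idf[term] for term, count in tf.items() if term in idf}


def cosine(u, v):
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_match(query, faq_questions, threshold=0.5):
    # Returns (matched_question, score); matched_question is None below the threshold.
    corpus = [tokenize(q) for q in faq_questions]
    idf = build_idf(corpus)
    query_vec = tfidf_vector(tokenize(query), idf)
    best_score, best_question = 0.0, None
    for question, tokens in zip(faq_questions, corpus):
        score = cosine(query_vec, tfidf_vector(tokens, idf))
        if score > best_score:
            best_score, best_question = score, question
    if best_score >= threshold:
        return best_question, best_score
    return None, best_score
```

In this sketch, a query is answered from the FAQ only when its best cosine score reaches the threshold; otherwise `best_match` returns `None` so the surrounding QA system can fall back to another module.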