Learning similarity measures in s

Ning Liu, Benyu Zhang, Jun Yan, Qiang Yang, Shuicheng Yan, Zheng Chen, Fengshan Bai, M Wei-Ying
2004-01-01
Abstract: Many machine learning and data mining algorithms crucially rely on the similarity metrics. The Cosine similarity, which calculates the inner product of two normalized feature vectors, is one of the most commonly used similarity measures. However, in many practical tasks such as text categorization and document clustering, the Cosine similarity is calculated under the assumption that the input space is an orthogonal space, which usually could not be satisfied due to synonymy and polysemy. Various algorithms such as Latent Semantic Indexing (LSI) were used to solve this problem by projecting the original data into an orthogonal space. However, LSI also suffered from the high computational cost and d[...]mings led [...]ments for [...] novel and [...]pace. The [...]f features [...]. A novel [...]
I.5.3 [PATTERN RECOGNITION] [...] Applications – similarity measure [...]
General Terms: Algorithms, Measurement.
Hong Kong University of Science and Technology, qyang@cs.ust.hk
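The orthogonality issue described in the abstract can be illustrated with a minimal Python sketch. It computes the Cosine similarity as the inner product of two normalized vectors and applies it to documents that use synonyms. The three-word vocabulary, the document vectors, and the zero-vector convention are hypothetical choices made for illustration only; they do not come from the paper.

```python
import math


def cosine_similarity(u, v):
    """Inner product of u and v after each is scaled to unit L2 norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: treat an all-zero vector as similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical vocabulary (one axis per term): ["car", "automobile", "engine"]
doc_a = [1, 0, 1]  # "car engine"
doc_b = [0, 1, 1]  # "automobile engine"
doc_c = [1, 0, 0]  # "car"
doc_d = [0, 1, 0]  # "automobile"

# Only the shared term "engine" contributes to the similarity.
sim_ab = cosine_similarity(doc_a, doc_b)

# The synonyms "car" and "automobile" sit on different axes,
# so these two documents have no overlap at all.
sim_cd = cosine_similarity(doc_c, doc_d)

print(sim_ab, sim_cd)
```

In this toy setting `sim_cd` is zero even though both documents refer to the same concept, because each term is treated as an independent, mutually orthogonal axis. This is the assumption that synonymy and polysemy violate.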
What problem does this paper attempt to address?