Similarity Calculation Based on Feature Weight Evaluation
Ming LIU, Chong WU, Yuan-Chao LIU, Cheng-Jie SUN
DOI: https://doi.org/10.11897/SP.J.1016.2015.01420
2015-01-01
Abstract: With the rapid advance of information technology, the unsupervised nature of clustering makes it an effective tool for data analysis. Accurate similarity calculation is essential for clustering algorithms to achieve high performance. Because different features contribute differently to describing the similarity among data, it is necessary to assess each feature's contribution using prior knowledge (e.g., constrained data provided by users) and to incorporate it into the similarity measure to obtain more precise results. Unfortunately, conventional weight evaluation methods fail to address two challenges: (1) the constrained data are likely to be asymmetrically distributed in the feature space; (2) the constrained data are likely to contain inconsistencies. These two issues prevent conventional weight evaluation methods from achieving high precision and can even render them unusable. This paper therefore proposes a novel constraint-based weight evaluation method to deal with them. For the former, the constrained data are partitioned into several equivalence classes, and distribution parameters are assigned to them to balance their distributions. For the latter, the constrained data are connected to form an undirected graph, from which belief values are computed to measure and reduce the likelihood of inconsistency. Finally, these two parameters are integrated into the weight evaluation function to form an accurate similarity measure. Experimental results demonstrate that this weight evaluation method can use constrained data to capture the differing contributions of features to similarity calculation, and that it can be applied in any clustering algorithm to improve its precision.
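
The general idea of feeding feature weights derived from user constraints into a similarity measure can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not give the paper's weight evaluating function, distribution parameters, or belief values, so the ratio heuristic in `estimate_weights`, the choice of cosine similarity, and the names `must_link` and `cannot_link` are hypothetical stand-ins rather than the authors' method.

```python
import math


def weighted_cosine(x, y, w):
    """Cosine similarity in which feature k is scaled by a non-negative weight w[k]."""
    num = sum(wk * xk * yk for wk, xk, yk in zip(w, x, y))
    norm_x = math.sqrt(sum(wk * xk * xk for wk, xk in zip(w, x)))
    norm_y = math.sqrt(sum(wk * yk * yk for wk, yk in zip(w, y)))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # similarity is undefined for a zero vector; return 0 by convention
    return num / (norm_x * norm_y)


def _mean_abs_diff(data, pairs, k):
    """Mean absolute difference on feature k over index pairs (0.0 if there are none)."""
    if not pairs:
        return 0.0
    return sum(abs(data[i][k] - data[j][k]) for i, j in pairs) / len(pairs)


def estimate_weights(data, must_link, cannot_link, eps=1e-9):
    """Toy heuristic (an assumption, NOT the paper's weight evaluating function).

    A feature scores high when pairs that should be separated (cannot_link)
    differ strongly on it while pairs that should be grouped (must_link)
    differ little. Scores are normalised to sum to 1.
    """
    dim = len(data[0])
    raw = []
    for k in range(dim):
        apart = _mean_abs_diff(data, cannot_link, k)
        together = _mean_abs_diff(data, must_link, k)
        raw.append((apart + eps) / (together + eps))
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical data: feature 0 separates the two groups, feature 1 is noise.
data = [[1.0, 0.0], [1.0, 5.0], [4.0, 0.0], [4.0, 5.0]]
must_link = [(0, 1), (2, 3)]    # pairs the user says belong together
cannot_link = [(0, 2), (1, 3)]  # pairs the user says belong apart

weights = estimate_weights(data, must_link, cannot_link)
plain = weighted_cosine(data[0], data[1], [1.0, 1.0])
weighted = weighted_cosine(data[0], data[1], weights)
```

In this toy setting the must-link pair `data[0]`, `data[1]` looks dissimilar under uniform weights because of the noisy second feature, and becomes much more similar once the estimated weights down-weight that feature. The sketch does not attempt to handle the two challenges the abstract raises (unevenly distributed or inconsistent constraints).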