Thresholding of Semantic Similarity Networks using a Spectral Graph Based Technique

Pietro Hiram Guzzi, Simone Truglia, Pierangelo Veltri, Mario Cannataro
DOI: https://doi.org/10.48550/arXiv.1305.4858
2013-05-21
Abstract: Semantic similarity measures (SSMs) are a set of algorithms used to quantify the similarity of two or more terms belonging to the same ontology. Ontology terms may be associated with concepts; for instance, in computational biology, genes and proteins are associated with terms of biological ontologies. Thus, SSMs may be used to quantify the similarity of genes and proteins by comparing their associated annotations. SSMs have recently been used to compare genes and proteins even at a system-level scale. More recently, some works have focused on building and analysing Semantic Similarity Networks (SSNs), i.e. weighted networks in which nodes represent genes or proteins and weighted edges represent the semantic similarity scores between them. SSNs are quasi-complete networks, so their analysis presents several challenges. For instance, reliable thresholds are needed to eliminate meaningless edges. However, global thresholding methods may eliminate meaningful nodes, while local thresholds may introduce biases. To address these issues, we introduce a novel technique based on spectral graph considerations and on a mixed global-local focus. The effectiveness of our technique is demonstrated by using Markov clustering to extract biological modules. We applied clustering to the simplified networks, demonstrating a considerable improvement with respect to the original ones.
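The trade-off between global and local thresholds mentioned in the abstract can be illustrated with a small sketch. This is not the authors' spectral technique, which the abstract does not specify; it only contrasts a single global cutoff with a per-node cutoff. The toy similarity matrix, the function names, and the "mean plus k standard deviations" local rule are all assumptions made for illustration.

```python
import numpy as np


def global_threshold(weights, tau):
    """Keep only edges whose weight is at least tau (one cutoff for all nodes)."""
    pruned = np.where(weights >= tau, weights, 0.0)
    np.fill_diagonal(pruned, 0.0)
    return pruned


def local_threshold(weights, k=0.0):
    """Keep edge (i, j) if it exceeds mean + k * std of the edge weights
    of node i or of node j (an assumed, illustrative local rule)."""
    n = weights.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    masked = np.where(off_diag, weights, np.nan)  # ignore self-similarity
    cutoffs = np.nanmean(masked, axis=1) + k * np.nanstd(masked, axis=1)
    keep = (weights > cutoffs[:, None]) | (weights > cutoffs[None, :])
    return np.where(keep & off_diag, weights, 0.0)


def isolated_nodes(weights):
    """Indices of nodes left without any edge."""
    return [int(i) for i in np.flatnonzero(weights.sum(axis=1) == 0)]


# Made-up symmetric similarity scores: nodes 0-2 are strongly similar,
# node 3 has uniformly low scores but one relatively strong neighbour (node 1).
toy = np.array([
    [0.0, 0.9, 0.8, 0.2],
    [0.9, 0.0, 0.7, 0.35],
    [0.8, 0.7, 0.0, 0.1],
    [0.2, 0.35, 0.1, 0.0],
])

global_net = global_threshold(toy, tau=0.5)
local_net = local_threshold(toy, k=0.0)

print("isolated after global cutoff:", isolated_nodes(global_net))
print("isolated after local cutoff: ", isolated_nodes(local_net))
```

In this toy case the global cutoff disconnects node 3 entirely, whereas the local rule keeps its strongest edge. The local rule retains that edge only because it is high relative to node 3's other scores, which is the kind of bias a purely local threshold can introduce.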
Molecular Networks