Bias of Phenotype Similarity Scores between Diseases

Jing Wang, Xianxiao Zhou, Jing Zhu, Zheng Guo
DOI: https://doi.org/10.1109/ICBBE.2010.5515892
2010-01-01
Abstract: Because diseases may be related to one another, systematically assessing their relationships can provide novel insight into their mechanisms. One of the most important methods for studying relationships between diseases is to calculate phenotype similarity scores from the text and clinical synopsis sections of their records in the OMIM database. However, as demonstrated in this paper, the similarity score between two diseases depends strongly on the number of medical terms in the records describing them (termed record size). Because research biases cause some diseases to be described in more detail than others, the similarity scores between these well-studied diseases tend to be larger. Applications based on this phenotype similarity measure are therefore problematic. We also discuss some reasonable approaches to studying the relationships between diseases that may avoid biased applications of disease similarity scores.
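The dependence on record size can be illustrated with a toy simulation. The sketch below is not the paper's method: it assumes, purely for illustration, that each record is a set of terms drawn uniformly at random from a shared vocabulary and that similarity is the cosine between the corresponding binary term vectors. The function names, vocabulary size, and record sizes are all hypothetical.

```python
import math
import random


def cosine_binary(a, b):
    """Cosine similarity between two term sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def mean_random_similarity(record_size, vocab_size=2000, pairs=500, seed=0):
    """Average similarity between pairs of unrelated, randomly generated records.

    Assumption: terms are sampled uniformly without replacement, so any
    overlap between two records is due to chance alone.
    """
    rng = random.Random(seed)
    vocab = range(vocab_size)
    total = 0.0
    for _ in range(pairs):
        a = set(rng.sample(vocab, record_size))
        b = set(rng.sample(vocab, record_size))
        total += cosine_binary(a, b)
    return total / pairs


# Compare short records (10 terms) with long records (200 terms).
small = mean_random_similarity(10)
large = mean_random_similarity(200)
print(f"mean similarity, 10-term records:  {small:.4f}")
print(f"mean similarity, 200-term records: {large:.4f}")
```

Under these assumptions, two unrelated records with m and n terms from a vocabulary of V terms share mn/V terms on average, so the expected cosine score is sqrt(mn)/V, which grows with record size even though the records carry no real relationship. Real OMIM-derived scores use weighted terms and are not uniformly random, so this only sketches the direction of the effect.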