Entropic Approach for Reduction of Amino Acid Alphabets

Wei-Mou Zheng
DOI: https://doi.org/10.48550/arXiv.physics/0106074
2001-06-22
Biological Physics
Abstract: The primitive data for deducing the Miyazawa-Jernigan contact energy or the BLOSUM score matrix are the pair frequency counts. Each amino acid corresponds to a distribution. Taking the Kullback-Leibler distance between two probability distributions as the resemblance coefficient and relating a cluster to a mixed population, we perform cluster analysis of amino acids based on the frequency count data. Furthermore, Ward's clustering is also obtained by adopting the average score as an objective function. An ordinal cophenetic is introduced to compare results from different clustering methods.
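As a rough illustration of the idea in the abstract, the sketch below turns a pair-count matrix into one distribution per letter, measures Kullback-Leibler divergence, and treats a merged cluster as a count-weighted mixture of its members. The function names (`row_distributions`, `kl_divergence`, `mixture`, `merge_cost`), the three-letter `toy_counts` matrix, and the particular merge cost (a weighted divergence of each member from the mixture) are all assumptions made for illustration; the paper's exact resemblance coefficient may differ.

```python
import math

def row_distributions(counts):
    """Normalize each row of a pair-count matrix into a probability distribution."""
    return [[c / sum(row) for c in row] for row in counts]

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats; terms with p_k = 0 contribute 0."""
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q) if pk > 0)

def mixture(p, q, w_p, w_q):
    """Distribution of a mixed population built from p and q with weights w_p and w_q."""
    total = w_p + w_q
    return [(w_p * pk + w_q * qk) / total for pk, qk in zip(p, q)]

def merge_cost(counts, i, j):
    """Weighted KL divergence of rows i and j from their mixture.

    This is an assumed, Jensen-Shannon-type dissimilarity, not necessarily
    the coefficient used in the paper.
    """
    dists = row_distributions(counts)
    w_i, w_j = sum(counts[i]), sum(counts[j])
    m = mixture(dists[i], dists[j], w_i, w_j)
    return (w_i * kl_divergence(dists[i], m) + w_j * kl_divergence(dists[j], m)) / (w_i + w_j)

# Hypothetical symmetric pair counts for three letters (made-up numbers).
toy_counts = [
    [50, 30, 5],
    [30, 60, 10],
    [5, 10, 80],
]

if __name__ == "__main__":
    for i, j in [(0, 1), (0, 2), (1, 2)]:
        print(i, j, merge_cost(toy_counts, i, j))
```

Under these assumptions, the pair with the smallest merge cost would be grouped first, and repeating the step on the merged counts gives an agglomerative reduction of the alphabet.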