Research on Characteristics of Chinese Herbal Medicine Compounds Based on Bisecting K-Means Algorithm.

Yushu Wu, Fenfen Xie, Lu Wang, Shoude Zhang, Lei Zhang, Xiaoying Wang
DOI: https://doi.org/10.3233/faia200713
2020-01-01
Abstract: The properties of Chinese Herbal Medicine (CHM) are determined to some extent by the properties of its molecular compounds, so studying CHM at the level of molecular compounds is of great significance. This paper uses clustering algorithms from data mining to study the relationship between the properties of CHM and its chemical components. First, molecular data are collected from the Traditional Chinese Medicine Systems Pharmacology Database and Analysis Platform, and the data set is preprocessed to extract the key molecular descriptors of the chemical components. Second, the k-means and Bisecting k-means algorithms are used to cluster the chemical components based on these molecular descriptors, and the representative molecular features of cold and hot CHM are extracted. Finally, an experimental comparison finds that Bisecting k-means yields the better clustering results. The clustering results show that the average values of molecular composition descriptors and charge descriptors are significantly higher in cold CHM than in hot CHM. Therefore, the properties of CHM may be affected by molecular structure and molecular charge properties.
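The abstract does not describe the authors' implementation, so the following is only a minimal sketch of the general bisecting k-means idea: start with one cluster, repeatedly pick the cluster with the largest sum of squared errors (SSE), and split it in two with ordinary 2-means. The function names, the split-selection rule, the trial count, and the toy descriptor vectors are all assumptions made for illustration; none of them come from the paper or its data.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def sse(points):
    """Sum of squared distances from each point to the cluster centroid."""
    c = centroid(points)
    return sum(sq_dist(p, c) for p in points)


def two_means(points, rng, n_iter=20):
    """Ordinary k-means with k=2; returns the two groups of points."""
    centers = rng.sample(points, 2)
    groups = ([], [])
    for _ in range(n_iter):
        groups = ([], [])
        for p in points:
            nearest = 0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
            groups[nearest].append(p)
        if not groups[0] or not groups[1]:
            break  # degenerate split; the caller discards it
        new_centers = [centroid(g) for g in groups]
        if new_centers == centers:
            break  # assignments have stabilised
        centers = new_centers
    return groups


def bisecting_kmeans(points, k, n_trials=5, seed=0):
    """Split the highest-SSE cluster in two until k clusters exist."""
    rng = random.Random(seed)
    clusters = [list(points)]
    while len(clusters) < k:
        candidates = [c for c in clusters if len(c) >= 2]
        if not candidates:
            break  # nothing left that can be split
        target = max(candidates, key=sse)
        best = None
        for _ in range(n_trials):
            a, b = two_means(target, rng)
            if not a or not b:
                continue
            score = sse(a) + sse(b)
            if best is None or score < best[0]:
                best = (score, a, b)
        if best is None:
            break  # e.g. all points in the target are identical
        clusters = [c for c in clusters if c is not target]
        clusters.extend([best[1], best[2]])
    return clusters


# Made-up two-dimensional "descriptor" vectors, NOT values from the paper.
group_a = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
group_b = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
clusters = bisecting_kmeans(group_a + group_b, k=2)
for members in clusters:
    # Per-cluster descriptor means, analogous to the averages the abstract compares.
    print(len(members), centroid(members))
```

Running several 2-means trials per split and keeping the one with the lowest total SSE is one common way to reduce sensitivity to the random initial centers; a library implementation may choose the cluster to split or the stopping rule differently.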