Multi-label Feature Selection with Application to TCM State Identification

Liang Dai, Jia Zhang, Candong Li, Changen Zhou, Shaozi Li
DOI: https://doi.org/10.1002/cpe.4634
2018-01-01
Concurrency and Computation: Practice and Experience
Abstract: The goal of TCM state identification is to identify a patient's syndromes, together with the locations and natures of diseases, from the patient's symptoms. A patient's symptoms are generally associated with several syndromes and with multiple disease locations and natures; hence, TCM state identification is a typical multi-label problem. This paper proposes a new method to predict syndromes and the locations and natures of diseases from the diagnostic information of TCM. Specifically, the correlation between features and the correlation between class labels are combined into a new uniform feature space. The MDMR algorithm is then used to select the most discriminative features from this space, which reduces the data dimensionality. Lastly, a KNN-like algorithm is modified to calculate the label similarity of the test data, and the finite set of labels of the test data is predicted by ML-KNN. The test data were collected by Fujian University of Traditional Chinese Medicine in accordance with the theory of TCM and medical ethics. The experiments show that the proposed method outperforms some other popular methods and is helpful for identifying health state in TCM.
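The select-then-predict pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the selection step uses a simplified relevance-minus-redundancy score based on mutual information as a stand-in for MDMR, and the prediction step uses a plain majority vote among nearest neighbours as a stand-in for ML-KNN, which in its usual form adds a Bayesian posterior estimate. The function names, the Hamming distance, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def select_features(X, Y, k):
    """Greedy selection: high dependency on labels, low redundancy.

    X: list of samples, each a list of discrete feature values.
    Y: list of samples, each a list of 0/1 labels.
    Returns the indices of the selected features, in selection order.
    (Simplified criterion, assumed here; not the exact MDMR formula.)
    """
    n_features, n_labels = len(X[0]), len(Y[0])
    cols = [[row[j] for row in X] for j in range(n_features)]
    labs = [[row[l] for row in Y] for l in range(n_labels)]
    # Dependency: summed mutual information between a feature and every label.
    relevance = [sum(mutual_information(c, l) for l in labs) for c in cols]
    selected = []
    while len(selected) < min(k, n_features):
        def score(j):
            # Redundancy: mean mutual information with already-selected features.
            if not selected:
                return relevance[j]
            red = sum(mutual_information(cols[j], cols[s]) for s in selected)
            return relevance[j] - red / len(selected)
        remaining = [j for j in range(n_features) if j not in selected]
        selected.append(max(remaining, key=score))
    return selected


def knn_predict(X_train, Y_train, x, features, k=3):
    """Predict a label vector by majority vote among the k nearest
    training samples (Hamming distance on the selected features)."""
    def dist(row):
        return sum(row[j] != x[j] for j in features)
    nearest = sorted(range(len(X_train)), key=lambda i: dist(X_train[i]))[:k]
    n_labels = len(Y_train[0])
    return [int(2 * sum(Y_train[i][l] for i in nearest) > k)
            for l in range(n_labels)]


# Toy data (hypothetical): feature 0 mirrors label 0, feature 1 mirrors
# label 1, and feature 2 is constant and therefore uninformative.
X_toy = [[0, 0, 0], [0, 1, 0], [1, 0, 0], [1, 1, 0]] * 2
Y_toy = [[0, 0], [0, 1], [1, 0], [1, 1]] * 2

chosen = select_features(X_toy, Y_toy, k=2)
prediction = knn_predict(X_toy, Y_toy, [1, 0, 0], chosen, k=3)
```

On the toy data the constant feature carries no information about either label, so the greedy step is expected to skip it; real symptom data would need discretisation and a tuned neighbourhood size before a sketch like this could be applied.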