A unified classifiability analysis framework based on meta-learner and its application in spectroscopic profiling data

Yinsheng Zhang, Zhengyong Zhang, Haiyan Wang
DOI: https://doi.org/10.1007/s10489-021-02810-8
IF: 5.3
2021-11-11
Applied Intelligence
Abstract: Spectroscopic profiling data (e.g., Raman spectroscopy and mass spectroscopy), combined with machine learning, have provided a data-driven approach for discriminative tasks. In these tasks, researchers often start with simple classification models. If one model does not work, they try more sophisticated models. If all models fail, the researchers deem the dataset “inseparable.” This “trial-and-error” practice raises a fundamental question: does the dataset possess the necessary statistical power for the current discriminative task? This “classifiability analysis” is an implicit and often neglected step in the data-driven pipeline. This paper aims to design a unified methodological framework for classifiability analysis. In this framework, a meta-learner model combines diversified atom metrics (e.g., Bayes error rate / irreducible error, classification accuracy, information gain / mutual information) into one unified metric (d). We have successfully used the proposed framework to analyze a spectroscopic profiling dataset to discriminate vintage liquors of different ages. The analysis shows a significant difference (d = 1.447; d > 0.8 indicates a significant difference) between 5-year and 16-year liquors.
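To make the idea of "atom metrics" concrete, the following is a minimal sketch of how three of the quantities named in the abstract could be computed for a two-class problem. It is an illustration under stated assumptions, not the authors' implementation: the data are synthetic (one Gaussian feature per class, not real spectra), the classifier is a simple midpoint threshold, the mutual information is a plug-in estimate on the thresholded feature, and all function names are hypothetical. The Bayes error is computed analytically, which is only possible here because the synthetic class distributions are known.

```python
import math
import random
from collections import Counter


def make_two_class_data(n_per_class=200, shift=1.5, seed=0):
    """Synthetic 1-D feature for two classes: N(0, 1) versus N(shift, 1)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n_per_class)]
    x += [rng.gauss(shift, 1.0) for _ in range(n_per_class)]
    y = [0] * n_per_class + [1] * n_per_class
    return x, y


def gaussian_bayes_error(shift):
    """Exact Bayes error for two unit-variance Gaussians with equal priors."""
    z = -abs(shift) / 2.0
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF at z


def midpoint_cut(x, y):
    """Threshold halfway between the class means, and whether class 1 lies above it."""
    m0 = sum(v for v, c in zip(x, y) if c == 0) / y.count(0)
    m1 = sum(v for v, c in zip(x, y) if c == 1) / y.count(1)
    return (m0 + m1) / 2.0, m1 > m0


def threshold_accuracy(x, y):
    """Training accuracy of the midpoint threshold classifier."""
    cut, class1_is_higher = midpoint_cut(x, y)
    pred = [int((v > cut) == class1_is_higher) for v in x]
    return sum(p == c for p, c in zip(pred, y)) / len(y)


def mutual_information_bits(x, y):
    """Plug-in mutual information (bits) between the thresholded feature and the label."""
    cut, _ = midpoint_cut(x, y)
    b = [int(v > cut) for v in x]
    n = len(y)
    joint, pb, py = Counter(zip(b, y)), Counter(b), Counter(y)
    return sum((c / n) * math.log2(c * n / (pb[bi] * py[yi]))
               for (bi, yi), c in joint.items())


x, y = make_two_class_data()
metrics = {
    "bayes_error": gaussian_bayes_error(1.5),
    "accuracy": threshold_accuracy(x, y),
    "mutual_information_bits": mutual_information_bits(x, y),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In the framework described above, a vector of such atom metrics would be the input to the meta-learner that produces the unified metric d. The abstract does not specify how that meta-learner is built or trained, so that step is deliberately left out of this sketch.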
computer science, artificial intelligence